Enzymatic compositions for carbohydrate antigen cleavage on donor organs, methods and uses related thereto

Document No.: 589291. Publication date: 2021-05-25.

Reading note: This technology, Enzymatic compositions for carbohydrate antigen cleavage on donor organs, methods and uses related thereto, was designed and created by Marcelo Cypel, Aizhou Wang, Shafique Keshavjee, Stephen G. Withers, Peter Rahfeld and Jayachandran ... on 2019-08-16. Its main content is as follows. Provided herein are perfusion fluids for enzymatic cleavage of A antigen from a donor organ, and methods and uses related thereto. In particular, the perfusion fluid comprises two enzymes, a GalNAc deacetylase and a galactosaminidase, and the fluid may also comprise a buffered extracellular solution and/or a crowding agent. In addition, the compositions described herein were found to be active at temperatures and pH levels suitable for cell survival.

1. A perfusion fluid for enzymatic cleavage of A antigen from a donor organ, comprising:

(a) purified GalNAc deacetylase protein; and

(b) purified galactosaminidase protein.

2. The perfusion fluid of claim 1, wherein:

(a) the GalNAc deacetylase is a purified protein selected from one or more of the following: SEQ ID NO: 2; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 17; SEQ ID NO: 23; SEQ ID NO: 29; SEQ ID NO: 31; SEQ ID NO: 32; SEQ ID NO: 33; SEQ ID NO: 34 and SEQ ID NO: 35; and

(b) the galactosaminidase is a purified protein selected from one or more of the following: SEQ ID NO: 7; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 19; SEQ ID NO: 21; SEQ ID NO: 36 and SEQ ID NO: 37.

3. The perfusion fluid of claim 1, wherein the perfusion fluid comprises: a purified enzyme having GalNAc deacetylase activity, which consists essentially of an amino acid sequence that is at least 90% identical to a sequence set forth in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31, and 32-35; and a purified enzyme having galactosaminidase activity, consisting essentially of an amino acid sequence which is at least 90% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.

4. The perfusion fluid of claim 1 or 2, wherein the perfusion fluid comprises an enzyme selected from one or more of:

(a) the purified GalNAc deacetylase protein is a purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO: 2, SEQ ID NO: 4 and SEQ ID NO: 5; and

(b) the purified galactosaminidase protein is a purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO: 7, SEQ ID NO: 9 and SEQ ID NO: 10.

5. The perfusion fluid of claim 1 or 2, wherein the perfusion fluid comprises one or more of:

(a) the purified GalNAc deacetylase protein is a purified Clostridium tertium GalNAc deacetylase protein of SEQ ID NO: 17 or SEQ ID NO: 32; and

(b) the purified galactosaminidase protein is a purified Clostridium tertium galactosaminidase protein of SEQ ID NO: 19 or SEQ ID NO: 36.

6. The perfusion fluid of any one of claims 1-5, wherein the GalNAc deacetylase and the galactosaminidase are capable of cleaving the A antigen at 1 μg/ml or less.

7. The perfusion fluid of any one of claims 1-6, wherein the GalNAc deacetylase and the galactosaminidase have A antigen-cleaving activity at a pH of about 6.5 to about 7.5.

8. The perfusion fluid of any one of claims 1-7, wherein the GalNAc deacetylase and the galactosaminidase have A antigen-cleaving activity at a temperature of 4 °C to 37 °C.

9. The perfusion fluid of any one of claims 1-8, wherein the perfusion fluid further comprises a buffered extracellular solution.

10. The perfusion fluid of claim 9, wherein the buffered extracellular solution is selected from the group consisting of: Steen™; Perfadex™; Perfadex Plus™; EuroCollins solution; histidine-tryptophan-ketoglutarate (HTK) solution; University of Wisconsin (UW) solution; Celsior solution; Kidney Perfusion Solution (KPS-1); Kyoto University solution; IGL-1 solution; and a citrate solution.

11. A method for enzymatic cleavage of A antigen ex vivo from a donor organ, the method comprising:

(a) perfusing a donor organ displaying a type A antigen with a fluid comprising GalNAc deacetylase protein and galactosaminidase protein for a time sufficient to allow the enzymes to cleave A antigen from the donor organ; or

(b) incubating a donor organ displaying a type A antigen with a fluid comprising GalNAc deacetylase protein and galactosaminidase protein for a time sufficient to allow the enzymes to cleave A antigen from the donor organ.

12. The method of claim 11, wherein:

the GalNAc deacetylase is a purified protein selected from one or more of the following: SEQ ID NO: 2; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 17; SEQ ID NO: 23; SEQ ID NO: 29; SEQ ID NO: 31; SEQ ID NO: 32; SEQ ID NO: 33; SEQ ID NO: 34 and SEQ ID NO: 35; and

the galactosaminidase is a purified protein selected from one or more of the following: SEQ ID NO: 7; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 19; SEQ ID NO: 21; SEQ ID NO: 36 and SEQ ID NO: 37.

13. The method of claim 11, wherein the composition comprises: a purified enzyme having GalNAc deacetylase activity, which consists essentially of an amino acid sequence that is at least 90% identical to a sequence set forth in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31, and 32-35; and a purified enzyme having galactosaminidase activity, consisting essentially of an amino acid sequence which is at least 90% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.

14. The method of claim 11, wherein the GalNAc deacetylase is the purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO: 4 or SEQ ID NO: 5, and the galactosaminidase is the purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO: 9 or SEQ ID NO: 10.

15. The method of any one of claims 11-14, wherein the GalNAc deacetylase protein and the galactosaminidase protein are in a buffered extracellular solution.

16. The method of claim 15, wherein the buffered extracellular solution is selected from the group consisting of: Steen™; Perfadex™; Perfadex Plus™; EuroCollins solution; histidine-tryptophan-ketoglutarate (HTK) solution; University of Wisconsin (UW) solution; Celsior solution; Kidney Perfusion Solution (KPS-1); Kyoto University solution; IGL-1 solution; and a citrate solution.

17. The method of any one of claims 11-16, wherein the donor organ is a solid organ.

18. The method of claim 17, wherein the solid organ is selected from one of: a lung; a kidney; a liver; a heart; a pancreas; and an intestine.

19. The method of claim 18, wherein the solid organ is a lung.

20. The method of claim 17, wherein the GalNAc deacetylase protein and the galactosaminidase protein are mixed with an ex vivo buffered extracellular lung solution and circulated through the lungs, whereby the GalNAc deacetylase protein and the galactosaminidase protein are in contact with the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the lungs.

21. The method of claim 20, wherein the time to clear the A antigen from the vasculature of the lung is about 1 hour.

22. The method of any one of claims 11-21, wherein the method further comprises washing the donor organ to remove GalNAc deacetylase, galactosaminidase, and cleaved A antigen.

23. The method of any one of claims 11-22, wherein the GalNAc deacetylase and the galactosaminidase are capable of cleaving the A antigen at 1 μg/ml or less.

24. The method of any one of claims 11-23, wherein the GalNAc deacetylase and the galactosaminidase have A antigen-cleaving activity at a pH of about 6.5 to about 7.5.

25. The method of any one of claims 11-24, wherein the GalNAc deacetylase and the galactosaminidase have A antigen-cleaving activity at a temperature of 4 °C to 37 °C.

Technical Field

The present invention relates to the field of enzyme compositions. In particular, the present invention relates to an enzyme composition for cleaving an antigen on a donor organ, and provides methods and uses for cleaving an antigen using the composition.

Background

Correct matching of blood groups is a major requirement of transfusion medicine: the plasma of blood group A individuals contains antibodies against the B antigen and vice versa, so incompatible transfusions can lead to complement activation and red blood cell (RBC) lysis (Daniels 2010). These cell surface antigens are carbohydrate structures terminating in α-1,3-linked N-acetylgalactosamine (GalNAc) or galactose (Gal) in type A and type B blood, respectively. Type O RBCs, on the other hand, do not contain these terminal sugars and can be transfused universally (Garratty 2008). Thus, a good supply of type O RBCs is needed in blood banks for emergency situations in which the patient's blood type is unknown or unclear. However, the supply is usually limited.

Goldstein first proposed and demonstrated the concept of enzymatic removal of the GalNAc or Gal structures from type A or type B RBCs as a means of converting them to type O RBCs (Goldstein 1982; US4609627 and CA 2272925). Type B RBCs were converted to type O RBCs using α-galactosidase from green coffee beans and were then successfully transfused (Kruskall 2000). However, the amount of enzyme required makes this approach impractical. Conversion of type A is more challenging, mainly because type A blood exists in many subtypes with different internal linkages (Clausen 1989). Similarly, α-galactosidase has been used to remove type B antigens (see, e.g., EP 2243793). Screening of bacterial libraries for A- and B-converting activity using tetrasaccharide substrates brought practical conversion, including that of type A, an important step forward. Two new families of glycosidases were found to exhibit high antigen cleavage activity at neutral pH: the CAZy GH109 α-N-acetylgalactosaminidases and the GH110 α-galactosidases (Liu 2007). Both enzymes convert their respective RBCs, with complete removal of the respective antigens. However, the conversion still requires a large amount of enzyme, especially for type A (60 mg enzyme per unit of blood), which limits further development. An enzyme with higher efficiency in cleaving carbohydrate antigens from cells would be useful.

SUMMARY

The present invention is based in part on the unexpected discovery that the combination of a galactosaminidase and a GalNAc deacetylase as described herein is orders of magnitude more effective than the A antigen-cleaving enzymes previously identified. For example, under some conditions, some GalNAc deacetylases and galactosaminidases are capable of cleaving the A antigen at 1 μg/ml or less. In addition, the cleavage efficiency of the enzyme combination is maintained at a pH suitable for maintaining red blood cell viability (i.e., a pH of about 6.5 to about 7.5). The enzymes were also found to be active at temperatures between 4 °C and 37 °C, which is suitable for blood collection, washing and storage protocols. The efficiency of the enzymes is further improved by the addition of a crowding agent (e.g., dextran). It has also been realized that the same two-step cleavage procedure can be applied to a donor organ.

However, it will be appreciated by those skilled in the art that more enzyme may be used to shorten the time for which the donor organ must be perfused, or less enzyme may be used provided that the donor organ is perfused for longer.
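The trade-off between enzyme amount and perfusion time can be illustrated with a deliberately simple sketch. It assumes, which the text does not state, that the required perfusion time scales inversely with enzyme concentration, so that concentration multiplied by time stays constant. The function name and the reference values used below are hypothetical and are not measured data.

```python
def required_perfusion_time_h(enzyme_ug_per_ml: float,
                              ref_ug_per_ml: float,
                              ref_time_h: float) -> float:
    """Estimate perfusion time under an ASSUMED inverse-scaling model.

    Assumption (not from the source): concentration x time is constant,
    so doubling the enzyme concentration halves the required time.
    """
    if enzyme_ug_per_ml <= 0 or ref_ug_per_ml <= 0 or ref_time_h <= 0:
        raise ValueError("concentrations and time must be positive")
    return ref_time_h * ref_ug_per_ml / enzyme_ug_per_ml


# Hypothetical reference point: 1 ug/ml clears the antigen in 1 h.
faster = required_perfusion_time_h(2.0, ref_ug_per_ml=1.0, ref_time_h=1.0)
slower = required_perfusion_time_h(0.5, ref_ug_per_ml=1.0, ref_time_h=1.0)
print(faster, slower)
```

Real enzyme kinetics in a perfused organ need not follow this proportionality; the sketch only makes the direction of the trade-off concrete.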

According to one embodiment, there is provided a perfusion fluid for enzymatic cleavage of A antigen from a donor organ, comprising: (a) purified GalNAc deacetylase protein; and (b) purified galactosaminidase protein.

According to another embodiment, there is provided a perfusion fluid, wherein: (a) the GalNAc deacetylase is a purified protein selected from one or more of the following: SEQ ID NO: 2; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 17; SEQ ID NO: 23; SEQ ID NO: 29; SEQ ID NO: 31; SEQ ID NO: 32; SEQ ID NO: 33; SEQ ID NO: 34 and SEQ ID NO: 35; and (b) the galactosaminidase is a purified protein selected from one or more of the following: SEQ ID NO: 7; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 19; SEQ ID NO: 21; SEQ ID NO: 36 and SEQ ID NO: 37.

According to another embodiment, there is provided a perfusion fluid, wherein the perfusion fluid comprises: a purified enzyme having GalNAc deacetylase activity, which consists essentially of an amino acid sequence that is at least 90% identical to a sequence set forth in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31, and 32-35; and a purified enzyme having galactosaminidase activity, consisting essentially of an amino acid sequence which is at least 90% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.

The enzyme may be selected from one or more of the following: (a) the purified GalNAc deacetylase protein is a purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO: 2, SEQ ID NO: 4 and SEQ ID NO: 5; and (b) the purified galactosaminidase protein is a purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO: 7, SEQ ID NO: 9 and SEQ ID NO: 10. The enzyme may be selected from one or more of the following: (a) the purified GalNAc deacetylase protein is a purified Clostridium tertium GalNAc deacetylase protein of SEQ ID NO: 17 or SEQ ID NO: 32; and (b) the purified galactosaminidase protein is a purified Clostridium tertium galactosaminidase protein of SEQ ID NO: 19 or SEQ ID NO: 36. The GalNAc deacetylase and the galactosaminidase may be capable of cleaving the A antigen at 1 μg/ml or less. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.5 to about 7.5. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 4 °C to 37 °C. The perfusion fluid may also comprise a buffered extracellular solution. The buffered extracellular solution may be selected from: Steen™; Perfadex™; Perfadex Plus™; EuroCollins solution; histidine-tryptophan-ketoglutarate (HTK) solution; University of Wisconsin (UW) solution; Celsior solution; Kidney Perfusion Solution (KPS-1); Kyoto University solution; IGL-1 solution; and a citrate solution.

According to another embodiment, there is provided a method for enzymatically cleaving A antigen from a donor organ ex vivo, the method comprising: (a) perfusing a donor organ displaying a type A antigen with a fluid comprising GalNAc deacetylase protein and galactosaminidase protein for a time sufficient to allow the enzymes to cleave A antigen from the donor organ; or (b) incubating a donor organ displaying a type A antigen with a fluid comprising GalNAc deacetylase protein and galactosaminidase protein for a time sufficient to allow the enzymes to cleave A antigen from the donor organ.

The GalNAc deacetylase may be a purified protein selected from one or more of the following: SEQ ID NO: 2; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 17; SEQ ID NO: 23; SEQ ID NO: 29; SEQ ID NO: 31; SEQ ID NO: 32; SEQ ID NO: 33; SEQ ID NO: 34 and SEQ ID NO: 35; and the galactosaminidase may be a purified protein selected from one or more of the following: SEQ ID NO: 7; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 19; SEQ ID NO: 21; SEQ ID NO: 36 and SEQ ID NO: 37.

The purified enzyme having GalNAc deacetylase activity can consist essentially of an amino acid sequence that is at least 90% identical to the sequence shown in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31 and 32-35; and the purified enzyme having galactosaminidase activity can consist essentially of an amino acid sequence that is at least 90% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.

The GalNAc deacetylase can be the purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO: 4 or SEQ ID NO: 5, and the galactosaminidase can be the purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO: 9 or SEQ ID NO: 10.

The GalNAc deacetylase protein and the galactosaminidase protein can be in a buffered extracellular solution. The buffered extracellular solution may be selected from: Steen™; Perfadex™; Perfadex Plus™; EuroCollins solution; histidine-tryptophan-ketoglutarate (HTK) solution; University of Wisconsin (UW) solution; Celsior solution; Kidney Perfusion Solution (KPS-1); Kyoto University solution; IGL-1 solution; and a citrate solution. The donor organ may be a solid organ. The solid organ may be selected from one of: a lung; a kidney; a liver; a heart; a pancreas; and an intestine. The solid organ may be a lung.

The GalNAc deacetylase protein and the galactosaminidase protein can be mixed with an ex vivo buffered extracellular lung solution and circulated through the lungs, whereby the GalNAc deacetylase protein and the galactosaminidase protein contact the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the lungs. The GalNAc deacetylase protein and the galactosaminidase protein can be mixed with an ex vivo buffered extracellular kidney solution and circulated through the kidney, whereby the GalNAc deacetylase protein and the galactosaminidase protein contact the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the kidney. The GalNAc deacetylase protein and the galactosaminidase protein can be mixed with an ex vivo buffered extracellular liver solution and circulated through the liver, whereby the GalNAc deacetylase protein and the galactosaminidase protein contact the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the liver. The GalNAc deacetylase protein and the galactosaminidase protein can be mixed with an ex vivo buffered extracellular heart solution and circulated through the heart, whereby the GalNAc deacetylase protein and the galactosaminidase protein contact the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the heart. The GalNAc deacetylase protein and the galactosaminidase protein can be mixed with an ex vivo buffered extracellular pancreatic solution and circulated through the pancreas, whereby the GalNAc deacetylase protein and the galactosaminidase protein contact the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the pancreas. The GalNAc deacetylase protein and the galactosaminidase protein can be mixed with an ex vivo buffered extracellular intestinal solution and circulated through the intestine, whereby the GalNAc deacetylase protein and the galactosaminidase protein contact the vasculature of the donor organ for a time sufficient to substantially clear the A antigen from the vasculature of the intestine.

The time to clear the A antigen from the vasculature may be about 1 hour. The time to clear the A antigen from the vasculature may be less than 1 hour. The time to clear the A antigen from the vasculature may be about 2 hours.

The method can further comprise washing the donor organ to remove GalNAc deacetylase, galactosaminidase, and cleaved A antigen. The GalNAc deacetylase and the galactosaminidase may be capable of cleaving the A antigen at 1 μg/ml or less. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.5 to about 7.5. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 4 °C to 37 °C.

According to another embodiment, there is provided a composition comprising: a purified enzyme having GalNAc deacetylase activity, which consists essentially of an amino acid sequence that is at least 85% identical to a sequence set forth in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31, and 32-35; and a purified enzyme having galactosaminidase activity, consisting essentially of an amino acid sequence which is at least 85% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.

According to another embodiment, there is provided a composition comprising: a purified enzyme having GalNAc deacetylase activity, which consists essentially of an amino acid sequence that is at least 80% identical to a sequence set forth in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31, and 32-35; and a purified enzyme having galactosaminidase activity, consisting essentially of an amino acid sequence which is at least 80% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.

According to another embodiment, there is provided a composition comprising: a purified enzyme having GalNAc deacetylase activity, which consists essentially of an amino acid sequence that is at least 75% identical to a sequence set forth in one of SEQ ID NOs 2, 4, 5, 17, 23, 29, 31, and 32-35; and a purified enzyme having galactosaminidase activity, consisting essentially of an amino acid sequence which is at least 75% identical to the sequence shown in one of SEQ ID NOs 7, 9, 10, 19, 21, 36 and 37.
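The identity thresholds recited in these embodiments (90%, 85%, 80% and 75%) can be made concrete with a minimal sketch of a percent-identity check. It assumes the two sequences have already been aligned, with "-" marking gaps, and it counts identical residues over all aligned columns. Both the alignment step and this choice of denominator are assumptions, since this section does not specify how identity is to be computed. The sequences below are toy strings, not any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical residues divided by the number of
    alignment columns that are not gap-vs-gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1  # x != '-' here, because gap-vs-gap was skipped
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example: one substitution in ten residues.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))
print(meets_identity("MKTAYIAKQR", "MKTAYIAKQK", threshold=90.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages near a threshold, so the convention matters when a sequence sits close to a cutoff.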

In the composition: (a) the purified GalNAc deacetylase and the purified galactosaminidase may both be immobilized; (b) the purified GalNAc deacetylase may be immobilized; or (c) the purified galactosaminidase may be immobilized.

The immobilized enzyme may be attached to a surface, which may be selected from one or more of the following: (a) beads or microspheres; (b) a container; (c) a tube; (d) a column; and (e) a substrate. The composition may also include a crowding agent. The crowding agent may be selected from one or more of the following: dextran, dextran sulfate, dextrin, pullulan, poly(ethylene glycol), polysucrose (Ficoll™), and inert proteins.

According to another embodiment, there is provided a purified enzyme comprising the Flavonifractor plautii GalNAc deacetylase of SEQ ID NO: 2, SEQ ID NO: 4 or SEQ ID NO: 5.

According to another embodiment, there is provided a purified enzyme comprising the Flavonifractor plautii galactosaminidase of SEQ ID NO: 7, SEQ ID NO: 9 or SEQ ID NO: 10.

According to another embodiment, there is provided a purified enzyme comprising the Clostridium tertium GalNAc deacetylase of SEQ ID NO: 17 or SEQ ID NO: 32.

According to another embodiment, there is provided a purified enzyme comprising the Clostridium tertium galactosaminidase of SEQ ID NO: 19 or SEQ ID NO: 36.

The protein tag may be selected from one or more of the following: albumin binding protein (ABP); alkaline phosphatase (AP); AU1 epitope; AU5 epitope; Avi tag; bacteriophage T7 epitope (T7 tag); bacteriophage V5 epitope (V5 tag); biotin-carboxy carrier protein (BCCP); bluetongue virus tag (B-tag); single-domain camelid antibody (C-tag); calmodulin binding peptide (CBP or calmodulin tag); chloramphenicol acetyltransferase (CAT); cellulose binding domain (CBP); chitin binding domain (CBD); choline binding domain (CBD); dihydrofolate reductase (DHFR); DogTag; E2 epitope; E-tag; FLAG epitope (FLAG tag); galactose binding protein (GBP); green fluorescent protein (GFP); Glu-Glu (EE tag); glutathione S-transferase (GST); human influenza hemagglutinin (HA); HaloTag™; alternating histidine and glutamine tag (HQ tag); alternating histidine and asparagine tag (HN tag); histidine affinity tag (HAT); horseradish peroxidase (HRP); HSV epitope; Isopeptag (Isopep tag); ketosteroid isomerase (KSI); KT3 epitope; LacZ; luciferase; maltose binding protein (MBP); Myc epitope (Myc tag); NE tag; NusA; PDZ domain; PDZ ligand; polyarginine (Arg tag); polyaspartic acid (Asp tag); polycysteine (Cys tag); polyglutamic acid (Glu tag); polyhistidine (His tag); polyphenylalanine (Phe tag); Profinity eXact; protein C; Rho1D4 tag; S1 tag; S-tag; Softag 1; Softag 3; SnoopTagJr; SnoopTag; Spot-tag; SpyTag (Spy tag); streptavidin binding peptide (SBP); staphylococcal protein A (protein A); staphylococcal protein G (protein G); Strep-tag; streptavidin (SBP tag); Strep-tag II; SdyTag; small ubiquitin-like modifier (SUMO); tandem affinity purification (TAP); T7 epitope; tetracysteine tag (TC tag); thioredoxin (Trx); TrpE; Ty tag; ubiquitin; Universal; V5 tag; VSV-G or VSV tag; and Xpress tag.

According to another embodiment, there is provided a method for enzymatic cleavage of A antigen from a donor organ, the method comprising: (a) combining GalNAc deacetylase and galactosaminidase proteins with a donor organ displaying a type A antigen; and (b) perfusing the enzymes into the blood vessels of the donor organ for a time sufficient for the enzymes to cleave the A antigen from the lumen of the blood vessels of the donor organ.

The method may further comprise adding a crowding agent. The crowding agent may be selected from one or more of the following: dextran; dextran sulfate; dextrin; pullulan; poly(ethylene glycol); polysucrose (Ficoll™); hyperbranched glycerol; and inert proteins. The method may comprise perfusing the donor organ with an organ perfusion or organ preservation solution comprising the enzyme composition described herein.

The method may further comprise washing the donor organ to remove the GalNAc deacetylase, the galactosaminidase, and/or the crowding agent.

The GalNAc deacetylase and the galactosaminidase may be capable of cleaving the A antigen at 1 μg/ml or less. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.5 to about 7.5. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 4 °C to 37 °C.

The GalNAc deacetylase and the galactosaminidase may be capable of cleaving the A antigen at or below any one of the following concentrations (in μg/ml): 100; 90; 80; 70; 60; 50; 40; 30; 20; 15; 14; 13; 12; 11; 10; 9; 8; 7; 6; 5; 4; 3; 2; 1; 0.9; 0.8; 0.7; 0.6; 0.5; 0.4; 0.3; 0.2; 0.1; 0.09; 0.08; 0.07; 0.06; 0.05; 0.04; 0.03; 0.02; or 0.01.

The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.5 to about 7.5. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.0 to about 8.0. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.8 to about 7.8. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.9 to about 7.9. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a pH of about 6.4 to about 7.8.

The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 4 °C to 37 °C. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 3 °C to 38 °C. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 4 °C to 40 °C. The GalNAc deacetylase and the galactosaminidase can have A antigen-cleaving activity at a temperature of 5 °C to 37 °C.
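As a minimal sketch of how the recited pH and temperature ranges could be used in practice, the following checks whether a planned perfusion condition falls inside a given activity window. The class name `ActivityWindow` and its fields are hypothetical. The defaults encode the pH of about 6.5 to about 7.5 and the 4 °C to 37 °C range stated above, and the word "about" is treated as an exact bound for simplicity, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityWindow:
    """Hypothetical container for one recited pH/temperature range."""
    ph_min: float = 6.5
    ph_max: float = 7.5
    temp_min_c: float = 4.0
    temp_max_c: float = 37.0

    def contains(self, ph: float, temp_c: float) -> bool:
        """True if both pH and temperature lie inside the window (inclusive)."""
        return (self.ph_min <= ph <= self.ph_max
                and self.temp_min_c <= temp_c <= self.temp_max_c)


default_window = ActivityWindow()
wider_ph_window = ActivityWindow(ph_min=6.0, ph_max=8.0)

print(default_window.contains(ph=7.4, temp_c=37.0))
print(default_window.contains(ph=8.0, temp_c=20.0))
print(wider_ph_window.contains(ph=8.0, temp_c=20.0))
```

Falling inside a window only indicates that the condition lies within a recited range; it says nothing about how fast cleavage proceeds at that condition.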

According to another embodiment, there is provided a purified Flavonifractor plautii GalNAc deacetylase enzyme comprising SEQ ID NO: 2, SEQ ID NO: 4 or SEQ ID NO: 5.

According to another embodiment, there is provided a purified Flavonifractor plautii galactosaminidase enzyme comprising SEQ ID NO: 7, SEQ ID NO: 9 or SEQ ID NO: 10.

According to another embodiment, there is provided a purified enzyme comprising the purified Clostridium tertium GalNAc deacetylase and galactosaminidase fusion protein of SEQ ID NO: 14.

According to another embodiment, there is provided a vector comprising a nucleic acid as described herein and a heterologous nucleic acid sequence.

According to another embodiment, the method may be performed in vitro or ex vivo. Ex vivo as used herein means that the method is performed outside an organism. For example, ex vivo would include Ex Vivo Lung Perfusion (EVLP) and the processing of donated blood. As used herein, ex vivo refers to an experiment, measurement or treatment performed in or on tissue or cells (e.g., red blood cells or donor organs) from an organism in an external environment, where the tissue or cells are under conditions minimally or somewhat altered from those in vivo.

Brief Description of Drawings

FIG. 1 shows a schematic representation of the cell surface antigen carbohydrate structures of types A, H and B, ending in α-1,3-linked N-acetylgalactosamine (GalNAc) or galactose (Gal), wherein the triangles mark the cleavage points of the α-N-acetylgalactosaminidase EmGH109 and the α-galactosidase BfGal110.

FIG. 2 shows the deacetylation enzymatic pathway for cleavage of the A antigen, whereby the Flavonifractor plautii (Fp) GalNAc deacetylase deacetylates the terminal α-N-acetylgalactosamine of the A antigen (-42 m/z), and the resulting galactosamine intermediate is then cleaved by the Fp galactosaminidase (-161 m/z), as analyzed by the corresponding mass spectrometry (MS).

FIG. 3 shows FACS analysis of A+ RBCs treated for 1 h at 37 °C with different concentrations of EmGH109 or of Flavonifractor plautii GalNAc deacetylase (FpGalNAcDeAc) plus Flavonifractor plautii galactosaminidase (FpGalNase), wherein anti-H antibodies (plus secondary FITC-labeled antibody) and APC-labeled anti-A antibodies are used for visualization, with the region where H antigen appears boxed in the upper left corner. Rows A-D compare EmGH109 and FpGalNAcDeAc + FpGalNase at 5 μg/ml (A); 10 μg/ml (B); 50 μg/ml (C) and 50 μg/ml + dextran 40k (D).

FIG. 4 shows a comparison of EmGH109 and FpGalNAcDeAc + FpGalNase at various enzyme concentrations with (■) and without (◆) dextran at different temperatures (i.e., 4 °C, room temperature (RT) and 37 °C).

FIG. 5 shows HPAE-PAD analysis of the cleavage products from A+, B+ and O+ erythrocytes, and a comparison on A+ erythrocytes of the full-length Flavonifractor plautii GalNAc deacetylase (FpGalNAcDeAc) + Flavonifractor plautii galactosaminidase (FpGalNase) enzymes with the truncated FpGalNAcDeAc + FpGalNase enzymes.

FIG. 6 shows the pH profile of each of (A) Fp GalNAc deacetylase and (B) Fp galactosaminidase.

FIG. 7 shows the conversion of A antigen to H antigen on A+ RBCs by FACS analysis for (A) A+ RBC control, (B) Flavonifractor plautii GalNAc deacetylase (FpGalNAcDeAc) + Flavonifractor plautii galactosaminidase (FpGalNase) (10 μg/mL), (C) FpGalNAcDeAc + Clostridium tertium (Ct) Ct5757577_GalNase (10 μg/mL) and (D) GalNAcDeAc + Robinsoniella peoriensis (Rp) galactosaminidase (Rp1021 GalNase) (10 μg/mL).

FIG. 8 shows removal of A antigen from type A human erythrocytes by the enzymes in different perfusion solutions (i.e., PBS, Steen™ and Perfadex™).

Figure 9 shows the dose escalation effect of the enzymes on type A human arteries in STEEN solution, where the percentage of A antigen was quantified by immunohistochemical analysis of biopsies obtained from untreated (control) and treated type A arteries, with type O arteries as a negative control.

Figure 10 shows the effect of a 1 hour enzymatic treatment on ex vivo perfused human donor lungs, where immunohistochemical staining of biopsies from the donor lungs compares pre-treatment images with post-treatment images of the right upper dependent (RUD), right upper non-dependent (RUND), right middle non-dependent (RMND), right middle dependent (RMD), right lower non-dependent (RLND) and right lower dependent (RLD) areas of the lungs; blood group A antigen was not present in the blood vessels.

Figure 11 shows the effect of a 3 hour enzymatic treatment on ex vivo perfused human donor lungs, where immunohistochemical staining of biopsies from the donor lungs compares pre-treatment images with post-treatment images of the right upper dependent (RUD), right upper non-dependent (RUND), right middle non-dependent (RMND), right middle dependent (RMD), right lower non-dependent (RLND) and right lower dependent (RLD) areas of the lungs; blood group A antigen was not present in the blood vessels.

Detailed description of the invention

The following detailed description will be better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, the drawings show embodiments of the invention. The invention, however, is not limited to the precise arrangements, examples, and instrumentalities shown.

Any terms not directly defined herein should be understood to have the meanings commonly associated therewith as understood within the field of the present invention.

As used herein, an "immobilized enzyme" is an enzyme that is attached to a surface, which may be an inert insoluble material. Immobilization of enzymes may provide enhanced resistance to changes in conditions (e.g., pH, temperature, etc.) and facilitate their removal after use and reuse of the enzymes.

Immobilization of the enzyme may be achieved in various ways (e.g., affinity tag binding, surface adsorption on glass, resin, alginate beads or matrices, bead, fiber or microsphere entrapment, cross-linking to surfaces or other enzymes and covalent binding to surfaces).

As used herein, "affinity tag binding" refers to immobilization of an enzyme to a surface via non-covalent or covalent protein tags (e.g., using a porous material). Affinity tag binding has been used for protein purification and has more recently been used in EziG™ (EnginZyme AB™, Sweden; see, for example, PCT/US1992/010113 and PCT/SE2015/050108). Alternative systems for attaching active enzymes to surfaces are known in the art (see e.g. US 4088538; US 4141857; US 4206259; US 4218363; US 4229536; US 4239854; US 4619897; US 4748121; US 4749653; US 4897352; US 4954444; US 4978619; US 5154808; US 5914367; US 9335962279; US 60306291582; US 6254645; US10,016,490 and US10,041,055).

Protein tags are peptide sequences that are genetically grafted onto recombinant proteins, usually removable by chemical reagents or by enzymatic methods, and are attached to the protein for various purposes. The protein tags listed in table a are intended as examples and are not intended to be limiting in any way. One type of protein tag is an affinity tag that is added to a protein or peptide sequence so that they can be purified from crude biological sources using affinity techniques (e.g., from expression system organisms), or to facilitate the immobilization of "tagged" proteins onto a surface. Some examples of affinity tags include Chitin Binding Domain (CBD), Maltose Binding Protein (MBP), Strep tag, glutathione-S-transferase (GST), and polyhistidine (His tag) bound to a metal matrix. Another type of protein tag is an epitope tag (e.g., including the V5 tag, Myc tag, HA tag, Spot tag, and NE tag), which are short peptide sequences selected for ease of generating high affinity antibodies, and are typically derived from viral gene sequences to improve immune reactivity. Epitope tags are particularly useful in western blot, immunofluorescence and immunoprecipitation experiments, although they are also useful for protein to surface purification and immobilization. Another type of protein tag is a chromatographic tag (e.g., a polyanionic amino acid, such as a FLAG tag), which can be used to alter the chromatographic properties of a protein to aid in isolation and purification or immobilization. Additional protein tags are solubilization tags (e.g., Maltose Binding Protein (MBP), glutathione S-transferase (GST), Thioredoxin (TRX), and poly (NANP)) and fluorescent tags (e.g., Green Fluorescent Protein (GFP)). Protein tags may allow specific enzymatic modification, chemical modification or linking of proteins to other components. 
However, depending on the type or number of tags added to the protein sequence, the native function of the protein (in this case, enzymatic function) may be impaired by the tag. Thus, the protein tag needs to be selected to ensure that the activity of the enzyme is not compromised, or alternatively, the protein tag can be cleaved from the protein prior to use.

Table A: Exemplary protein tags

In the present application, the use of protein tags is exemplified by a polyhistidine protein tag (His-tag) as shown in SEQ ID NOs: 5, 10, 15, 17, 19, 21, 23, 25, 27, 29 and 31, but the skilled person will readily appreciate that any number of other protein tags may be used to purify the enzyme and/or to attach the enzyme to a surface as described herein, depending on the purification method used and/or the surface to which the enzyme is attached. Such protein tags may be selected from any one or more of the protein tags listed in Table A, but other such protein tags are known in the art.

In addition, one or more cleavage sites (e.g., thrombin cleavage sites as used in SEQ ID NOs: 15, 17, 19, 21, 23, 25, 27, 29, and 31) can be employed to release a protein tag from an enzyme or otherwise cleave an enzyme. Cleavage sites may be used to remove the N-terminal methionine or signal peptides, and/or to convert inactive or non-functional proteins (i.e., zymogens) into active proteins. Alternatively, cleavage sites may be used to separate two or more enzymes expressed in the same reading frame. Examples of enzymes or reagents capable of cleaving a protein or peptide at a sequence-specific cleavage site may be selected from one or more of the following: Arg-C protease; Asp-N endopeptidase; Asp-N endopeptidase + N-terminal Glu; BNPS-skatole; caspase 1; caspase 2; caspase 3; caspase 4; caspase 5; caspase 6; caspase 7; caspase 8; caspase 9; caspase 10; chymotrypsin-high specificity (C-terminal to [FYW], not before P); chymotrypsin-low specificity (C-terminal to [FYWML], not before P); clostripain (clostridiopeptidase B); CNBr; enterokinase; factor Xa; formic acid; glutamyl endopeptidase; granzyme B; hydroxylamine; iodosobenzoic acid; LysC; LysN; NTCB (2-nitro-5-thiocyanobenzoic acid); neutrophil elastase; pepsin (pH 1.3); pepsin (pH > 2); proline-endopeptidase; proteinase K; staphylococcal peptidase I; tobacco etch virus protease; thermolysin; thrombin; and trypsin.

The skilled person will understand that what is important is the combination of an active galactosaminidase and an active GalNAc deacetylase capable of efficiently cleaving the A antigen as described herein. The skilled person will also understand that the addition of one or more cleavage sites and/or one or more protein tags is optional, and that such modifications can be selected based on the specific expression system, purification system and possible surface attachment strategy. In addition, other modifications of the galactosaminidase and GalNAc deacetylase sequences are possible, as long as the A antigen cleavage activity is not significantly impaired. The modifications of the galactosaminidase and GalNAc deacetylase sequences may be deletions, insertions and/or substitutions. A substitution may be a conservative substitution or a neutral substitution. For example, the galactosaminidase and GalNAc deacetylase sequences may share 90% or greater sequence identity with the mature enzyme. For example, the galactosaminidase and GalNAc deacetylase sequences may share 85% or greater sequence identity with the mature enzyme. For example, the galactosaminidase and GalNAc deacetylase sequences may share 75% or greater sequence identity with the mature enzyme. Alternatively, 5%, 10%, 13%, 15%, 20%, or up to 25% of the amino acids of the galactosaminidase and GalNAc deacetylase sequences may be modified.
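As an illustration of the identity thresholds above, the following minimal sketch computes percent identity between two pre-aligned sequences and checks it against a cutoff. It is an assumption-laden example, not the method used herein: the function names are hypothetical, the toy peptides are invented, and the choice of denominator (all aligned columns, excluding columns that are gaps in both sequences) is only one of several conventions for reporting identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings are already aligned (same length, '-' for gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_variant: str, aligned_reference: str,
                             threshold: float = 90.0) -> bool:
    """True if the variant shares at least `threshold` percent identity."""
    return percent_identity(aligned_variant, aligned_reference) >= threshold


# Hypothetical toy peptides (not sequences from this document).
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQA"  # one substitution in ten positions
print(percent_identity(variant, reference))
print(meets_identity_threshold(variant, reference, 90.0))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the final counting step.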

As used herein, "adsorbed on glass, alginate beads or a matrix" refers to attaching an enzyme to the exterior of an inert material. Generally, this type of immobilization is not caused by chemical reactions, and the active sites of the immobilized enzyme can be blocked by the surface on which it is adsorbed, which can reduce the activity of the adsorbed enzyme.

"entrapping" as used herein refers to capturing the enzyme within insoluble beads or microspheres. However, entrapment may prevent the arrival of substrate and the exit of product. One example is the use of calcium alginate beads, which can be produced by reacting a mixture of a sodium alginate solution and an enzyme solution with calcium chloride.

"Cross-linking" as used herein refers to covalent bonding of enzymes to each other to produce a matrix consisting almost exclusively of enzymes. When designing a cross-linking enzyme reaction, the binding site ideally does not cover the active site of the enzyme, so that the activity of the enzyme is affected only by immobilization, and not by blocking of the active site of the enzyme. However, spacer molecules such as poly (ethylene glycol) may be used to reduce steric hindrance of the substrate.

"covalently bonded" as used herein refers to the bonding of an enzyme to an insoluble support or surface (e.g., silica gel) via a covalent bond. Due to the strength of the covalent bond between the enzyme and the support or surface, the possibility of the enzyme detaching from the support or surface is much less.

As used herein, "crowding agent" refers to any polymer or protein that promotes macromolecular crowding by concentrating the enzyme on the cell surface to improve the activity of the enzyme. The crowding agent may be, for example, dextran sulfate, dextrin, pullulan, poly(ethylene glycol), Ficoll™, hyperbranched glycerol or an inert protein (Kuznetsova, I.M. et al., Int J Mol Sci. (2014) "What Macromolecular Crowding Can Do to a Protein" 15(12): 23090-23140).

As used herein, "dextran" refers to polysaccharides having a molecular weight ≥ 1,000 daltons and a linear backbone of α-linked D-glucopyranosyl repeat units. Dextrans can be divided into 3 structural classes (i.e., classes 1-3) based on the pyranose ring structure, which contains five carbon atoms and one oxygen atom. Class 1 dextrans contain an α(1→6)-linked D-glucopyranosyl backbone modified with small side chains of D-glucose branches with α(1→2), α(1→3), and α(1→4) linkages. Class 1 dextrans vary in their molecular weight, spatial arrangement, type and degree of branching, and branch length, depending on the microbial production strain and culture conditions. Isomaltose and isomaltotriose are oligosaccharides with the class 1 dextran backbone structure. Class 2 dextrans (alternans) contain a backbone structure of alternating α(1→3)- and α(1→6)-linked D-glucopyranosyl units with α(1→3)-linked branches. Class 3 dextrans (mutans) have a backbone structure of consecutive α(1→3)-linked D-glucopyranosyl units with α(1→6)-linked branches.

As used herein, "pullulan" is a structural polysaccharide produced from starch primarily by the fungus Aureobasidium pullulans, and consists of repeating α(1→6)-linked maltotriose (D-glucopyranosyl-α(1→4)-D-glucose) units with an occasional maltotetraose unit.

"dextrin" as used herein refers to a D-glucopyranosyl unit of shorter chain length than dextran, starting with a single alpha (1 → 6) linkage, but continuing linearly with an alpha (1 → 4) linked D-glucopyranosyl unit.

As used herein, "Ficoll™" is a neutral, highly branched, high-mass hydrophilic polysaccharide that dissolves readily in aqueous solution.

As used herein, "perfusion" refers to the infiltration of an organ with fluid by circulating the fluid through a blood vessel.

An important goal in organ preservation is to increase the number of transplantable organs available. Ordinarily, organs are kept in cold storage, but this has potential diffusion limitations, and therefore cold perfusion systems have been developed. In addition, near-normothermic systems have also been used to enhance the functional preservation of solid organs including the liver, lungs, heart and kidneys. Many buffered extracellular solutions are used as perfusion solutions or preservation solutions, and many such solutions are known: for example, Steen™, Perfadex™, Perfadex Plus™, EuroCollins solution, histidine-tryptophan-ketoglutarate (HTK) solution, University of Wisconsin solution (UW), Celsior solution, kidney perfusion solution (KPS-1), Kyoto University solution, IGL-1 solution, and citrate solution (Guibert, E.E., et al. (2011)). Many of these are commercially available, and variations of these solutions will be apparent to those skilled in the art.

Various alternative embodiments and examples are described herein. These embodiments and examples are illustrative and should not be construed as limiting the scope of the invention.

Materials and methods

Unless otherwise indicated, chemicals and commercial enzymes used in this study were purchased from Sigma-Aldrich™. The monosaccharide methylumbelliferyl glycosides were a generous gift from Dr. Hongming Chen, and subtype 1 A antigen penta-MU was a generous gift from Dr. David Kwan (Kwan et al., 2015).

Human faecal metagenomic library

To generate a human metagenomic fosmid library, fresh human faecal samples were collected from a healthy Asian male volunteer of blood type AB+. Direct DNA extraction and fosmid library creation were performed according to the procedure described in the MoE protocol (Armstrong et al., 2017).

Fosmid library screening

Fifty-one 384-well AB+ fosmid library plates were thawed at room temperature and replicated into 384-well plates containing 50 μL of screening LB medium (12.5 μg/mL chloramphenicol, 25 μg/mL kanamycin, 100 μg/mL arabinose, 0.2% (v/v) maltose, 10 mM MgSO4). The plates were incubated at 37 °C for 18 hours in a sealed container containing a water reservoir to prevent excessive evaporation. Using a Qfar™ instrument [Genetix™], microliter volumes of the reaction mixture (100 mM NaH2PO4 pH 7.4, 2% (v/v) Triton-X100, 100 μM GalNAc-α-MU, 100 μM Gal-α-MU) were added to the grown screening plates. The plates were then incubated in the sealed container at 37 °C for 24 h, and the fluorescence of each plate was measured at 1, 2, 4, 8 and 24 hours on a Synergy H1 plate reader [BioTek™] (Ex: 365 nm, Em: 435 nm, sweep mode, gain 80). For all wells, a Z score was calculated, given by: Z score = (fluorescence value - median) / standard deviation.
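The Z score defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the median and standard deviation are taken over all wells of a single plate, the population standard deviation is used (the text does not say whether the sample or population form was applied), and both the function names and the hit threshold are hypothetical, since the text only refers to "a certain threshold".

```python
from statistics import median, pstdev


def plate_z_scores(fluorescence):
    """Z score per well: (fluorescence value - plate median) / plate standard deviation."""
    values = list(fluorescence)
    med = median(values)
    sd = pstdev(values)  # assumption: population standard deviation over the plate
    if sd == 0:
        raise ValueError("all wells have identical fluorescence; Z score undefined")
    return [(v - med) / sd for v in values]


def hit_wells(fluorescence, threshold):
    """Indices of wells whose Z score exceeds a user-chosen (hypothetical) threshold."""
    return [i for i, z in enumerate(plate_z_scores(fluorescence)) if z > threshold]


# Toy readings in arbitrary fluorescence units; a real plate would have 384 wells.
readings = [100, 100, 100, 100, 500]
print(plate_z_scores(readings))
print(hit_wells(readings, threshold=2.0))
```

Centering on the median rather than the mean keeps a handful of strongly fluorescent wells from shifting the baseline, which suits a screen where most wells are expected to be inactive.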

All positive hits above a certain threshold were rearrayed into a new 384-well plate, called the "simple substrate hit" plate, and stored at -70 °C. Two screening plates were replicated from the "simple substrate hit" plate and rescreened for GalNAc-α-MU or Gal-α-MU activity, to verify and deconvolute the previously detected activity.

To determine which hits cleaved A antigen or B antigen structures, their activity on 50 μM subtype 1 A antigen tetra-MU or 50 μM subtype 1 B antigen tetra-MU was determined using a coupled enzyme assay. One version of this coupled assay was previously described by Kwan (Kwan et al., 2015). By using BgaC (Jeong 2009) instead of BgaA (Singh 2014) as a coupling enzyme, our assay was modified to also detect cleavage of subtype 1 A antigen. A potential α-N-acetylgalactosaminidase or α-galactosidase will cleave the terminal sugar, releasing subtype 1 H antigen tri-MU. Subsequently, α-fucosidase (AfcA (Katayama 2004)), β-galactosidase (BgaC (Jeong 2009)) and β-hexosaminidase (SpHex (Williams 2002)) cleave the remaining sugars in an exo fashion until 4-methylumbelliferone is released, which is detectable as an increase in fluorescence. To achieve this, 50 μg/mL of each enzyme was added to the reaction mixture. All positive hits above a certain threshold were screened again in triplicate, and a host cell strain containing the vector lacking any insert was used as a negative control. All validated hits were stored individually at -70 °C in LB medium (12.5 μg/mL chloramphenicol, 25 μg/mL kanamycin, 15% (v/v) glycerol, 0.2% (v/v) maltose, 10 mM MgSO4).

Fosmid hit sequencing

To isolate fosmid DNA for sequencing, fosmid glycerol stocks of the positive hits were used to inoculate 5 mL of TB medium (12.5 μg/mL chloramphenicol, 25 μg/mL kanamycin, 100 μg/mL arabinose, 0.2% (v/v) maltose, 10 mM MgSO4), and the cultures were incubated overnight at 37 °C and 220 rpm. Fosmids were isolated using the GeneJET™ plasmid miniprep kit (Thermo Fisher™). Isolated fosmids were purified from contaminating linear E. coli DNA using Plasmid-Safe™ ATP-dependent DNase (Epicentre™), followed by another round of purification using the GeneJET™ PCR purification kit (Thermo Fisher™). The concentration was determined on a Qubit™ fluorometer (Thermo Fisher™) using the Quant-iT™ dsDNA HS assay kit (Invitrogen™). The expected DNA size was verified on a 1% agarose gel. For full fosmid sequencing, 2 ng of each fosmid was sent to the UBC Sequencing Centre (Vancouver, BC, Canada). Each fosmid was individually barcoded and sequenced on the Illumina MiSeq™ system.

All Illumina MiSeq™ raw sequence data were trimmed and assembled using a python script available on GitHub™ (https://github.com/hallamlab/FabFos). Briefly, Trimmomatic was used to remove adaptors and low quality sequences from the reads (Bolger 2014). The reads were screened for vector and host sequences using BWA (Li 2013), followed by filtering with Samtools™ and a bam2fastq script to remove contaminants. These high quality, purified reads were assembled by MEGAHIT with k-mer values ranging from 71 to 241, increasing in increments of 10 (Li 2015). Since these libraries typically have more than 20,000-fold coverage, and to prevent accumulated sequencing errors from interfering with correct sequence assembly, the minimum k-mer multiplicity was calculated as 1% of the estimated coverage of the fosmid. Then, outside the python script, assemblies that yielded more than one contig were processed using minimus2 (Treangen 2011). The parameterized commands can be found on the GitHub™ page and in the python script itself.
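The assembly parameters described above reduce to simple arithmetic, sketched below. This is not the FabFos script: the function names are hypothetical, the coverage estimate (total sequenced bases divided by insert length) is an assumed approximation, and the read count, read length and insert length in the example are invented for illustration.

```python
def kmer_sizes(k_min: int = 71, k_max: int = 241, step: int = 10) -> list:
    """k-mer values from k_min to k_max inclusive, in increments of `step`."""
    return list(range(k_min, k_max + 1, step))


def estimated_coverage(n_reads: int, read_length: int, insert_length: int) -> float:
    """Rough fold coverage: total sequenced bases divided by insert length (assumption)."""
    return n_reads * read_length / insert_length


def min_kmer_multiplicity(coverage: float, fraction: float = 0.01) -> int:
    """Minimum k-mer multiplicity as a fraction (1% in the text) of estimated coverage."""
    return max(1, round(coverage * fraction))


# Hypothetical library: 4,000,000 reads of 250 bp over a 40 kb fosmid insert.
coverage = estimated_coverage(4_000_000, 250, 40_000)
print(kmer_sizes())
print(coverage, min_kmer_multiplicity(coverage))
```

Scaling the multiplicity cutoff with coverage means that k-mers arising from rare sequencing errors, which occur far less often than true k-mers at such depth, are discarded before assembly.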

Fosmid ORF prediction and hit validation

Fosmid ORFs were identified using the metagenomic version of Prodigal™ (Hyatt 2010) and compared against the CAZy™ database using BLASTP™ as part of the MetaPathways™ v2.5 software package (Konwar 2015). MetaPathways™ parameters: length > 60, BLAST score > 20, BLAST score ratio > 0.4, E value < 1×10^-6.
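The annotation cutoffs listed above can be read as a simple filter over BLAST hits, sketched here. This is not the MetaPathways™ interface: the `BlastHit` record, its field names and the example hits are hypothetical, and whether "length" refers to the ORF or to the alignment is not stated in the text, so the field is left generic.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical record for one ORF-versus-database hit."""
    orf_id: str
    family: str
    length: int
    score: float
    score_ratio: float
    e_value: float


def passes_cutoffs(hit: BlastHit,
                   min_length: int = 60,
                   min_score: float = 20.0,
                   min_ratio: float = 0.4,
                   max_e_value: float = 1e-6) -> bool:
    """Apply the four thresholds from the text; all comparisons are strict."""
    return (hit.length > min_length
            and hit.score > min_score
            and hit.score_ratio > min_ratio
            and hit.e_value < max_e_value)


# Invented example hits, for illustration only.
hits = [
    BlastHit("orf_1", "GH36", 420, 310.0, 0.62, 1e-80),
    BlastHit("orf_2", "CBM32", 55, 45.0, 0.50, 1e-10),   # fails the length cutoff
    BlastHit("orf_3", "GH109", 380, 95.0, 0.35, 1e-20),  # fails the score ratio cutoff
]
print([h.orf_id for h in hits if passes_cutoffs(h)])
```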

Using the Golden Gate™ cloning strategy (Engler 2008), all predicted ORFs annotated as GH or CBM family members (with known or suspected α-galactosidase and/or α-N-acetylgalactosaminidase activity) were cloned into the pET16b plasmid, with the primer sequences listed in Table B. The proteins were expressed in BL21(DE3) cultured at 37 °C and 220 rpm for 20 h in 10 mL of ZY5052 auto-induction medium (Studier 2005). Cells were harvested by centrifugation (4000 × g, 4 °C, 10 min) and resuspended in 1 mL lysis buffer (100 mM NaH2PO4 pH 7.4, 2% (v/v) Triton™ X-100, 1× EDTA-free protease inhibitor [Pierce™]). The coupled assay (Kwan 2015) was performed by mixing 50 μL assay buffer (100 mM NaH2PO4 pH 7.4, 50 μg/mL SpHex, 50 μg/mL AfcA, 50 μg/mL BgaC, and 100 μM subtype 1 A antigen tetra-MU or 100 μM subtype 1 B antigen tetra-MU) with 50 μL of candidate crude cell lysate and incubating at 37 °C. All reactions were performed in triplicate in black 96-well plates. Fluorescence (365/435 nm) was monitored continuously for 4 hours on a Synergy™ H1 plate reader [BioTek™]. Assays of crude extracts showing cleavage activity on the A or B antigen were repeated, this time without the coupling enzymes, and the reaction products were separated via an HF Bond Elut C18 column and analyzed by LC-MS and/or TLC. TLC was performed on TLC silica gel 60 F254 plates [EMD Millipore Corp.™, Billerica, MA, USA].

Table B: Primer sequences

HPAE-PAD assay

Enzymatic release of galactosamine was analyzed on an HPAE-PAD (Dionex™) HPLC system. The cleavage activity of different proteins was tested on the following substrates: 7.5 μg/μL type II mucin from porcine stomach in 100 mM NaH2PO4 (pH 7.4); 5 mM subtype 1 A antigen penta-MU in 100 mM NaH2PO4 (pH 7.4); and RBCs (50% hematocrit) from A+ type, B+ type and O type donors in 1× PBS (pH 7.4). Samples containing 10 μg/mL enzyme were incubated at 37 °C for 2 hours and then stored at -80 °C for further analysis. A small aliquot (10 μl) of the reaction was diluted in H2O (100 μl) and analyzed on the HPAE-PAD instrument. Separation was performed on a CarboPac PA200™ (150 mm) column with a guard column, and detection used a disposable gold-on-polytetrafluoroethylene (PTFE) electrode and a four-potential waveform. The separation conditions were as follows: 100 mM sodium hydroxide with a sodium acetate gradient from 70 to 300 mM over the first 10 minutes of separation. The eluent was held at the final gradient condition for 1 min and then returned to the starting condition over the next minute. The flow rate was 1.0 ml/min and injections were made every 27 minutes. Standards of the free sugars GalNAc, Gal1 and GalN (10 μM) were also run on the HPAE-PAD to determine peak elution times for reference.

Kinetic determination

All kinetic determinations using 4-methylumbelliferone as leaving group were performed by fluorescence measurements. To avoid measurement errors arising from the inner filter effect (Palmier 2007), a standard curve was used to verify the linear range of the fluorophore.

Fp galactosaminidase

The Michaelis-Menten parameters for subtype 1 GalN antigen penta-MU and subtype 1 A antigen penta-MU were determined at 37 °C in 100 mM NaH2PO4 (pH 7.4). Reactions were carried out in 100 μL with 3.4 nM Fp galactosaminidase (5.31 nM FpGalNase-truncated), 0.1 mg/mL SpHex, AfcA, 0.2 mg/mL BgaC and different concentrations of substrate (5 μM-2 mM). Reactions were run in a series of four replicates with duplicate controls (no Fp galactosaminidase). The fluorescent signal (365/435 nm) generated by the hydrolytic release of MU was monitored on a Synergy H1™ plate reader [BioTek™] and converted to concentration using an MU standard concentration curve determined under the same reaction conditions. Initial rates (μM/s) were determined and plotted in Grafit 7.0™ to determine kinetic parameters.
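For readers who want to see how initial rates map to Michaelis-Menten parameters, the sketch below estimates Vmax and KM from (substrate, rate) pairs. It deliberately swaps in a Hanes-Woolf linearization ([S]/v against [S]) with ordinary least squares so that it runs with the standard library alone; it is not the fitting routine of the Grafit™ software named above, and the data are synthetic values generated from assumed parameters, not measurements from this study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hanes_woolf(substrate_uM, rate_uM_per_s):
    """Estimate (Vmax, KM) from [S]/v = [S]/Vmax + KM/Vmax."""
    ratios = [s / v for s, v in zip(substrate_uM, rate_uM_per_s)]
    slope, intercept = linear_fit(substrate_uM, ratios)
    vmax = 1.0 / slope       # slope = 1 / Vmax
    km = intercept * vmax    # intercept = KM / Vmax
    return vmax, km


def kcat_per_s(vmax_uM_per_s, enzyme_nM):
    """kcat = Vmax / [E], with the enzyme concentration converted from nM to uM."""
    return vmax_uM_per_s / (enzyme_nM / 1000.0)


# Synthetic, noise-free data from assumed Vmax = 0.02 uM/s and KM = 100 uM.
substrate = [5.0, 20.0, 50.0, 100.0, 500.0, 2000.0]
rates = [0.02 * s / (100.0 + s) for s in substrate]
vmax, km = hanes_woolf(substrate, rates)
print(vmax, km, kcat_per_s(vmax, enzyme_nM=3.4))
```

With real, noisy data a direct nonlinear fit of the Michaelis-Menten equation is generally preferable, since linearized forms weight measurement errors unevenly.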

The kcat/KM parameters for subtype 1/2/4 GalN antigen tetra-MU and subtype 1 B antigen tetra-MU were determined at pH 7.4 and 37 °C. Reactions were performed in black 96-well plates (total volume 100 μL) in 100 mM NaH2PO4 (pH 7.4) containing 8.63 nM Fp galactosaminidase, 0.1 mg/mL SpHex, BgaC (BgaA for subtype 2), AfcA, and substrate at different concentrations (25 μM, 20 μM, 15 μM, 10 μM, 7.5 μM, 5 μM). Reactions were run in a series of four replicates with duplicate controls (no Fp galactosaminidase). The fluorescent signal (365/435 nm) generated by the hydrolytic release of MU was monitored on a Synergy H1™ plate reader [BioTek™] and converted to concentration using an MU standard concentration curve determined under the same reaction conditions. Initial rates (μM/s) were determined and plotted in Grafit 7.0™ to determine the kcat/KM (s⁻¹ mM⁻¹) parameters.
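One common way to obtain kcat/KM from low substrate concentrations such as those listed above relies on the limit [S] << KM, where the Michaelis-Menten equation reduces to v ≈ (kcat/KM)·[E]·[S]. The sketch below applies that approximation; whether this is the exact procedure used here is an assumption, as are the function names, the choice of a fit through the origin, and the synthetic rates.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope for y = slope * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def kcat_over_km(substrate_uM, rate_uM_per_s, enzyme_nM):
    """kcat/KM in s^-1 mM^-1, assuming [S] << KM so that v = (kcat/KM) * [E] * [S]."""
    slope_per_s = slope_through_origin(substrate_uM, rate_uM_per_s)  # uM/s per uM = s^-1
    enzyme_mM = enzyme_nM * 1e-6                                     # nM -> mM
    return slope_per_s / enzyme_mM


# Synthetic data generated from an assumed kcat/KM of 50 s^-1 mM^-1 at 8.63 nM enzyme.
substrate = [25.0, 20.0, 15.0, 10.0, 7.5, 5.0]
assumed_slope = 50.0 * 8.63e-6  # s^-1
rates = [assumed_slope * s for s in substrate]
print(kcat_over_km(substrate, rates, enzyme_nM=8.63))
```

If the rate plot shows visible curvature across the chosen concentrations, the [S] << KM assumption does not hold and the full Michaelis-Menten equation should be fitted instead.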

The Michaelis-Menten parameters for GalN-α-pNP were determined at 37 °C in clear 96-well plates, with 863.2 nM Fp galactosaminidase (in 100 mM NaH2PO4, pH 7.4) or 369.9 nM FpGH4 (in 50 mM Tris/HCl (pH 7.4), 100 μM NAD+, 1 mM MnCl2) and various concentrations of substrate (10 μM-5 mM) in a volume of 100 μl. Reactions were run in a series of three replicates with two controls (no enzyme). The absorbance (at 405 nm) resulting from the hydrolytic release of pNP was monitored on a Synergy H1™ plate reader [BioTek™] and converted to concentration using a standard concentration curve for p-nitrophenol determined under the same reaction conditions. Initial rates (μM/s) were determined and plotted in Grafit 7.0™ to determine kinetic parameters.

Fp GalNAc deacetylase

The Michaelis-Menten parameters for subtype 1 A antigen penta-MU were determined at 37 °C in 100 mM NaH2PO4 (pH 7.4) using the previously described coupled assay (Kwan 2015). The assay was modified to allow detection of cleavage of subtype 1 (and later 4) by using BgaC (Jeong 2009) instead of BgaA (Singh 2014) as the β-galactosidase. In addition, because subtype 1 A antigen penta-MU contains an additional galactose, the concentration of BgaC was increased to 0.2 mg/mL to compensate for its need to cleave both Gal-β-1,3-GlcNAc-β-1,3-Gal-β-MU and Gal-β-MU. Fp galactosaminidase was also included to allow cleavage of galactosamine-containing intermediates. Reactions were set up in 100 μL with 3 nM Fp GalNAc deacetylase (4.52 nM FpGalNAcDeAc_D1ext, 3.55 nM FpGalNAcDeAc_D1+2), 0.01 mg/mL Fp galactosaminidase, 0.1 mg/mL SpHex, AfcA, 0.2 mg/mL BgaC and different concentrations of substrate (5 μM-2.5 mM). Reactions were run in a series of four replicates with duplicate controls (no Fp GalNAc deacetylase). The fluorescent signal (365/435 nm) generated by the hydrolytic release of MU was monitored on a Synergy H1™ plate reader [BioTek™] and converted to concentration using an MU standard concentration curve determined under the same reaction conditions. Initial rates (μM/s) were determined and plotted in Grafit 7.0™ to determine kinetic parameters.

The kcat/KM parameters for subtype 1/2/4 A antigen tetra-MU were determined at pH 7.4 and 37 °C. Reactions were performed in black 96-well plates (total volume 100 μL) in 100 mM NaH2PO4 (pH 7.4) containing 12 nM Fp GalNAc deacetylase, 0.1 mg/mL SpHex, BgaC (BgaA for subtype 2), AfcA, and various concentrations of substrate (25 μM, 20 μM, 15 μM, 10 μM, 7.5 μM, 5 μM). Reactions were run in a series of four replicates with duplicate controls (no Fp GalNAc deacetylase). The fluorescent signal (365/435 nm) generated by the hydrolytic release of MU was monitored on a Synergy H1™ plate reader [BioTek™] and converted to concentration using an MU standard concentration curve determined under the same reaction conditions. Initial rates (μM/s) were determined and plotted in Grafit 7.0™ to determine the kcat/KM (s⁻¹ mM⁻¹) parameters.

Kinetics of the GH109 subtype

The kcat/KM parameters for subtype 1/2/4 A antigen tetra-MU were determined at pH 7.4 and 37 °C. Reactions were performed in black 96-well plates (total volume 100 μL) in 100 mM NaH2PO4 (pH 7.4) containing, respectively, 86.02 nM BvGH109_1 / 100.49 nM BvGH109 / 80.52 nM BvGH109_2 / 87.4 nM NAD+, together with 0.1 mg/mL SpHex, BgaC (BgaA for subtype 2), AfcA, and different concentrations of substrate (25 μM, 20 μM, 15 μM, 10 μM, 7.5 μM, 5 μM). Reactions were run in a series of four replicates with duplicate controls (no α-N-acetylgalactosaminidase). The fluorescent signal (365/435 nm) generated by the hydrolytic release of MU was monitored on a Synergy H1™ plate reader [BioTek™] and converted to concentration using an MU standard concentration curve determined under the same reaction conditions. Initial rates (μM/s) were determined and plotted in Grafit 7.0™ to determine the kcat/KM (s⁻¹ mM⁻¹) parameters.

Crystallography

Prior to crystallization, FpGalNAcDeAc_D1ext was digested overnight with thrombin (Novagen™) at a concentration of 1 mg/mL using the manufacturer's recommended protocol. The protein was then passed over a HisTrap FF column, and the flow-through was collected, buffer-exchanged into 10 mM Tris (pH 8.0) + 75 mM NaCl and concentrated to 12 mg/mL.

Crystallization

FpGalNAcDeAc_D1ext (12 mg/mL) was crystallized by the hanging-drop diffusion method at a 1:1 protein:mother liquor ratio, using a mother liquor composed of 0.2 M CaCl2, 0.1 M MES (pH 6), 18% PEG 4000 and 20 mM MnCl2. Crystals for phasing were derivatized by a rapid bromide soak: crystals were transferred to a solution of 1 M NaBr, 25% glycerol, 18% PEG 4000, 20 mM CaCl2 and 0.1 M MES for 30 seconds and flash-frozen in liquid nitrogen. Crystals of the complex with the blood group B antigen trisaccharide (B_tri) were prepared by pre-incubating the protein (12 mg/mL) with 10 mM B_tri for 2 hours before setting up drops under the same conditions as above but omitting MnCl2. These crystals were cryoprotected with mother liquor supplemented with 25% glycerol.

Data acquisition, phasing and structure determination

Data sets were collected at the Canadian Light Source™. Data were integrated using XDS (Kabsch 2010) and scaled with Aimless™ (Evans 2013). Phasing and automated structure solution were performed with CRANK2™ (Skubak 2013) in the CCP4i2™ program suite (Potterton 2018). The structure was inspected and improved through alternating cycles of Coot™ (Emsley 2004) and Refmac™ (Vagin 2004). The B_tri complex was solved by the difference Fourier method; the ligand, waters and metal ions were built manually in Coot™. Difference density maps confirmed the presence of Mn2+ in the apo structure and of Ca2+ in the ligand-bound structure. The models were validated with Coot™ and MolProbity™ (Chen 2010). The atomic coordinates and structure factors of the apo and B_tri complex structures have been deposited in the Protein Data Bank (PDB) under the accession numbers:

the GalNAc deacetylase protein of Flavonifractor plautii has the sequence of SEQ ID NO: WP_009260926.1; and

the sequence of SEQ ID NO: WP_044942952.1

Active site mutagenesis

Based on structural information (not shown) and sequence alignments (not shown), FpGalNAcDeAc_D1min and FpGalNase_truncated were mutated by the QuikChange™ protocol (Zhang 2004) using the primers shown in Table B. Mutants were purified via NiNTA and HIC columns as described above. All mutants were checked for structural integrity by CD spectroscopy; all tested enzymes were similar in structure to their wild type. For mutants with relatively low activity, reactions were performed under the same conditions used for the full kinetic assays; however, kcat/KM values were determined using the substrate depletion method as previously described (Vocadlo 2002). Briefly, at low substrate concentrations where [substrate] << KM (corresponding to 1/5 to 1/10 of KM), kcat/KM values can be approximated by fitting the reaction time course nonlinearly to a first-order curve and dividing the resulting rate constant by the enzyme concentration.
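The substrate depletion estimate can be sketched as follows. At [S] << KM the progress curve is first order, P(t) = S0·(1 − e^(−k·t)) with k = (kcat/KM)·[E]. For brevity this sketch log-linearizes the curve rather than performing the nonlinear fit described in the text, and it assumes that S0 is known; the function name is hypothetical.

```python
import math


def kcat_over_km_from_depletion(times_s, product_uM, s0_uM, enzyme_uM):
    """Estimate kcat/KM (s^-1 mM^-1) from a first-order progress curve.

    ln(1 - P/S0) = -k*t, so k is minus the slope of a line through the
    origin; kcat/KM is k divided by the enzyme concentration.
    Every product value must be strictly smaller than s0_uM.
    """
    ys = [math.log(1.0 - p / s0_uM) for p in product_uM]
    k_obs = -sum(t * y for t, y in zip(times_s, ys)) / sum(t * t for t in times_s)
    enzyme_mM = enzyme_uM * 1e-3  # µM -> mM
    return k_obs / enzyme_mM
```

Log-linearization amplifies noise near full conversion, which is one reason a direct nonlinear fit to the exponential is usually preferred for real data.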

GH36 phylogenetic map

Reference sequences for GH36 were downloaded from the CAZy™ database using the SACCHARIS™ Perl script (Jones 2018). The phylogeny-based protein profiling software TreeSAPP™ (available at https://github.com/hallamlab/TreeSAPP) was used to construct reference trees and to map sequences onto these trees. Briefly, HMMs from dbCAN (Yin 2012) were used to extract the protein family domains from all full-length sequences downloaded from CAZy™. These sequences were then clustered at 70% sequence similarity using UCLUST™ to remove redundant sequence space and reduce the size of the tree (Edgar 2010). The reference tree was built with RAxML™ version 8.2.0, using "--autoMRE" to decide when to stop bootstrapping before 1000 replicates had been performed and PROTGAMMAAUTO™ to select the best protein model (Stamatakis 2006 and Stamatakis 2008).
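The redundancy-removal step can be illustrated with a toy greedy centroid clustering. This is not UCLUST: the similarity measure below is the `difflib` matching ratio from the Python standard library, used as a crude stand-in for alignment identity, and processing the sequences longest first is an assumption made for the sketch.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude similarity in [0, 1]; a stand-in for pairwise alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold=0.70):
    """Assign each sequence to the first centroid it matches at >= threshold,
    otherwise start a new cluster with the sequence as its centroid."""
    clusters = []  # list of (centroid, members)
    for seq in sorted(sequences, key=len, reverse=True):
        for centroid, members in clusters:
            if similarity(seq, centroid) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))
    return clusters
```

Keeping one representative (the centroid) per cluster is what shrinks the reference tree; a real analysis would use an alignment-based identity.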

TreeSAPP™ was then used to map query sequences onto these reference trees. Briefly, hmmsearch™ was used to align the protein sequences to the HMM, and the aligned regions were extracted (Eddy 1998). hmmalign™ was used to include the new query sequences in the reference multiple alignment, and trimAl™ then removed non-conserved positions from the alignment file (Capella-Gutierrez 2009). RAxML™ was used to classify the query sequences by insertion into the reference tree. The placements of each query sequence were filtered and concatenated into a single jplace™ file before visualization in iTOL™ (Matsen 2012 and Letunic 2016).

RBC assay

Whole blood from healthy consenting donors was collected into citrate evacuated blood collection tubes using protocols approved by the clinical ethics committee of the University of British Columbia. The tubes were spun at 1000 × g for 4 min at RT, and the RBCs were separated and washed 3 times with 1× PBS (pH 7.4). For assays in the presence of dextran 40k, washed RBCs (200 µL, 10% hematocrit) were placed in tubes, and the supernatant was partially removed and replaced with 1× PBS (pH 7.4) with or without dextran 40k (final concentration of 300 mg/mL). In addition, some assays were performed in 1× PBS (pH 7.4) + 25% plasma or in 100% plasma. The RBCs were carefully mixed and placed on an orbital shaker for 30 s. The diluted enzyme solution was then added to a final volume of 200 µL. The tubes were vortexed very gently and placed on an orbital shaker at a set temperature for a defined time.

MTS card

After the reaction, the RBCs were washed 3 times with an excess of 1× PBS (pH 7.4) and analyzed using Micro Typing System™ (MTS™) cards [MTS™, Florida, USA]. RBCs (12 µL, 5% hematocrit) suspended in diluent [MTS™, Florida, USA] were carefully added to the microgel columns, leaving a space between the blood and the contents of the microgel. The MTS™ cards were centrifuged at 156 × g for 6 min at RT, as recommended, using a Beckman Coulter Allegra X-22R™ centrifuge with a modified sample holder. The extent of antigen removal from the RBC surface was assessed from the position of the RBCs in the microgel after centrifugation, according to the manufacturer's instructions. RBCs with a high surface antigen concentration agglutinate on interaction with the monoclonal antibodies present in the gel column and cannot penetrate it (MTS™ score 4). RBCs without surface antigen do not agglutinate and migrate to the bottom of the microgel (MTS™ score 0). RBCs from which surface antigens have been partially removed migrate to positions between these two extremes and are assigned scores between 0 (absent) and 4 (present), according to the manufacturer's instructions.

H antigen agglutination assay

To analyze the conversion of A antigen to H antigen after enzymatic treatment, washed A-ECO-RBCs were mixed in equal parts with 2 µg/mL anti-H antibody (anti-H ab blood group antigen antibody [97-1]; catalog number ab24213 (Abcam™)), and agglutination was monitored over a 30-minute time frame. Agglutination of RBCs with the anti-H antibody was scored from 0 (no agglutination within 1800 sec) to 5 (agglutination within 120 sec).

FACS

Enzyme-treated RBCs were washed 2 times with 1× PBS (pH 7.4), and the ECO-RBCs (1% hematocrit) were treated with a 1/100 dilution of APC-anti-A antibody (Alexa Fluor™ 647 mouse anti-human blood group A; catalog number 565384 (BD Pharmingen™)) and/or anti-H antibody (anti-H ab blood group antigen antibody [97-1]; catalog number ab24213 (Abcam™)) for 30 minutes at RT and then washed 2 times with 1× PBS (pH 7.4). For detection of the anti-H antibody, a FITC-labeled secondary antibody (goat F(ab')2 anti-mouse IgM mu chain (FITC); catalog number ab5926 (Abcam™)) was used at a dilution of 1/500. The cells were resuspended in 1× PBS (pH 7.4) (1% hematocrit) and the data were evaluated on a flow cytometer (CytoFLEX™ (Beckman Coulter™)).

Enzyme adsorption and antigenicity

To test whether the enzymes could be easily removed from RBCs after treatment, potential adsorption was assessed. Pacific Blue-labeled FpGalNAc deacetylase and FpGalNase (F/P = 1) were each incubated with RBCs at 37 °C for 1 h, and after several washing steps the residual fluorescence was measured on a flow cytometer (CytoFLEX™ (Beckman Coulter™)).

Antigenicity was tested by incubating RBCs with 50 µg/mL of each enzyme, mixing the enzyme-treated RBCs with allogeneic or autologous serum, and observing potential agglutination. In addition, to assess potential IgG and C3d exposure, treated RBCs were tested on anti-IgG, -C3d MTS™ cards [MTS™, Florida, USA]. The incubation time was 30 minutes at 37 °C.

Antigen subtype synthesis

The subtype 1/2/4 A and B antigen tetra-MU substrates were synthesized using a modified version of the protocol described by Kwan (Kwan et al., 2015).

Two-step synthesis of subtype 1/2/4 H antigen tri-MU

All three syntheses were performed on a 20 mg GalNAc-α-MU/GlcNAc-α-MU scale in 10 mL of 50 mM Tris/HCl, 200 mM NaCl, pH 7.4, 10 mM MnCl2, 50 U of alkaline phosphatase, 1.5 equivalents of UDP-Gal and 1.2 equivalents of GDP-Fuc (as determined by LacNAc-MU production). Depending on the desired product, different glycosyltransferases were added at concentrations of 100 µg/mL: CgtB S42 and Te2FT for subtype I, HP0826 and WbgL for subtype II, and LgtD and Te2FT for subtype IV. The reactions were carried out at 37 °C and their progress was monitored by TLC (mobile phase EtAc:MeOH:H2O, ratio 6:2:1); 4-methylumbelliferone was hydrolyzed from the compounds with 10% H2SO4 and detected via UV (360 nm). Once no further increase in product was observed, the reaction was applied to an HF Bond Elut C18 column, washed with several column volumes of 5% methanol, and the product was eluted with 25% methanol. The solvent was then removed in vacuo.
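The donor quantities implied by the stated equivalents can be worked out as below. This is a minimal sketch: the acceptor molecular weight of 379.36 g/mol is calculated from the formula C18H21NO8 of a 4-methylumbelliferyl N-acetylhexosaminide and is an assumption about the exact starting material (salt form or hydration would change it). The function name is hypothetical.

```python
# Assumed molecular weight (g/mol) of the MU N-acetylhexosaminide acceptor,
# calculated from C18H21NO8; not a value stated in the source.
ACCEPTOR_MW = 379.36


def donor_amounts_umol(acceptor_mg, acceptor_mw, equivalents):
    """Return µmol of each donor for the given molar equivalents of acceptor."""
    acceptor_umol = acceptor_mg / acceptor_mw * 1000.0
    return {name: eq * acceptor_umol for name, eq in equivalents.items()}


amounts = donor_amounts_umol(20.0, ACCEPTOR_MW, {"UDP-Gal": 1.5, "GDP-Fuc": 1.2})
```

Under that assumption, 20 mg of acceptor is about 52.7 µmol, so 1.5 and 1.2 equivalents correspond to roughly 79.1 µmol of UDP-Gal and 63.3 µmol of GDP-Fuc.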

Synthesis of subtype 1/2/4 A antigen tetra-MU

The final synthesis step was performed at 37 °C on a 10 mg scale of subtype 1/2/4 H antigen tri-MU in 5 mL of 50 mM Tris/HCl, 200 mM NaCl, pH 7.4, 10 mM MnCl2, 25 U of alkaline phosphatase, 1.5 equivalents of UDP-Gal and 100 µg/mL of BgtA. Progress was followed by TLC; once no further increase in product was observed, the reaction was applied to an HF Bond Elut C18 column, washed with several column volumes of 5% methanol, and the product was eluted with 25% methanol. The solvent was then removed in vacuo. The final product was further purified on a 1.5 × 46 cm HW-40F size exclusion column and then freeze-dried.

Synthesis of subtype 1/2/4 B antigen tetra-MU

The final synthesis step was performed at 37 °C on a 10 mg scale of subtype 1/2/4 H antigen tri-MU in 5 mL of 50 mM Tris/HCl, 200 mM NaCl, pH 7.4, 25 U of alkaline phosphatase, 1.5 equivalents of UDP-Gal and 100 µg/mL of BoGT6a. Progress was followed by TLC; once no further increase in product was observed, the reaction was applied to an HF Bond Elut C18 column, washed with several column volumes of 5% methanol, and the product was eluted with 25% methanol. The solvent was then removed in vacuo. The final product was further purified on a 1.5 × 46 cm HW-40F size exclusion column and then freeze-dried.

Synthesis of subtype 1 GalN antigen penta-MU

10 mg of subtype 1 A antigen penta-MU was incubated with 1 µg/mL FpGalNAc deacetylase in 5 mL of 100 mM NaH2PO4 at 37 °C for 30 min, and the reaction was then stopped by addition of 1 mM EDTA. Complete conversion of the substrate was checked by TLC, and the reaction was applied to an HF Bond Elut C18 column, washed with several column volumes of 2% methanol, and the product was eluted with 10% methanol. The solvent was then removed in vacuo.

Protein purification

All proteins and truncations thereof were cloned into pET16b or pET28a via Golden Gate™ cloning (Engler 2008) or PIPE cloning (Klock 2008). The primer sequences are listed in Table B.

Proteins for extended characterization were produced in BL21(DE3) cells cultured in 200 mL of ZY5052 auto-induction medium (Studier 2005) at 37 °C and 220 rpm for 20 h, inoculated with 100 µL of an overnight LB culture. The cells were harvested by centrifugation (4000 × g, 4 °C, 10 min) and resuspended in 10 mL lysis buffer (50 mM Tris/HCl, 150 mM NaCl, 1% (v/v) glycerol, 40 mM imidazole, pH 7.4, 2 mM DTT, 1× EDTA-free protease inhibitor (Pierce™), 2 U Benzonase (Novagen™), 0.3 mg/mL lysozyme, 10 mM MgCl2), then sonicated on ice (3 min total pulse time; 5 sec pulse, 10 sec pause, 35% amplitude). After removal of cell debris by centrifugation (14000 × g, 4 °C, 30 min), the supernatant was collected and loaded onto a nickel affinity chromatography column (5 mL HisTrap HP™ column (GE™)). Elution was performed and monitored on an ÄKTApurifier™ system (GE™) using a 10-75% gradient of 50 mM Tris/HCl, 400 mM imidazole, pH 7.4, 2 mM DTT; protein-containing fractions were identified via SDS-PAGE and then pooled. The buffer was exchanged to 50 mM Tris/HCl, 150 mM NaCl, pH 7.4, 2 mM DTT in Amicon Ultra-15™ centrifugal filter devices, MWCO 10 kDa (Millipore™), and the protein was concentrated.
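For orientation, the imidazole concentration at a given point in the elution gradient can be estimated as below. The sketch assumes that the low-imidazole buffer contains the same 40 mM imidazole as the lysis buffer; the source gives only the composition of the 400 mM imidazole buffer, so that value is an assumption. The function name is hypothetical.

```python
def gradient_concentration(percent_b, conc_a_mM, conc_b_mM):
    """Concentration (mM) of a component when buffers A and B are mixed
    linearly at the given percentage of buffer B."""
    frac = percent_b / 100.0
    return (1.0 - frac) * conc_a_mM + frac * conc_b_mM


# Assumed buffer A: 40 mM imidazole (as in the lysis buffer); buffer B: 400 mM.
start_mM = gradient_concentration(10.0, 40.0, 400.0)
end_mM = gradient_concentration(75.0, 40.0, 400.0)
```

Under that assumption a 10-75% gradient spans roughly 76 to 310 mM imidazole.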

FpGalNAc deacetylase, Fp galactosaminidase and truncations thereof had to undergo a second round of purification. Before the protein was loaded onto a hydrophobic interaction chromatography column (10 mL Phenyl Sepharose High Performance column (Pharmacia Biotech™)), the buffer was exchanged using Amicon Ultra-15™ centrifugal filter devices, MWCO 10 kDa (Millipore™). Column loading, washing and elution (gradient 0-100%) were handled by an ÄKTApurifier™ system (GE™) using the following buffer conditions: for FpGalNAc deacetylase, binding in 1× PBS, 800 mM NH2PO4, pH 7.4 and elution with 1× PBS (pH 7.4); for Fp galactosaminidase, binding in 25 mM Tris/HCl, 1 M NaCl, pH 7.4 and elution with 25 mM Tris/HCl (pH 7.4). Protein-containing fractions were identified via SDS-PAGE and then pooled. The buffer was exchanged to 50 mM Tris/HCl, 150 mM NaCl, pH 7.4 in Amicon Ultra-15™ centrifugal filter devices, MWCO 10 kDa (Millipore™), and the protein was concentrated.

Protein characterization

Optimum pH value

The general pH range of activity of FpGalNAc deacetylase and Fp galactosaminidase towards subtype 1 A antigen penta-MU and subtype 1 GalN antigen penta-MU, respectively, was determined from the products present on TLC plates at varying pH values. Reactions were carried out at 37 °C on a 100 µL scale with 50 µM substrate and 1 µg/mL enzyme in a suitable buffer system. The buffer used for pH 4-6 was based on 50 mM citric acid/sodium citrate, the buffer used for pH 6-8 on 50 mM sodium phosphate, and the buffer used for pH 8-10 on 50 mM glycine/sodium hydroxide.

To determine the optimal pH, 5 µg/mL Fp galactosaminidase was incubated with 200 µM GalN-α-pNP in 100 µL of 50 mM sodium phosphate buffer at different pH values (5.8-8.0). The absorbance (at 405 nm) resulting from pNP release was monitored on a Synergy H1™ plate reader (BioTek™) at 37 °C for 1 h.

FpGalNAc deacetylase at 5 µg/mL and subtype I A antigen penta-MU at 50 µM were pre-incubated for 10 min at 37 °C in 25 mM sodium phosphate buffer at different pH values (5.8-10.0). The reaction was quenched with 100 mM sodium phosphate buffer (pH 7.5), 100 µM EDTA, 5 µg/mL Fp galactosaminidase, 50 µg/mL SpHex, 50 µg/mL AfcA and 50 µg/mL BgaC (final volume of 100 µL). The fluorescence signal (365/435 nm) generated by hydrolytic release of MU was monitored on a Synergy H1™ plate reader (BioTek™) at 37 °C for 30 min.

Protein stability

FpGalNAc deacetylase and FpGalNase were stored in 1× PBS buffer (pH 7.4) at 4 °C. After 2 and 12 weeks, the activity of the enzymes was tested as described for the pH optimum, using subtype I A antigen penta-MU in the coupled enzyme reaction for FpGalNAc deacetylase and GalN-α-pNP for FpGalNase.

Inhibition of FpGalNAc deacetylase

FpGalNAc deacetylase was tested against different potential inhibitors in a coupled assay in 96-well plate format. Reactions were carried out at 37 °C on a 100 µL scale in 100 mM NaH2PO4 (pH 7.4) with 10 µg/mL Fp galactosaminidase, 50 µg/mL SpHex, 50 µg/mL AfcA, 50 µg/mL BgaC, 50 µM subtype 1 A antigen penta-MU and 5 µg/mL FpGalNAc deacetylase. EDTA (1, 10, 100 µM), Marimastat (1, 10, 100, 1000 µM), DMSO (2%, 4%) and EDTA-free protease inhibitor cocktail (Pierce™) (1×, 2× and 4×) were tested as inhibitors. Fluorescence (365/435 nm) was monitored continuously for 1 hour on a Synergy H1™ plate reader (BioTek™). Additives showing a strong effect were run again without the coupling enzymes and analyzed for product formation via TLC.

Limited proteolysis

To investigate the presence of a smaller, stable subdomain of Fp galactosaminidase, limited proteolysis was performed. Fp galactosaminidase was treated with thermolysin (protein:protease mass ratio of 10:1) at various temperatures (20 °C, 37 °C, 42 °C, 50 °C and 65 °C) for 1.5 hr. The samples were then run on an SDS-PAGE gel, and a stable fragment running at about 70 kDa (down from the initial 118 kDa) was identified, with almost complete digestion to this fragment achieved at an incubation temperature of 50 °C. This fragment was sent to the UBC proteomics core facility for peptide identification and was identified as a C-terminally truncated form of the full-length protein with a cleavage site between amino acids 690 and 700.

Glycan array screening

For glycan array screening, 500 µg of FpGalNAcDeAc_D2ext was labeled with fluorescein isothiocyanate (FITC) at an F/P ratio of 1 using the FluoroTag™ FITC conjugation kit (Sigma™). Screening was performed at the CFG's Protein-Glycan Interaction Core Facility™; the printed array was version 5.3, consisting of 600 glycans in 6 replicates, and was probed at protein concentrations of 5 µg/mL and 50 µg/mL. Binding motifs were analyzed using the web tool of Emory University (https://glycopattern.emory.edu/).

Enzyme assay in buffered extracellular solution

A composition comprising purified GalNAc deacetylase (SEQ ID NO:5) and purified galactosaminidase (SEQ ID NO:10) was tested for compatibility with the buffered extracellular solutions PBS, Steen™ and Perfadex™ at 37 °C, 37 °C and 4 °C, respectively. Human type A red blood cells (RBCs) were incubated with different doses of the enzyme composition in PBS, Steen™ and Perfadex™ to determine the ability of the enzymes to cleave the A antigen from the red blood cells. A 1% RBC solution was treated with various doses of enzyme in PBS, Steen™ and Perfadex™ solution, and the level of antigen removal at the end of treatment was analyzed by flow cytometry.

Immunohistochemical analysis of arterial biopsy

To test the effect of escalating doses of an enzyme composition comprising purified GalNAc deacetylase (SEQ ID NO:5) and purified galactosaminidase (SEQ ID NO:10) on type A human arteries in STEEN™ solution, the percentage of type A antigen was quantified by immunohistochemical analysis of biopsies obtained from untreated (control) and treated type A arteries, with type O arteries as negative controls. Area quantification software was used, and the values were normalized to the control group using the following formula:

The residual positive level of type A antigen quantified in the type O group may be explained by artifacts that arise during processing.

Enzymatic treatment was tested in human pulmonary arteries (static treatment). The doses are expressed as weight of enzyme per volume of STEEN™ solution. The arteries were biopsied, processed and analyzed by immunohistochemistry with double staining for CD31 (positive staining of endothelial cells) and BTA (positive staining of blood group A antigen). Human arteries were treated enzymatically at 1 µg/mL and 10 µg/mL for 4 hours. Immunohistochemical staining of arterial biopsies without enzyme treatment (control) and with enzyme treatment (treated) was imaged at 20× magnification. CD31 shows the location of endothelial cells (blood vessels), while BTA shows the location of blood group A antigen. In untreated arteries BTA co-localized with endothelial cells (CD31 positive), while BTA was absent in treated arteries.

Human donor lung study

To assess the effect of 1 hour of enzymatic treatment on ex vivo perfused human donor lungs, expression levels of type A antigen were quantified using immunohistochemical analysis of lung tissue biopsies and area quantification software, and normalized to the pre-treatment biopsies using the following formula:

The effect of 1 and 3 hours of enzymatic treatment (i.e., with an enzyme composition comprising purified GalNAc deacetylase (SEQ ID NO:5) and purified galactosaminidase (SEQ ID NO:10)) on ex vivo perfused human donor lungs was tested. Immunohistochemical staining of biopsies of the human donor lungs was imaged at 20× magnification to determine the effect of treating the lungs with the enzyme composition. CD31 shows the location of endothelial cells (blood vessels); BTA shows the location of blood group A antigen. The pre-treatment images show that blood group antigens are located within blood vessels and airways. In the post-treatment images of the right upper dependent (RUD), right upper non-dependent (RUND), right middle non-dependent (RMND), right middle dependent (RMD), right lower non-dependent (RLND) and right lower dependent (RLD) regions of the lung, blood group A antigen is absent from the blood vessels.

Two separate ex vivo perfused human donor lungs were tested in this study, and the results at 1 hr and 3 hr are shown in FIGS. 10 and 11, respectively.

Examples

Example 1: metagenomic library construction and screening

We constructed a metagenomic library containing large (35-65 kb) DNA fragments extracted from a fecal sample from a male donor of AB+ blood type. Such libraries contain multiple genes per bacterium, which increases the likelihood that at least some of these genes are expressed and allows small "pathways" of multiple genes to be expressed. Our library contains 19,500 clones in 51 × 384-well plates, perhaps around 800,000 genes, so initial screening of such a library with expensive A antigen substrates is impractical. Instead, we first screened with simple, sensitive fluorogenic substrates, the methylumbelliferyl α-glycosides of galactose and N-acetylgalactosamine (Gal-α-MU and GalNAc-α-MU). This initial screen with a mixture of the two substrates yielded a subset of 226 hits. These hits were rescreened with each individual substrate, identifying 44 with GalNAcase activity and 166 with galactosidase activity. These hits were then subjected to a second round of screening with the A and B antigen tetrasaccharide glycoside substrates shown in Figure 1, using a coupled enzyme assay (Kwan 2015) and a no-substrate control: only when the initial Gal or GalNAc has been cleaved can the coupling enzymes act and release MU. Eleven of these hits contained A antigen cleaving activity, one of which also cleaved the B antigen, while six produced fluorescence in the absence of substrate and thus encode pathways that produce unrelated fluorescent products.
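The library figures quoted above can be cross-checked with a few lines of arithmetic. The sketch uses only numbers stated in the text; the variable and function names are illustrative.

```python
PLATES = 51
WELLS_PER_PLATE = 384
CLONES = 19_500
ESTIMATED_GENES = 800_000
PRIMARY_HITS = 226
A_ANTIGEN_HITS = 11


def library_summary():
    """Derive plate capacity, genes per clone and hit rates from the stated counts."""
    return {
        "well_capacity": PLATES * WELLS_PER_PLATE,
        "genes_per_clone": ESTIMATED_GENES / CLONES,
        "primary_hit_percent": 100.0 * PRIMARY_HITS / CLONES,
        "a_antigen_hit_percent": 100.0 * A_ANTIGEN_HITS / CLONES,
    }


summary = library_summary()
```

The plates hold 19,584 wells, consistent with 19,500 clones. The estimate of 800,000 genes corresponds to about 41 genes per clone, and the primary and A antigen hit rates are about 1.2% and 0.06% of clones, respectively.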

Example 2: sequencing and initial analysis of hits

The eleven fosmids were sequenced on an Illumina MiSeq™, and MetaPathways™ software (Konwar 2015) was used to identify the ORFs therein that are present in the CAZy™ database (http://www.cazy.org/) (Lombard 2014). Since the depth of sequencing of the human microbiome currently available is considerable, all organisms from which the fosmids derive could be identified. Their sequences could be divided into five clusters, since eight of the eleven are overlapping fragments derived from the genomes of only two Bacteroides sp. The only gene common to all fosmids in cluster B is a GH109 enzyme (Bacteroides vulgatus); cluster A also contains a GH109 (B. stercoris), which is the only CAZy gene found in the fosmid of the other Bacteroides origin (Bacteroides vulgatus). Fosmid No8 from the obligate anaerobe Flavonifractor plautii (Li 2015) contains three ORFs found in CAZy: an apparent carbohydrate-binding module, CBM32, and two potential glycoside hydrolases, GH36 and GH4. Finally, fosmid K05 from a Collinsella sp. (probably Collinsella tanakaei) did not contain a CAZy-associated ORF. Here, the generation of a sub-library of fosmid K05 allowed identification of the ORF with A antigen cleavage activity, which was subsequently identified as a GH36 (not shown).

Example 3: analysis of GH109 enzyme

The GH109 family was established based on the A antigen cleavage activity of several of its members. These enzymes employ an unusual NAD+-dependent mechanism that was first found in enzymes from GH4 (Yip 2004, J. Amer. Chem. Soc., 126, 8354-8355; Varrot 2005; and Liu 2007). After removal of the signal peptides, the three GH109 genes identified here were cloned with a His tag and expressed in Escherichia coli BL21(DE3). The three proteins (BsGH109, BvGH109_1 and BvGH109_2) (not shown) were purified, along with the classical GH109 (EmGH109) (Liu 2007) from Elizabethkingia meningosepticum as a standard, and kinetic parameters were determined for each protein. The three novel enzymes exhibited similar catalytic efficiencies on each of the three A subtype substrates tested, largely mirroring the kinetic parameters of the EmGH109 standard. In contrast, when they were tested for A antigen removal activity on A+ RBCs using approved MTS cards, disappointingly only EmGH109 was significantly active. Testing was performed in the presence of dextran 40K as a crowding agent, which we have shown to increase activity by concentrating the enzyme on the cell surface (Chapanian 2014). In its absence, even 150 µg/mL of EmGH109 was ineffective, whereas in the presence of 300 mg/mL dextran 40K, 15 µg/mL of enzyme was sufficient (see FIGS. 3 and 4). Previous studies showed that low ionic strength also increases the activity of EmGH109 on cells (Liu 2007). Consequently, EmGH109 was ineffective in whole blood.

Example 4: analysis of the GH36 from fosmid K05 from Collinsella

The GH36 protein identified in fosmid K05 (named K05GH36) was active on GalNAc-α-MU and on the A antigen tetrasaccharide. This is consistent with its membership in the GH36 family, which primarily contains α-galactosidases and α-N-acetylgalactosaminidases that hydrolyze via a double-displacement mechanism involving a covalent β-glycosyl-enzyme intermediate (Comfort 2007). Phylogenetic analysis placed its sequence within cluster 4 of the GH36 subfamilies (Fredslund 2011). Interestingly, this cluster also contains a close relative, the characterized GH36 from Clostridium perfringens, which is also known to cleave A antigen structures (Calcutt 2002). However, when we tested the ability of K05GH36 to remove A antigen from red blood cells, its activity was disappointing: even when used in combination with crowding agents, it scored only 3.

Example 5: analysis of fosmid No8 from Flavonifractor plautii

Since these new enzymes offered no advantage, our attention turned to fosmid No8 from Flavonifractor plautii (F. plautii), especially because its gene products cleave both A and B antigens. The three CAZy-related genes were cloned with their signal peptide sequences removed and expressed in E. coli BL21(DE3), and the resulting enzymes were purified in yields of up to 140 mg/L. Surprisingly, when we tested the individually purified proteins against the A and B tetrasaccharide substrates, the only cleavage observed was that of the B antigen by No8GH36, while none of them cleaved the A antigen. We therefore tested these enzymes in pairwise combinations and found that a mixture of No8CBM32 and No8GH36 rapidly cleaves the A antigen tetrasaccharide. TLC analysis of reaction mixtures with the individual enzymes showed that No8CBM32 catalyzes the conversion of the A antigen to a more polar but still UV-active product, whereas subsequent addition of No8GH36 released a sugar product co-migrating with galactosamine, as well as the H antigen trisaccharide. MS analysis of the reaction mixture showed that No8CBM32 is an A antigen deacetylase, hence the reduction in m/z by 42 and the more polar product, while No8GH36 is a galactosaminidase, a novel activity for this family (FIG. 2). This was further confirmed by high performance anion exchange chromatography (HPAE-PAD) analysis of the reaction (FIG. 5), which showed that treatment of the A antigen with both enzymes released galactosamine, while either enzyme alone did not. Similar results were obtained with gastric mucin substrates, from which the enzymes are supposed to release galactosamine. These two enzymes are therefore hereinafter referred to as FpGalNAc deacetylase (FpGalNAcDeAc) and Fp galactosaminidase (FpGalNase).
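The m/z shift of 42 is what one expects for the loss of an acetyl group: hydrolysis of the acetamide adds water and releases acetic acid, a net loss of C2H2O. The check below uses standard monoisotopic masses and is an illustration, not part of the reported analysis.

```python
# Monoisotopic atomic masses in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


water = mass({"H": 2, "O": 1})
acetic_acid = mass({"C": 2, "H": 4, "O": 2})

# R-NH-COCH3 + H2O -> R-NH2 + CH3COOH: the glycan loses acetic acid minus water.
deacetylation_shift = acetic_acid - water
```

For a singly charged ion this gives a decrease of about 42.01 in m/z, matching the value reported for the No8CBM32 product.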

Although this pathway for degrading the A antigen has not been characterized before, it is fascinating that it was proposed more than 50 years ago as an explanation for the so-called "acquired B" phenomenon, in which the blood type of type A patients infected with Clostridium subthreshold apparently became type B (Gerbal 1975), as did forensic samples of human tissue submerged in the River Thames (Judd and Annesley, Transfusion Medicine Reviews (1996) 10, 111-; https://doi.org/10.1016/S0887-7963(96)80087-3). This is presumably because the anti-B antibodies used for typing cannot distinguish between terminal Gal and GalN.

Studies with the third enzyme from the fosmid, the GH4, showed that although it hydrolyzes Gal-α-pNP, GalN-α-pNP and GlcN-α-pNP, it does not cleave any A antigen-based substrate. It therefore appears to play no direct role in the conversion of the A antigen. However, these glycosaminidase activities do represent a novel activity within the GH4 family.

Example 6: characterization of FpGalNAc deacetylase

Closer bioinformatic analysis of this gene with Phyre2™ (Kelley 2015) showed that it has a domain of ~308 amino acids of previously unknown function at the N-terminus and a CBM32 of ~145 amino acids near the C-terminus, with a linker region between them. Truncation analysis confirmed this basic architecture, as all constructs containing the entire deacetylase domain were catalytically active (Table 2). This protein is therefore classified as the founding member of the novel carbohydrate esterase family CExx.

The acetamido sugar deacetylases have all been shown to be metalloenzymes requiring divalent metal ions (Blair 2005). Accordingly, treatment with 100 µM EDTA largely abolished the enzyme activity, while the addition of Mn2+, Co2+, Ni2+ or Zn2+ increased it. Other (non-metal) amidase inhibitors had no effect. The enzyme has a broad pH profile with an optimum around pH 8 (FIG. 6) and a narrow substrate specificity, limited to the different A antigen subtypes and shorter versions thereof. However, it discriminates little among those subtypes, with specific activities differing only by a factor of ~2 across all of them (Table 2). This pH dependence and specificity profile is ideal for RBC conversion, since all subtypes of A are deacetylated, but nothing else is.

Glycan arrays of the Consortium for Functional Glycomics (CFG) were used to explore the specificity of the CBM portion of the protein. The preferred targets were glycans with repeating N-acetyllactosamine (LacNAc) structures, as was also found for a founding member of the CBM32 family, from an N-acetylglucosaminidase of Clostridium perfringens (Ficko-Blean 2006). However, unlike that CBM, ours did not show high-affinity binding to blood antigen structures. Repeating LacNAc structures are common on the cell surface (Cohen 2009) as general components of complex and hybrid N-glycans, as well as of some O-glycans and glycolipids. In our case, they may serve as anchors to which the deacetylase attaches. This would bring its catalytic domain into close proximity to the A antigen without competing for its own substrate. In support of this model, removal of this domain resulted in reduced activity on RBCs, with no effect on the rate of cleavage of soluble substrates (Table 2).

Example 7: crystallization analysis of FpGalNAc deacetylase

To provide structural insight into this novel enzyme activity, crystallization experiments were performed on the truncated proteins, and FpGalNAcDeAc_D1ext was found to produce the crystals that diffracted to the best resolution. Solution of this structure revealed a catalytic domain adopting a 5-fold β-propeller structure, with an active site containing a divalent metal ion coordinated by D100 and H252. Co-crystallization of the enzyme with the B antigen trisaccharide, a close analogue of the reaction product, revealed its binding mode. At the base of the active site pocket, the non-reducing terminal galactosyl moiety, which is the group that distinguishes the A and B antigens, forms hydrogen-bonding interactions with H97, E64 and two metal-coordinated water molecules. The remainder of the ligand is surface-exposed, and polar interactions are formed between the fucosyl moiety and the S61 and D121 side chains. The C1-OH group of the reducing terminal galactosyl moiety is solvent-exposed, so the enzyme readily accommodates extension of the substrate (i.e., by GlcNAc). Modelling the N-acetyl group of the A trisaccharide onto this structure allowed us to rationally mutate nearby amino acids that may be involved in deacetylation of the substrate. Residue E64 was shown to be critical for activity, since both E64 mutants were inactive, suggesting a possible direct role in activating the nucleophilic water molecule (Table 1). Residues D100, Y315 and H252 also proved important: any mutation of them resulted in a ~5000-fold rate reduction, consistent with their apparent role in binding the divalent metal ion. By analogy with other acetamido sugar deacetylases, we propose that FpGalNAc deacetylase hydrolyses its substrate by a mechanism in which the metal polarises the carbonyl and activates a water molecule for nucleophilic attack on the carbonyl, forming a tetrahedral intermediate. Delivery of a proton to the sugar nitrogen atom via His 100 promotes breakdown of this intermediate.

TABLE 1 | Specific activity of FpGalNAcDeAc_D1min and its mutants in cleaving the type 2 A antigen tetra-MU substrate

No detectable activity

Example 8: characterization of FpGalNAcDeAc and FpGalNase

Phylogenetic analysis placed the FpGalNase sequence in a new subgroup of the GH36 family (5) (Fredslund 2011). The 390 amino acid catalytic domain is located in the center of the large (1079 amino acid) protein, with a potential carbohydrate-binding domain at the C-terminus. Removal of this C-terminal domain had no effect on the kinetic parameters of the enzyme with soluble substrates (Table 2), but reduced the efficiency with which deacetylated A+ RBCs were cleaved. The enzyme is specific for galactosamine-containing sugars and did not cleave GalNAc residues in any context tested. However, it has a rather broad specificity for cleavage of de-N-acetylated galactosamine substrates, from the simple aryl glycoside GalN-α-pNP upwards. Indeed, the kcat/KM values for the three A subtypes tested (Table 2) are similar to each other and to the kcat/KM values of the deacetylase. The kcat/KM value for cleavage of the B antigen was more than 2000-fold lower than that for the corresponding GalN antigen, but was still sufficient to produce a positive hit in the original screen. This specificity for deacetylated α-galacto-configured substrates, combined with a pH optimum of 6.5-7.0, makes the enzyme well suited to blood group conversion in combination with the deacetylase (FIG. 6).

TABLE 2 kinetic parameters of FpGalNAcDeAc and FpGalNase constructs for different antigen substrates

Example 9: cleavage of A antigen from RBC

A+, B+ and O+ RBCs were incubated with FpGalNAcDeAc and FpGalNase, individually and as a mixture, and the released sugars were analyzed by HPAE-PAD ion chromatography. Neither enzyme released any sugar product when used alone. However, when a mixture of both was used, galactosamine was clearly released from A+ RBCs but not from B+ or O+ RBCs, showing high specificity for the A antigen only. This is very important because it shows that GalNAc is not released from the RBC surface in any other context. Truncated forms of FpGalNase were also effective, but slightly less active.

We then tested antigen removal from RBCs using the industry-standard MTS™ card test. These antibody-conjugated columns are loaded with RBCs and spun in a centrifuge. Antigen-free RBCs migrate to the bottom of the column and score 0, while untreated RBCs carrying the corresponding antigen form a band at the top and score 4; intermediate scores rank the extent of antigen removal. Treatment with FpGalNase alone failed to remove A or B antigenicity at the concentrations employed (Table 3), which is consistent with its inactivity on GalNAc substrates and its low activity on Gal. Incubation with FpGalNAcDeAc removed antigenicity because conversion of the acetamide to an amine impairs binding of the anti-A antibody employed. The minimum amount of enzyme required for complete antigen removal, with FpGalNAcDeAc alone and in combination with FpGalNase, was evaluated in the absence and presence of 300 mg/ml dextran as a crowding agent. Without dextran, as little as 3 μg/ml of FpGalNase was sufficient, while inclusion of 300 mg/ml dextran reduced the required loading to 0.5 μg/ml (Table 3). By comparison, the best previous enzyme, EmGH109, was ineffective in the absence of dextran unless a low-salt buffer was used, and its minimum effective concentration in the presence of dextran was 15 μg/ml (a 30-fold higher loading). The form of FpGalNAcDeAc lacking the CBM was much less efficient.
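The loading comparison in the preceding paragraph is simple ratio arithmetic. The following minimal Python sketch reproduces it using only the concentrations quoted in the text; the function and variable names are illustrative and are not part of the original disclosure.

```python
def fold_difference(reference_ug_per_ml: float, test_ug_per_ml: float) -> float:
    """Return how many times higher the reference loading is than the test loading."""
    if test_ug_per_ml <= 0:
        raise ValueError("test loading must be positive")
    return reference_ug_per_ml / test_ug_per_ml


# Minimum effective concentrations quoted in the text (ug/ml).
FP_WITH_DEXTRAN = 0.5        # Fp enzyme loading with 300 mg/ml dextran
FP_WITHOUT_DEXTRAN = 3.0     # Fp enzyme loading without dextran
EMGH109_WITH_DEXTRAN = 15.0  # best previous enzyme, with dextran

# EmGH109 needs 30 times the loading of the Fp enzymes when dextran is present.
emgh109_vs_fp = fold_difference(EMGH109_WITH_DEXTRAN, FP_WITH_DEXTRAN)

# Dextran lowers the required Fp loading 6-fold.
dextran_benefit = fold_difference(FP_WITHOUT_DEXTRAN, FP_WITH_DEXTRAN)

print(emgh109_vs_fp, dextran_benefit)
```

The 6-fold figure for the effect of dextran is derived here from the two quoted loadings; it is not stated explicitly in the text.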

TABLE 3 | MTS card results for A+, B+ and AB+ RBCs treated with EmGH109, FpGalNAcDeAc and FpGalNase.

Because the MTS™ card test does not assess complete conversion of the A antigen, and because no antibodies were available to detect the GalN antigen, we focused on detecting newly formed H antigen on treated RBCs. FpGalNase was functional at concentrations as low as 5 μg/ml, yielding H antigen levels consistent with loss of the A antigen, as shown by FACS analysis (FIG. 3). By measuring agglutination time in the presence of anti-H antibodies, we demonstrated that both enzymes are functional on RBCs from several A+ donors, including under whole-blood reaction conditions, which was previously not achievable with other blood-converting enzymes. This enzyme pair therefore converts A+ RBCs to O-type "universal donor" RBCs at a much lower enzyme loading than was required with the best previous enzyme. However, before these RBCs are delivered to a patient, it is advisable to remove all traces of the enzymes used in the conversion to avoid adverse immune responses, preferably by washing the cells after centrifugation. To confirm that this can be achieved, we treated A+ RBCs with fluorescently labeled FpGalNAcDeAc and FpGalNase and then confirmed by FACS analysis that simple washing was indeed effective (FIG. 3).

Further characterization of the resulting A-ECO RBCs can be used to assess their full viability for use in transfusion medicine. In addition, the possibility of including the enzymes directly in the plasma, possibly at the time of blood donation collection, could allow easy, cost-effective incorporation of the process into existing automated procedures for blood collection and storage. In particular, the stability of the enzymes was tested as shown in Table 4.

Table 4: storage stability of galactosaminidase and GalNAc deacetylase

Example 10: Fusion of GalNAc deacetylase and galactosaminidase from Clostridium tertium

In a search for similar enzymes, a novel natural fusion of a galactosaminidase and a GalNAc deacetylase linked by a CBM (GH36_domain-CBM-deacetylase_domain) was identified in Clostridium tertium. Initial tests showed that this enzyme cleaves the A antigen of red blood cells (by the same mechanism: deacetylation first, followed by galactosamine cleavage), but with low efficiency (i.e., similar to EmGH109). The deacetylase domain of the C. tertium enzyme is not as efficient as the GalNAc deacetylase of Flavonifractor plautii, but when supplemented with the F. plautii GalNAc deacetylase, the galactosaminidase domain of the C. tertium enzyme shows activity on red blood cells similar to that of the F. plautii galactosaminidase.

Example 11: Alternative GalNAc deacetylases and galactosaminidases

The data show that the galactosaminidases from Clostridium tertium (Ct5757_GalNase) and Robinsoniella peoriensis (Rp1021) have enzymatic activity comparable to that of FpGalNase for the conversion of GalN antigen to H antigen (second reaction step).

Data were also collected for alternative GalNAc deacetylases and galactosaminidases, and these alternative enzymes were compared with the GalNAc deacetylase and galactosaminidase of Flavonifractor plautii. Table 5 shows the anti-A MTS scores on treated A RBCs for the natural fusion of galactosaminidase and GalNAc deacetylase from Clostridium tertium, which required the presence of dextran to cleave the A antigen effectively. Table 5 also shows good activity of the C. tertium GalNAc deacetylase (Ct5757_DeAcase) when combined with the galactosaminidase of F. plautii (FpGalNase). Likewise, the data in Table 6 show that Rp3672 and Rp3671 of Robinsoniella peoriensis (Rp) are able to deacetylate the A antigen on RBCs, but less efficiently than FpGalNAcDeAc and only in the presence of a crowding agent (i.e., dextran 40k).

Table 5: MTS scoring of anti-A antibodies on treated A RBCs

Table 6: MTS scores of Rp3671 and Rp3672 of Robinsoniella peoriensis (Rp)

Sample                               Anti-A MTS score
A RBC control                        4
Rp3671 (50 μg/mL) + dextran 40k      3
Rp3672 (50 μg/mL) + dextran 40k      1
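As a reading aid for Table 6, the short Python sketch below applies the MTS scale described in Example 9 (0 for antigen-free RBCs, 4 for untreated RBCs) to the three scores in the table. The dictionary layout and the helper name `score_drop` are illustrative assumptions; the drop in score is only an ordinal indication of antigen removal, not a quantitative measurement.

```python
MTS_ANTIGEN_FREE = 0  # RBCs migrate to the bottom of the column
MTS_UNTREATED = 4     # RBCs are retained at the top of the column

# Anti-A MTS scores as listed in Table 6.
table6_scores = {
    "A RBC control": 4,
    "Rp3671 (50 ug/mL) + dextran 40k": 3,
    "Rp3672 (50 ug/mL) + dextran 40k": 1,
}


def score_drop(control_score: int, treated_score: int) -> int:
    """Return the reduction in MTS score relative to the untreated control."""
    for score in (control_score, treated_score):
        if not MTS_ANTIGEN_FREE <= score <= MTS_UNTREATED:
            raise ValueError("MTS scores range from 0 to 4")
    return control_score - treated_score


control = table6_scores["A RBC control"]
for sample, score in table6_scores.items():
    print(f"{sample}: score {score}, drop vs control {score_drop(control, score)}")
```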

FIG. 7 shows the conversion of A antigen to H antigen on A RBCs, analyzed by FACS, for (A) A+ RBC control; (B) Flavonifractor plautii GalNAc deacetylase (FpGalNAcDeAc) + F. plautii galactosaminidase (FpGalNase) (10 μg/mL); (C) FpGalNAcDeAc + Clostridium tertium (Ct) Ct5757_GalNase (10 μg/mL); and (D) FpGalNAcDeAc + Robinsoniella peoriensis (Rp) Rp1021 GalNase (10 μg/mL). The data indicate that the galactosaminidases Ct5757_GalNase of Clostridium tertium and Rp1021 GalNase of Robinsoniella peoriensis have enzymatic activity comparable to that of FpGalNase for converting GalN antigen to H antigen (second reaction step).

Example 12: compatibility of enzyme compositions with perfusion/preservation fluids

To ensure that the enzyme composition is compatible with the EVLP system, we first tested the enzymes (purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO:5 and purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO:10) in organ perfusion/preservation fluids (STEEN™ and Perfadex™, XVIVO Perfusion). Compatibility was assessed by the ability of the enzyme composition to remove type A blood group antigens from erythrocytes in STEEN™ at 37 °C or in Perfadex™ at 4 °C. Phosphate-buffered saline (PBS) at 37 °C was used as the comparison group because PBS is one of the standard solutions for blood treatment. The temperatures for STEEN™ and Perfadex™ were chosen based on their operating temperatures in clinical practice. The level of antigen removal was analyzed by flow cytometry. A range of doses was tested in STEEN™ and Perfadex™ to help predict the appropriate dose to be used in the organ (see FIG. 8). The dosage unit used throughout the study was defined as the weight of enzyme (μg) relative to the volume of the solution (mL).

The enzyme composition was shown to be fully compatible with the STEEN™ and Perfadex™ perfusion/preservation fluids, and these fluids increased the efficiency of the enzyme composition compared with PBS. At a total enzyme concentration of 1 μg/mL, the enzyme composition removed more than 90% of the antigen in STEEN™ and Perfadex™, whereas a dose of 4 μg/mL was needed to achieve the same effect in PBS (FIG. 8).

Example 13: static treatment of human arteries

To test the efficacy of the enzymes (purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO:5 and purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO:10) at the tissue level, an in vitro model of the human artery was used. Pulmonary arteries from the same human donor were divided into a control group (STEEN™ solution) and treatment groups (enzyme composition + STEEN™ solution) and incubated statically at 37 °C for 4 hours. Both groups were biopsied at the end of the incubation. The doses of the enzyme composition were 1 μg/mL and 10 μg/mL, respectively. Changes in blood group antigens were analyzed by immunohistochemistry. Serial sections of the biopsies were double stained with CD31 (a marker for endothelial cells), to show the location of the inner surface of the blood vessel, and BTA, to show the expression of blood group antigens.

The expression level of type A blood group antigen was significantly reduced in the treated groups compared with the control group. The 1 μg/mL and 10 μg/mL doses had similar effects on the treated arteries, so the enzymes may also work at total enzyme concentrations (doses) below 1 μg/mL. Disappearance of the blood group antigens was confirmed by comparing the BTA and CD31 staining images (FIG. 9).

Example 14: Ex vivo perfusion of human lungs

The efficacy of enzyme-containing STEEN™ solution in removing tissue blood group antigens from human organs (e.g., lung) was tested in the Toronto EVLP setting. Donor human lungs were evaluated with clinical Ex Vivo Lung Perfusion (EVLP) and determined to be unsuitable for transplantation, and were therefore suitable for testing the enzyme composition. After lung function had declined, the enzyme composition (purified Flavonifractor plautii GalNAc deacetylase protein of SEQ ID NO:5 and purified Flavonifractor plautii galactosaminidase protein of SEQ ID NO:10) was added to the STEEN™ perfusate to begin treatment. The dose used was 1 μg/mL. Biopsies were taken before and after treatment. Changes in blood group antigen expression were analyzed by immunohistochemistry. Throughout the experiment, lung function and physiology were monitored hourly to ensure that treatment did not cause acute side effects.

For human lungs, the volume of perfusate required for single-lung EVLP was 1.5 L and for double-lung EVLP was 2 L. In the first test (FIG. 10), 1.5 mg of the enzyme composition was added to the perfusate for a single right lung EVLP to achieve a dose of 1 μg/mL. The lung was treated for one (1) hour. Immunohistochemical analysis showed a significant reduction in type A blood group antigen levels after treatment (FIG. 10). Comparison of pre-treatment biopsy sections double stained for blood group antigens and blood vessels revealed that the antigens in the lung were located not only on the surface of the vessel wall, but also in the airways. Comparison of post-treatment biopsies with double staining showed that intravascular antigens had been effectively removed.

In a second test (FIG. 11), another right lung was treated on EVLP with 1.5 mg of the enzyme composition in the STEEN™ perfusate, giving a concentration of 1 μg/mL. The lung was treated for three (3) hours. Immunohistochemical analysis showed a significant reduction in the expression level of type A blood group antigens. Comparison of pre-treatment biopsies double stained for blood group antigens and blood vessels revealed that blood group antigens in the lung are located not only on the surface of the blood vessels, but also in the airways (FIG. 11). Comparison of post-treatment biopsies with double staining showed that intravascular antigens had been effectively removed (FIG. 11). No acute side effects on lung physiology or function were observed after the start of the enzymatic treatment.

The results show that at a dose of 1. mu.g/mL, the enzyme works in the perfused human lung within one hour.
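The dose unit defined in Example 12 (μg of enzyme per mL of solution) and the perfusate volumes given above determine how much enzyme composition must be added to a circuit. The following minimal Python sketch shows the arithmetic; the function name is illustrative, and the 2.0 mg figure for a double-lung circuit is a derived value rather than one reported in the examples.

```python
def enzyme_mass_mg(dose_ug_per_ml: float, perfusate_volume_l: float) -> float:
    """Return the enzyme mass (mg) needed to reach a dose (ug/mL) in a given volume (L)."""
    if dose_ug_per_ml < 0 or perfusate_volume_l <= 0:
        raise ValueError("dose must be non-negative and volume must be positive")
    volume_ml = perfusate_volume_l * 1000.0  # L -> mL
    mass_ug = dose_ug_per_ml * volume_ml     # ug/mL * mL -> ug
    return mass_ug / 1000.0                  # ug -> mg


SINGLE_LUNG_VOLUME_L = 1.5  # perfusate volume for single-lung EVLP (from the text)
DOUBLE_LUNG_VOLUME_L = 2.0  # perfusate volume for double-lung EVLP (from the text)

# 1 ug/mL in 1.5 L corresponds to the 1.5 mg used in both tests above.
single_lung_mass = enzyme_mass_mg(1.0, SINGLE_LUNG_VOLUME_L)

# The same dose in a 2 L double-lung circuit would need 2.0 mg (derived, not reported).
double_lung_mass = enzyme_mass_mg(1.0, DOUBLE_LUNG_VOLUME_L)

print(single_lung_mass, double_lung_mass)
```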

Although various embodiments of the invention are disclosed herein, many adaptations and modifications may be made within the scope of the invention in accordance with the common general knowledge of those skilled in the art. Such modifications include the substitution of known equivalents for any aspect of the invention in order to achieve the same result in substantially the same way. Numerical ranges include the numbers defining the range. The word "comprising" is used herein as an open-ended term that is substantially equivalent to the phrase "including, but not limited to," and the word "comprises" has a corresponding meaning. As used herein, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a thing" includes more than one such thing. Citation of references herein is not an admission that such references are prior art to embodiments of the present invention. The invention includes all embodiments and variations substantially as hereinbefore described and with reference to the examples and figures.

Sequences

The DNA sequences of Flavonifractor plautii were modified from the naturally occurring DNA sequences (GalNAc deacetylase 2311/2319 nt; galactosaminidase 3228/3237 nt). In particular, the sequences used for protein purification differ in length because the signal peptide was removed and an N-terminal His-tag was added via the vector backbone.
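The relationship between the native, mature and His-tagged protein sequences can be illustrated with a minimal Python sketch. It uses only the first residues of SEQ ID NO:2 as listed below. The 27-residue signal peptide length and the 15-residue N-terminal leader are assumptions inferred by comparing SEQ ID NOs 2, 4 and 5 in this listing, not values stated in the text, and the function names are illustrative.

```python
# Assumption: inferred by aligning the starts of SEQ ID NO:2 and SEQ ID NO:4.
SIGNAL_PEPTIDE_LENGTH = 27

# Assumption: N-terminal leader read off the start of SEQ ID NO:5 (pET16a construct).
HIS_TAG_LEADER = "MG" + "H" * 10 + "SSG"


def remove_signal_peptide(precursor: str, signal_length: int) -> str:
    """Return the mature sequence after dropping the N-terminal signal peptide."""
    if not 0 <= signal_length <= len(precursor):
        raise ValueError("signal length must lie within the sequence")
    return precursor[signal_length:]


def add_n_terminal_tag(mature: str, leader: str = HIS_TAG_LEADER) -> str:
    """Return the expression-construct sequence: vector-encoded leader + mature protein."""
    return leader + mature


# First 44 residues of SEQ ID NO:2 (native GalNAc deacetylase with signal peptide).
native_prefix = "MRNRRKAVSLLTGLLVTAQLFPTAALAADSSESALNKAPGYQDF"

mature_prefix = remove_signal_peptide(native_prefix, SIGNAL_PEPTIDE_LENGTH)
tagged_prefix = add_n_terminal_tag(mature_prefix)

print(mature_prefix)  # matches the start of SEQ ID NO:4
print(tagged_prefix)  # matches the start of SEQ ID NO:5
```

The same two steps relate SEQ ID NOs 7, 9 and 10 for the galactosaminidase, although the signal peptide length there would need to be read off that pair of sequences separately.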

Informal sequence listing

SEQ ID NO:2

Description: GalNAc deacetylase of Flavonifractor plautii (protein sequence)

MRNRRKAVSLLTGLLVTAQLFPTAALAADSSESALNKAPGYQDFPAYYSDSAHADDQVTHPDVVVLEEPWNGYRYWAVYTPNVMRISIYENPSIVASSDGVHWVEPEGLSNPIEPQPPSTRYHNCDADMVYNAEYDAMMAYWNWADDQGGGVGAEVRLRISYDGVHWGVPVTYDEMTRVWSKPTSDAERQVADGEDDFITAIASPDRYDMLSPTIVYDDFRDVFILWANNTGDVGYQNGQANFVEMRYSDDGITWGEPVRVNGFLGLDENGQQLAPWHQDVQYVPDLKEFVCISQCFAGRNPDGSVLHLTTSKDGVNWEQVGTKPLLSPGPDGSWDDFQIYRSSFYYEPGSSAGDGTMRVWYSALQKDTNNKMVADSSGNLTIQAKSEDDRIWRIGYAENSFVEMMRVLLDDPGYTTPALVSGNSLMLSAETTSLPTGDVMKLETSFAPVDTSDQVVKYTSSDPDVATVDEFGTITGVSVGSARIMAETREGLSDDLEIAVVENPYTLIPQSNMTATATSVYGGTTEGPASNVLDGNVRTIWHTNYAPKDELPQSITVSFDQPYTVGRFVYTPRQNGTNGIISEYELYAIHQDGSKDLVASGSDWALDAKDKTVSFAPVEAVGLELKAIAGAGGFGTAAELNVYAYGPIEPAPVYVPVDDRDASLVFTGAWNSDSNGSFYEGTARYTNEIGASVEFTFVGTAIRWYGQNDVNFGAAEVYVDGVLAGEVNVYGPAAAQQLLFEADGLAYGKHTIRIVCVSPVVDFDYFSYVGE

SEQ ID NO:4

Description: GalNAc deacetylase of Flavonifractor plautii (protein sequence with signal peptide removed)

ADSSESALNKAPGYQDFPAYYSDSAHADDQVTHPDVVVLEEPWNGYRYWAVYTPNVMRISIYENPSIVASSDGVHWVEPEGLSNPIEPQPPSTRYHNCDADMVYNAEYDAMMAYWNWADDQGGGVGAEVRLRISYDGVHWGVPVTYDEMTRVWSKPTSDAERQVADGEDDFITAIASPDRYDMLSPTIVYDDFRDVFILWANNTGDVGYQNGQANFVEMRYSDDGITWGEPVRVNGFLGLDENGQQLAPWHQDVQYVPDLKEFVCISQCFAGRNPDGSVLHLTTSKDGVNWEQVGTKPLLSPGPDGSWDDFQIYRSSFYYEPGSSAGDGTMRVWYSALQKDTNNKMVADSSGNLTIQAKSEDDRIWRIGYAENSFVEMMRVLLDDPGYTTPALVSGNSLMLSAETTSLPTGDVMKLETSFAPVDTSDQVVKYTSSDPDVATVDEFGTITGVSVGSARIMAETREGLSDDLEIAVVENPYTLIPQSNMTATATSVYGGTTEGPASNVLDGNVRTIWHTNYAPKDELPQSITVSFDQPYTVGRFVYTPRQNGTNGIISEYELYAIHQDGSKDLVASGSDWALDAKDKTVSFAPVEAVGLELKAIAGAGGFGTAAELNVYAYGPIEPAPVYVPVDDRDASLVFTGAWNSDSNGSFYEGTARYTNEIGASVEFTFVGTAIRWYGQNDVNFGAAEVYVDGVLAGEVNVYGPAAAQQLLFEADGLAYGKHTIRIVCVSPVVDFDYFSYVGE

SEQ ID NO:5

Description: GalNAc deacetylase of Flavonifractor plautii with His tag (pET16a; protein sequence)

MGHHHHHHHHHHSSGADSSESALNKAPGYQDFPAYYSDSAHADDQVTHPDVVVLEEPWNGYRYWAVYTPNVMRISIYENPSIVASSDGVHWVEPEGLSNPIEPQPPSTRYHNCDADMVYNAEYDAMMAYWNWADDQGGGVGAEVRLRISYDGVHWGVPVTYDEMTRVWSKPTSDAERQVADGEDDFITAIASPDRYDMLSPTIVYDDFRDVFILWANNTGDVGYQNGQANFVEMRYSDDGITWGEPVRVNGFLGLDENGQQLAPWHQDVQYVPDLKEFVCISQCFAGRNPDGSVLHLTTSKDGVNWEQVGTKPLLSPGPDGSWDDFQIYRSSFYYEPGSSAGDGTMRVWYSALQKDTNNKMVADSSGNLTIQAKSEDDRIWRIGYAENSFVEMMRVLLDDPGYTTPALVSGNSLMLSAETTSLPTGDVMKLETSFAPVDTSDQVVKYTSSDPDVATVDEFGTITGVSVGSARIMAETREGLSDDLEIAVVENPYTLIPQSNMTATATSVYGGTTEGPASNVLDGNVRTIWHTNYAPKDELPQSITVSFDQPYTVGRFVYTPRQNGTNGIISEYELYAIHQDGSKDLVASGSDWALDAKDKTVSFAPVEAVGLELKAIAGAGGFGTAAELNVYAYGPIEPAPVYVPVDDRDASLVFTGAWNSDSNGSFYEGTARYTNEIGASVEFTFVGTAIRWYGQNDVNFGAAEVYVDGVLAGEVNVYGPAAAQQLLFEADGLAYGKHTIRIVCVSPVVDFDYFSYVGE

SEQ ID NO:7

Description: galactosaminidase of Flavonifractor plautii (protein sequence)

MRGKKFISLTLSTMLCLQLLPTASFAAAPATDTGNAGLIAEGDYAIAGNGVRVTYDADGQTITLYRTEGSGLIQMSKPSPLGGPVIGGQEVQDFSHISCDVEQSTSGVMGSGQRMTITSQSMSTGLIRTYVLETSDIEEGVVYTATSYEAGASDVEVSWFIGSVYELYGAEDRIWSYNGGGEGPMHYYDTLQKIDLTDSGKFSRENKQDDTAASIPVSDIYIADGGITVGDASATRREVHTPVQETSDSAQVSIGWPGKVIAAGSVIEIGESFAVVHPGDYYNGLRGYKNAMDHLGVIMPAPGDIPDSSYDLRWESWGWGFNWTIDLIIGKLDELQAAGVKQITLDDGWYTNAGDWALNPEKFPNGASDALRLTDAIHEHGMTALLWWRPCDGGIDSILYQQHPEYFVMDADGRPARLPTPGGGTNPSLGYALCPMADGAIASQVDFVNRAMNDWGFDGFKGDYVWSMPECYNPAHNHASPEESTEKQSEIYRVSYEAMVANDPNVFNLLCNCGTPQDYYSLPYMTQIATADPTSVDQTRRRVKAYKALMGDYFPVTADHNNIWYPSAVGTGSVLIEKRDLSGTAKEEYEKWLGIADTVQLQKGRFIGDLYSYGFDPYETYVVEKDGVMYYAFYKDGSKYSPTGYPDIELKGLDPNKMYRIVDYVNDRVVATNLMGDNAVFNTRFSDYLLVKAVEISEPDPEPVDPDYGFTSVDDRDEALIYTGTWHDDNNASFSEGTARYTNSTDASVVFSFTGTSIRWYGQRDTNFGTAEVYLDDELKTTVDANGAAEAGVCLFEALDLPAAEHTIKIVCKSGVIDIDRFAYEAATLEPIYEKVDALSDRITYVGNWEEYHNSEFYMGNAMRTDEAGAYAELTFRGTAVRLYAEMSFNFGTADVYLDGELVENIILYGQEATGQLMFERTGLEEGEHTIRLVQNAWNINLDYISYLPEQDQPTPPETTVTVDAMDAQLVYTGVWNDDYHDVFQEGTARYASSAGASVEFEFTGSEIRWYGQNDSNFGVASVYIDNEFVQQVNVNGAAAVGKLLFQKADLPAGSHTIRIVCDTPVIDLDYLTYTTNA

SEQ ID NO:9

Description: galactosaminidase of Flavonifractor plautii (protein sequence with signal peptide removed)

AAPATDTGNAGLIAEGDYAIAGNGVRVTYDADGQTITLYRTEGSGLIQMSKPSPLGGPVIGGQEVQDFSHISCDVEQSTSGVMGSGQRMTITSQSMSTGLIRTYVLETSDIEEGVVYTATSYEAGASDVEVSWFIGSVYELYGAEDRIWSYNGGGEGPMHYYDTLQKIDLTDSGKFSRENKQDDTAASIPVSDIYIADGGITVGDASATRREVHTPVQETSDSAQVSIGWPGKVIAAGSVIEIGESFAVVHPGDYYNGLRGYKNAMDHLGVIMPAPGDIPDSSYDLRWESWGWGFNWTIDLIIGKLDELQAAGVKQITLDDGWYTNAGDWALNPEKFPNGASDALRLTDAIHEHGMTALLWWRPCDGGIDSILYQQHPEYFVMDADGRPARLPTPGGGTNPSLGYALCPMADGAIASQVDFVNRAMNDWGFDGFKGDYVWSMPECYNPAHNHASPEESTEKQSEIYRVSYEAMVANDPNVFNLLCNCGTPQDYYSLPYMTQIATADPTSVDQTRRRVKAYKALMGDYFPVTADHNNIWYPSAVGTGSVLIEKRDLSGTAKEEYEKWLGIADTVQLQKGRFIGDLYSYGFDPYETYVVEKDGVMYYAFYKDGSKYSPTGYPDIELKGLDPNKMYRIVDYVNDRVVATNLMGDNAVFNTRFSDYLLVKAVEISEPDPEPVDPDYGFTSVDDRDEALIYTGTWHDDNNASFSEGTARYTNSTDASVVFSFTGTSIRWYGQRDTNFGTAEVYLDDELKTTVDANGAAEAGVCLFEALDLPAAEHTIKIVCKSGVIDIDRFAYEAATLEPIYEKVDALSDRITYVGNWEEYHNSEFYMGNAMRTDEAGAYAELTFRGTAVRLYAEMSFNFGTADVYLDGELVENIILYGQEATGQLMFERTGLEEGEHTIRLVQNAWNINLDYISYLPEQDQPTPPETTVTVDAMDAQLVYTGVWNDDYHDVFQEGTARYASSAGASVEFEFTGSEIRWYGQNDSNFGVASVYIDNEFVQQVNVNGAAAVGKLLFQKADLPAGSHTIRIVCDTPVIDLDYLTYTTNA

SEQ ID NO:10

Description: galactosaminidase of Flavonifractor plautii with His tag (pET16a; protein sequence)

MGHHHHHHHHHHSSGAAPATDTGNAGLIAEGDYAIAGNGVRVTYDADGQTITLYRTEGSGLIQMSKPSPLGGPVIGGQEVQDFSHISCDVEQSTSGVMGSGQRMTITSQSMSTGLIRTYVLETSDIEEGVVYTATSYEAGASDVEVSWFIGSVYELYGAEDRIWSYNGGGEGPMHYYDTLQKIDLTDSGKFSRENKQDDTAASIPVSDIYIADGGITVGDASATRREVHTPVQETSDSAQVSIGWPGKVIAAGSVIEIGESFAVVHPGDYYNGLRGYKNAMDHLGVIMPAPGDIPDSSYDLRWESWGWGFNWTIDLIIGKLDELQAAGVKQITLDDGWYTNAGDWALNPEKFPNGASDALRLTDAIHEHGMTALLWWRPCDGGIDSILYQQHPEYFVMDADGRPARLPTPGGGTNPSLGYALCPMADGAIASQVDFVNRAMNDWGFDGFKGDYVWSMPECYNPAHNHASPEESTEKQSEIYRVSYEAMVANDPNVFNLLCNCGTPQDYYSLPYMTQIATADPTSVDQTRRRVKAYKALMGDYFPVTADHNNIWYPSAVGTGSVLIEKRDLSGTAKEEYEKWLGIADTVQLQKGRFIGDLYSYGFDPYETYVVEKDGVMYYAFYKDGSKYSPTGYPDIELKGLDPNKMYRIVDYVNDRVVATNLMGDNAVFNTRFSDYLLVKAVEISEPDPEPVDPDYGFTSVDDRDEALIYTGTWHDDNNASFSEGTARYTNSTDASVVFSFTGTSIRWYGQRDTNFGTAEVYLDDELKTTVDANGAAEAGVCLFEALDLPAAEHTIKIVCKSGVIDIDRFAYEAATLEPIYEKVDALSDRITYVGNWEEYHNSEFYMGNAMRTDEAGAYAELTFRGTAVRLYAEMSFNFGTADVYLDGELVENIILYGQEATGQLMFERTGLEEGEHTIRLVQNAWNINLDYISYLPEQDQPTPPETTVTVDAMDAQLVYTGVWNDDYHDVFQEGTARYASSAGASVEFEFTGSEIRWYGQNDSNFGVASVYIDNEFVQQVNVNGAAAVGKLLFQKADLPAGSHTIRIVCDTPVIDLDYLTYTTNA

SEQ ID NO:12

Description: protein sequence isolated from Clostridium tertium (fusion of galactosaminidase and GalNAc deacetylase linked by a CBM; original protein sequence), identified as 99345757.1-Ct5757

MKKRILATFITAMCGLGFFSNWTSSNAYNLIDNISVEKLDTDISQANENVFLNGNGIALEVDNRGATCIYLVDENGVKTKATTSLDTADFSGYPIIGGQKIRDFVIISKNLEENINSILGVGNRLTIISKSSSTNLIRKIVFETSNSNPGAIYSTVSYKAESNDLLVDSFHENEYTMSLGQGPFLAYQGCADQQGANTIVNVTNGYNHNSGQNNYSVGVPFSYVYNSVGGIGIGDASTSRREFKLPIIGKDNTVSLGMEWNGQTLKKGAETAIGTSVITTTNGDYYSGLKSYAEVMKDKGISAPASIPDIAYDSRWESWGFEFDFTIEKIVNKLDELKAMGIKQITLDDGWYTYAGDWKLSPQKFPNGNADMKYLTDEIHKRGMTAILWWRPVDGGINSKLVSEHPEWFIKNSQGNMVRLPGPGGGNGGTAGYALCPNSEGSIQHHKDFVTVALEEWGFDGFKEDYVWGIPKCYDSSHKHSSLSDTLENQYKFYEAIYEQSIAINPDTFIELCNCGTPQDFYSTPYVNHAPTADPISRVQTRTRVKAFKAIFGDDFPVTTDHNSVWLPSALGTGSVMITKHTTLSSSDREQYNKYFGLARDLELAKGEFIGNLYKYGIDPLESYVIRKGEDIYYSFYKDNSSYSGNIEIKGLDSNATYRIEDYVNNRVIARGVKGPTATINTSFTDNLLVRAIPDDTPAEVTTFDVGNNTILSSTDSGNSKYLNAVSTTLEKTATIDSLSIYIGNNSENGKLQIAIYDDNNGKPGTKKAYVEEFVPTKNSWNTKKVVNSVTLPSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEKPLQNAETYLNIPTYDGLNQSTHPDVKYFKNGWNGYKYWMIMTPNRTGSSVAENPSILASDDGINWEVPAGVTNPIAPMPQVGHNCDVDMIYNEATDELWVYWVESDDITKGWVKLIKSKDGVNWSSQQVVVDDNRAKYSTLSPSIIFKDNKYYMWSVNTGNSGWNNQSNKVELRESSDGVNWSNPTVVNTLAQDGSQIWHVNVEYIPSKNEYWAIYPAYKNGTGSDKTELYYAKSSDGVNWTTYKNPILSKGTSGKWDDMEIYRSCFVYDEDTNMIKVWYGAVSQNPQIWKIGFTENDYDKFIEGLTQ

SEQ ID NO:14

Description: isolated protein sequence of Clostridium tertium 5757 (Ct5757), identified as 099345757.1-Ct5757, with the signal peptide removed

YNLIDNISVEKLDTDISQANENVFLNGNGIALEVDNRGATCIYLVDENGVKTKATTSLDTADFSGYPIIGGQKIRDFVIISKNLEENINSILGVGNRLTIISKSSSTNLIRKIVFETSNSNPGAIYSTVSYKAESNDLLVDSFHENEYTMSLGQGPFLAYQGCADQQGANTIVNVTNGYNHNSGQNNYSVGVPFSYVYNSVGGIGIGDASTSRREFKLPIIGKDNTVSLGMEWNGQTLKKGAETAIGTSVITTTNGDYYSGLKSYAEVMKDKGISAPASIPDIAYDSRWESWGFEFDFTIEKIVNKLDELKAMGIKQITLDDGWYTYAGDWKLSPQKFPNGNADMKYLTDEIHKRGMTAILWWRPVDGGINSKLVSEHPEWFIKNSQGNMVRLPGPGGGNGGTAGYALCPNSEGSIQHHKDFVTVALEEWGFDGFKEDYVWGIPKCYDSSHKHSSLSDTLENQYKFYEAIYEQSIAINPDTFIELCNCGTPQDFYSTPYVNHAPTADPISRVQTRTRVKAFKAIFGDDFPVTTDHNSVWLPSALGTGSVMITKHTTLSSSDREQYNKYFGLARDLELAKGEFIGNLYKYGIDPLESYVIRKGEDIYYSFYKDNSSYSGNIEIKGLDSNATYRIEDYVNNRVIARGVKGPTATINTSFTDNLLVRAIPDDTPAEVTTFDVGNNTILSSTDSGNSKYLNAVSTTLEKTATIDSLSIYIGNNSENGKLQIAIYDDNNGKPGTKKAYVEEFVPTKNSWNTKKVVNSVTLPSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEKPLQNAETYLNIPTYDGLNQSTHPDVKYFKNGWNGYKYWMIMTPNRTGSSVAENPSILASDDGINWEVPAGVTNPIAPMPQVGHNCDVDMIYNEATDELWVYWVESDDITKGWVKLIKSKDGVNWSSQQVVVDDNRAKYSTLSPSIIFKDNKYYMWSVNTGNSGWNNQSNKVELRESSDGVNWSNPTVVNTLAQDGSQIWHVNVEYIPSKNEYWAIYPAYKNGTGSDKTELYYAKSSDGVNWTTYKNPILSKGTSGKWDDMEIYRSCFVYDEDTNMIKVWYGAVSQNPQIWKIGFTENDYDKFIEGLTQ

SEQ ID NO:15

Description: fusion protein sequence of Clostridium tertium 5757 (Ct5757) with His tag and thrombin cleavage site; expression construct (in pET28a vector)

MGSSHHHHHHSSGLVPRGSHYNLIDNISVEKLDTDISQANENVFLNGNGIALEVDNRGATCIYLVDENGVKTKATTSLDTADFSGYPIIGGQKIRDFVIISKNLEENINSILGVGNRLTIISKSSSTNLIRKIVFETSNSNPGAIYSTVSYKAESNDLLVDSFHENEYTMSLGQGPFLAYQGCADQQGANTIVNVTNGYNHNSGQNNYSVGVPFSYVYNSVGGIGIGDASTSRREFKLPIIGKDNTVSLGMEWNGQTLKKGAETAIGTSVITTTNGDYYSGLKSYAEVMKDKGISAPASIPDIAYDSRWESWGFEFDFTIEKIVNKLDELKAMGIKQITLDDGWYTYAGDWKLSPQKFPNGNADMKYLTDEIHKRGMTAILWWRPVDGGINSKLVSEHPEWFIKNSQGNMVRLPGPGGGNGGTAGYALCPNSEGSIQHHKDFVTVALEEWGFDGFKEDYVWGIPKCYDSSHKHSSLSDTLENQYKFYEAIYEQSIAINPDTFIELCNCGTPQDFYSTPYVNHAPTADPISRVQTRTRVKAFKAIFGDDFPVTTDHNSVWLPSALGTGSVMITKHTTLSSSDREQYNKYFGLARDLELAKGEFIGNLYKYGIDPLESYVIRKGEDIYYSFYKDNSSYSGNIEIKGLDSNATYRIEDYVNNRVIARGVKGPTATINTSFTDNLLVRAIPDDTPAEVTTFDVGNNTILSSTDSGNSKYLNAVSTTLEKTATIDSLSIYIGNNSENGKLQIAIYDDNNGKPGTKKAYVEEFVPTKNSWNTKKVVNSVTLPSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEKPLQNAETYLNIPTYDGLNQSTHPDVKYFKNGWNGYKYWMIMTPNRTGSSVAENPSILASDDGINWEVPAGVTNPIAPMPQVGHNCDVDMIYNEATDELWVYWVESDDITKGWVKLIKSKDGVNWSSQQVVVDDNRAKYSTLSPSIIFKDNKYYMWSVNTGNSGWNNQSNKVELRESSDGVNWSNPTVVNTLAQDGSQIWHVNVEYIPSKNEYWAIYPAYKNGTGSDKTELYYAKSSDGVNWTTYKNPILSKGTSGKWDDMEIYRSCFVYDEDTNMIKVWYGAVSQNPQIWKIGFTENDYDKFIEGLTQ

SEQ ID NO:17

Description: GalNAc deacetylase protein sequence of Clostridium tertium 5757 (Ct5757) with His tag and thrombin cleavage site; expression construct (in pET28a vector)

MGSSHHHHHHSSGLVPRGSHSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEKPLQNAETYLNIPTYDGLNQSTHPDVKYFKNGWNGYKYWMIMTPNRTGSSVAENPSILASDDGINWEVPAGVTNPIAPMPQVGHNCDVDMIYNEATDELWVYWVESDDITKGWVKLIKSKDGVNWSSQQVVVDDNRAKYSTLSPSIIFKDNKYYMWSVNTGNSGWNNQSNKVELRESSDGVNWSNPTVVNTLAQDGSQIWHVNVEYIPSKNEYWAIYPAYKNGTGSDKTELYYAKSSDGVNWTTYKNPILSKGTSGKWDDMEIYRSCFVYDEDTNMIKVWYGAVSQNPQIWKIGFTENDYDKFIEGLTQ

SEQ ID NO:19

Description: galactosaminidase protein sequence of Clostridium tertium 5757 (Ct5757) with His tag and thrombin cleavage site; expression construct (in pET28a vector)

MGSSHHHHHHSSGLVPRGSHYNLIDNISVEKLDTDISQANENVFLNGNGIALEVDNRGATCIYLVDENGVKTKATTSLDTADFSGYPIIGGQKIRDFVIISKNLEENINSILGVGNRLTIISKSSSTNLIRKIVFETSNSNPGAIYSTVSYKAESNDLLVDSFHENEYTMSLGQGPFLAYQGCADQQGANTIVNVTNGYNHNSGQNNYSVGVPFSYVYNSVGGIGIGDASTSRREFKLPIIGKDNTVSLGMEWNGQTLKKGAETAIGTSVITTTNGDYYSGLKSYAEVMKDKGISAPASIPDIAYDSRWESWGFEFDFTIEKIVNKLDELKAMGIKQITLDDGWYTYAGDWKLSPQKFPNGNADMKYLTDEIHKRGMTAILWWRPVDGGINSKLVSEHPEWFIKNSQGNMVRLPGPGGGNGGTAGYALCPNSEGSIQHHKDFVTVALEEWGFDGFKEDYVWGIPKCYDSSHKHSSLSDTLENQYKFYEAIYEQSIAINPDTFIELCNCGTPQDFYSTPYVNHAPTADPISRVQTRTRVKAFKAIFGDDFPVTTDHNSVWLPSALGTGSVMITKHTTLSSSDREQYNKYFGLARDLELAKGEFIGNLYKYGIDPLESYVIRKGEDIYYSFYKDNSSYSGNIEIKGLDSNATYRIEDYVNNRVIARGVKGPTATINTSFTDNLLVRAIPDDTPAEVTTFDVGNNTILSSTDSGNSKYLNAVSTTLEKTATIDSLSIYIGNNSENGKLQIAIYDDNNGKPGTKKAYVEEFVPTKNSWNTKKVVNSVTLPSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEK

SEQ ID NO:21

Description: galactosaminidase protein sequence of Robinsoniella peoriensis Rp1021 with His tag and thrombin cleavage site; expression construct (in pET28a vector)

MGSSHHHHHHSSGLVPRGSHGNGLEVKASPREVAQITGNGVSVTFFQEDGTVQLSCIEDDGNTAFMTRNSEVSYPVVGGEEVTDFSDFQCEVQENVTGAAGAGSRMTITSISSGRGIQRSVVIETVDEVKGLLHISSSYRAEEEVDADEFIDSRFSLDNPSDTVWSYNGGGEGAQSRYDTLQKIDLSDGESFYRENLQNQTAAGIPVADIYGKDGGITVGDASVTRRQLSTPVNERNGTAYVSVKHPGAVITQRETEISQSFVNVHRGDYYSGLRGYADGMKQIGFTTLSREQIPESSYDLRWESWGWEFDWTVELIINKLDELKEMGIKQITLDDGWYNAAGEWGLNNWKLPNGALDMRHLTDAIHERGMTAVLWWRPCDGGREDSALFKEHPEYFIKNQDGSFGKLAGPGQWNSFLGSCGYALCPLSEGAVQSQVDFINRAMNEWGFDGFKSDYVWSLPKCYSQDHHHEYPEESTEQQAVFYRAVYEAMTDNDPNAFHLLCNCGTPQDYYSLPYVTQVPTADPTSVDQTRRRVKAYKALCGDYFPVTTDHNEVWYPSTIGTGAILIEKRDLSGWEEEEYAKWLKIAQENQLHKGTFIGDLYSYGYDPYETYTVYKDGIMYYAFYKDGNRYRPSGNPDIELKGLEDGKLYRIVDYVNNQVVATNVTSSNAVFSYPFSDYLLVKAVEISEPDTDGPGPVPDPEGAVTVEENDPELVYTGDWVREENDGYHGGGARYTKEAEASVELAFYGTGAAWYGQHDVNFGSARIYIDGTYVKTVSCMGEPGINIKLFEISGLDLASHRIKIECETPVIDIDRLTYIKGEEVPAKVMTADLRALTVIANQYDMNSFADGNYKDQLGVSLVRANQLLAADDVTQGAVNEEQKYLLNAMLKIRKKVDKSWIGLPGPIPQDIQTENISRDNLAKVISYTGQLDRDEIIPAIKEQLNDSYDKAVSIAERQDASQPEIDRAWAELMNAVQYSSYIRGSKEELLSLLDEYGKVDTTVYKDAALFIESLEAAKKVYQDENAMDGEISDCIKQLRDAKDQLQLKDPVDPPKPDPDPDPKPDPTPDPGPDPKPDPTPDPTPDPKPNPTPTPDPTPEPALKKPEQVSGLKSKAETDYLTVSWKKLNNAESYKVYIYKSGKWRLAGKTTKTSIKIKKLVSGTKYTVKVAAVNKAGQGKYSSQVYTAAKPKKVKLKSVSRYRTSKVKLNYGKVKAGGYEIWMKNGKGSYKKAATSTKTTAIKSGLKKGKTYYFKVRAYVKNKNQVIYGSFSNIKKYKMVL

SEQ ID NO:23

Description: GalNAc deacetylase protein sequence of Ruthenium lactaforts Rl8755 with His tag and thrombin cleavage site; expression construct (in pET28a vector)

MGSSHHHHHHSSGLVPRGSHEETDLLVNGGFETGDSTGWNWFNNAVVDSAAPHSGNYCAKVAKNSSYEQVVTVSPDTKYVLTGWAKSEGSSVMTLGVKNYGGQETFSATLSADYQQLAVTFTTGPNAQTATIYGYRQNSGSGAGYFDDVELTAVQDFAPYQPLANAIAPQAIPTYDGANQPTHPSVVKFEQPWNGYLYWMAMTPYPFNDGSYENPSIVASNDGENWIVPEGVSNPLAGTPSPGHNCDVDLVYVPASDELRMYYVEADDIISSRVKMISSRDGVHWSEPQVVMQDLVRKYSILSPSIEILPDGTYMMWYVDTGNAGWNSQNNQVKYRTSADGIKWSGAVTCTDFVQPGYQIWHIDVHYDTSSGAYYAVYPAYPNGTDCDHCNLFFAVNRTGKQWETFSRPILKPSTEGGWDDFCIYRSSMLIDDGMLKVWYGAKKQEDSSWHTGLTMRDFSEFMKILER

SEQ ID NO:25

Description: Robinsoniella peoriensis Rp3671 GalNAc deacetylase protein expression construct with His tag and thrombin cleavage site (in the pET28a vector)

MGSSHHHHHHSSGLVPRGSHSPLSAAAESGTGTRLVKGQTGYLTEEQAIRNQEQTTEEREQKLTGEETAEVLMEGTKDSGIVQTEEVQTKEMQTEDAQTEEVQTEEMQTEDAQTKEVQTEEMQTEDAQTEEVQTKEEPAEETHMKEIQTQGTKKASDRNGKARVTEILEDAQDPANRIVYLSDLQWKSENHTVDSELPTRKDKSFGGGKITLKVDGTVTEFDKGIGTQTDSTIVYDLEGKGYTKFETYVGVDYSQKENIPGEVCDVKFRVKIDDKIVSETGVLDPLSNAVKISVNIPDTAKTLTLYADKVTETWSDHANWADAKFYQALPEPENVAFKKTVVTRKTSDNSEAPVNPDSAVNSSKAVDGVIDSSSYFDFGDQANSGAVRESLYMEVDLKGSYLLSDIQLWRYWKDGRTYAATAIVVAEDENFENAAVIYNSDTTGEIHHLGAGSDMLYAETESGKTFPVPENTKARYIRVYTYGVNGTSGVTNHIVELKVNAYVFGDEILPEKPDDSKIFPNAVNPLKLQGPGTNDQVTHPDVTVFDEPWNGYKYWMAYTPNKPGSSYFENPCIAASNDGVNWEFPAQNPVQPRYDSEIENQNEHNCDTDIVYDPVNDRLIMYWEWAQDEAVNGKTHRSEIRYRVSYDGINWGVEDKTGVLMTGPTDHGCAIATEGERYSDLSPTVVYDKTEKIYKMWANDAGDVGYENKQNNKVWYRTSQDGISNWSDKTYVENFLGVNEDGLQMYPWHQDIQWVEEFQEYWALQQAFPAGSGPDNSSLRFSKSKDGLHWEPVSEKALITVGAPGTWDAGQIYRSTFWYEPGGAKGNGTFHIWYAALAEGQSHWDIGYTSANYADAMYKLTGSRPEVEKRIEVNNENPLLIMPLYGKSYSESGSTLDWGDDLVSRWKQVPEDLKENAVIEIHLGGKIGLNESDSHTAKAFYEQQLAIAQENNIPVMMVVATAGQQNYWTGTANLDAEWIDRMFKQHSVLKGIMSTENYWTDYNKVATMGADYLRVAAENGGYFVWSEHQEGVIENVIANEKFNEALKLYGNNFIFTWKNTPAGTNSNAGTASYMQGLWLTGICAQWGGLADTWKWYEKGFGKLFDGQYSYNPGGEEARPVATEPEALLGIEMMSIYTNGGCVYNFEHPAYVYGSYNQNSPCFENVIAEFMRYAIKNPAPGKEEVLADTKAVFYGKLSSLKSAGNLLQKGLNWEDATLPTQTTGRYGLIPAVPEAVDEKTVKAVFGDIEILNQSSAQLANKDAKKAYFEEKYPEQYTGTAFGQLLNDTWYLYNSNVNVDGVQNAKLPLEGNKSVDITMTPHTYVILDDQDGELQIKLNNYRVDKDSIWEGYGTTVTDRWDTDHNTKLQDWIRDEYIPNPDDDTFRDTTFELVGLESEPEVNVTNGLKDQYQEPVVEYDAAAGTAMITVSGNGWVDLTIDTNTAEVPQVDKAKLNSKIAEAKGIRQGNYTDESYKALQEEIGKSQAVSNKTDATQEEVNAQLSRLESAIARLKEKPAVVSKTALNAKIAEAKGIRQGNYTDESYKALQNAIVKAQELSNKTDATQQQVNDLVSALTNAIKNLKIDADKLAAESAKKVAAVKVAVKAVSYKSKEIKLSWKTVADADGYVIRVKTGKKWSTEKTIKNNRIITYTYKKGTPGKKYVFEVKAFKKVNGKTTYSKYKTATKKVVPQTVTAKAKASKNNVVVKWNKVSGASGYVVMKKKGKTWVKAAQVNAKKLYFTDKKVKKGKVYSYKVKAYKVYKGKKVYGSYSKSVNVKTKS

SEQ ID NO:27

Description: Robinsoniella peoriensis Rp3672 GalNAc deacetylase protein expression construct with His tag and thrombin cleavage site (in the pET28a vector)

MGSSHHHHHHSSGLVPRGSHAETATEENAALEKTVTLHKSDGTELPEDYRNPQRPATMAVDGIIDDTGEYNYCDFGKDGDKAALYMQVDLGGLYDLSRVNMWRYWKDSRTYDATVITTSESGDFTDEAVIYNSDRSNVHGFGAGGDERYAETASGHEFPVPDGTKAQAVRVYVFGSQNGTTNHINELQVWGTPHTENPDVNSYQVTIPQGNGYQVIPYENDPTTVEEGGSFRFQVLIDSDNGYSATSAVKANGVSLEAVDSVYTIENITEDQVITIEGVHKAQYEVKFPENPQGYSVEIQNEGSTTVDYNGSVSFKLIIDEAYNESVPVVKANGGAALGKDELGVYTIANIQDDITVTVEGIQENTVVKTKTMYLSDMDWKSAANAVGATGEKDTPTKDLNHLQQQMKLLVNGAEKSFDKGIGVQTDSSIVYDLEDKGYTSFHTLAGVDYSAMEYVDGEGCDIQFKVYLDDVVVFDSGVVDASDEAQEVNVAITSENKELKLEAKMVKEPYNDWGNWADASFEMAYPEPSNVALNKTVTVKKTADNSDSEVNSSRPGSMAVDGIIGPTSDSNYCDFGQDGDNTSRYLQVDLGDVYELTQINMFRYWADGRVYNGTVIAVSENADFSNPTFIYNSDKADKHGLGAGSDDTYGETQSGKLFEVPAGTMGQYVRVYMAGSNKGTTNHIAELQVMGYNFNTEPKPYEANAFENAEVYLDMPTHFQDLDSNKNDDGSLKHIGGQVTHPDIQVFDQPWNGYKYWMIYTPNTMITSQYENPYIVASEDGQTWVEPEGISNPIEPEPPSTRFHNCDADLLYDSVNDRLLAYWNWADDGGGIDDELKDQNCQIRLRISYDGINWGVPYDKDGNIATTADTVVRMETGDKDFIPAISEKDRYGMLSPTFTYDDFRGIYTMWAQNSGDAGYNQSGKFIEMRWSEDGINWSEPQKVNNFLGKDENGRQLWPWHQDIQYIPELQEYWGLSQCFSTSNPDGSVLYLTKSRDGVNWEQAGTQPVLRAGKSGTWDDFQIYRSTFYYDNQSDSPTGGKFRIWYSALQANTSGKTVLAPDGTVSLQVGSQDTRIWRIGYTENDYMEVMKALTQNKNYEEPELVDAVSLNLSMDKTSISVGEEATVSTAFVPENATDRIVKYTSQDPEIAVIDPTGIVTGVKDGTTTIVAETKSGAKGELSVTVGELQRGEIRFEVSNDHPMYLENYYWSDDAPKKDGLDANKNYYGDERVDSPVMLYNTVPEELKDNTVILLIAERSLNSTDAVRDWIKKNVELCNENKIPCAVQIANGETNVNTTIPLSFWNELATNNEYLVGFNAAEMYNRFAGDNRSYVMDMIRLGVSHGVCMMWTDTNIFGTNGVLYDWLTQDEKLSGLMREYKEYISLMTKESYGSEAANTDALFKGLWMTDYCENWGIASDWWHWQLDSNGALFDAGSGGDAWKQCLTWPENMYTQDVVRAVSQGATCFKSEAQWYSNATKGMRTPTYQYSMIPFLEKLVSKEVKIPTKEEMLERTKAIVVGAENWNNFNYNTTYSNLYPSTGQYGIVPYVPSNCPEEELAGYDLVVRENLGKAGLKSALDTVYPVQKSEGTAYCETFGDTWYWMNSSEDKNVSQYTEFTTAINGAESVKIAGEPHVFGIIKENPGSLNVYLSNYRLDKTELWDGTIPGGLSDQGCYNYVWQMCERMKNGTGLDTQLRDTVITVKNAVEPKVNFVTESPADRSFAEDNYVRPYKYTVAQKEGTTDEWVITVSHNGIVEFNIVTGDEKVPATSVELSTDKVDVIRNRTAVVKATVLPQNAGNKQLTWTIADPEIASVDNKGTVTGLKEGKTVLRAAISGSVYKECEVNVIDRKVTEVNLNKTELSLSAGDSAKLEASIAPEDPSDSSITWTSTNENVATVASNGTVTAHKAGVAQIIAQSAYQAKGIATVTVNYAASVKLDRTGMTATANSEQSKSGGEGPASNVLDGKQDTMWHTSWTDKPELHPHWIKIDLNGTKTINKFA
YTPRTGASNGTIYNYVLIITDLEGNEKQVAKGVWAANADVKYAEFDAVEATAIKLQVDGNDDKASKGGYGSAAEINIFEVAQKPSANELAENIKVIAPVKAEDTKVSIPVITGFDIVISNSSNPDVIGIDGSITRPENDTVVTLTLKVKETDAKSVKAAGTEATTNVDVLVTGTKTSDVEAESVTLDQTSADLTVGGELLLNAVVKPDIATNKAVTWSSDKPGTATVENGRVKALAAGEARITAATANGKTADCVINVKEKEEPEVILPAEVRLNIPSAEFTVGDQIQLTASVLPANAADKTITWKSDKPEVATVANGWVKGIAAGTAKITATSVNGKTAVCVITVKAQPQNLPTGVSLNKKTASVKLNKTLTLSAVVQPSNADNKTVKWTSDNTYVATVENGVVKAVNAGTARITAATVNGHKATCTITVPGTKISKAKVSLASSKTHTGKALKPSVKVTYGKNTLKKNTDYTVSYKNNINPGTASVTITGKGKYYGTINKTFAIKAAEGKTYTVGKGKYKVTDASAKNKTVTFMAPVKKTYSSFSVPSKVKIGNDTYKVTAVAKNAFKKNTKLTKLTIGSNVKTIGSYAFYGASQLKTLTLKTTGLNSVGKNAFKKTNAKLTVKVPKSKLADYKKLLKGKGLSGKAKIQK

SEQ ID NO:29

Description: Robinsoniella peoriensis Rp3671 GalNAc deacetylase protein expression construct with His tag and thrombin cleavage site (in the pET28a vector)

MGSSHHHHHHSSGLVPRGSHSPLSAAAESGTGTRLVKGQTGYLTEEQAIRNQEQTTEEREQKLTGEETAEVLMEGTKDSGIVQTEEVQTKEMQTEDAQTEEVQTEEMQTEDAQTKEVQTEEMQTEDAQTEEVQTKEEPAEETHMKEIQTQGTKKASDRNGKARVTEILEDAQDPANRIVYLSDLQWKSENHTVDSELPTRKDKSFGGGKITLKVDGTVTEFDKGIGTQTDSTIVYDLEGKGYTKFETYVGVDYSQKENIPGEVCDVKFRVKIDDKIVSETGVLDPLSNAVKISVNIPDTAKTLTLYADKVTETWSDHANWADAKFYQALPEPENVAFKKTVVTRKTSDNSEAPVNPDSAVNSSKAVDGVIDSSSYFDFGDQANSGAVRESLYMEVDLKGSYLLSDIQLWRYWKDGRTYAATAIVVAEDENFENAAVIYNSDTTGEIHHLGAGSDMLYAETESGKTFPVPENTKARYIRVYTYGVNGTSGVTNHIVELKVNAYVFGDEILPEKPDDSKIFPNAVNPLKLQGPGTNDQVTHPDVTVFDEPWNGYKYWMAYTPNKPGSSYFENPCIAASNDGVNWEFPAQNPVQPRYDSEIENQNEHNCDTDIVYDPVNDRLIMYWEWAQDEAVNGKTHRSEIRYRVSYDGINWGVEDKTGVLMTGPTDHGCAIATEGERYSDLSPTVVYDKTEKIYKMWANDAGDVGYENKQNNKVWYRTSQDGISNWSDKTYVENFLGVNEDGLQMYPWHQDIQWVEEFQEYWALQQAFPAGSGPDNSSLRFSKSKDGLHWEPVSEKALITVGAPGTWDAGQIYRSTFWYEPGGAKGNGTFHIWYAALAEGQSHWDIGYTSANYADAMYKLTGSR

SEQ ID NO:31

Description: Robinsoniella peoriensis Rp3672 GalNAc deacetylase protein expression construct with His tag and thrombin cleavage site (in the pET28a vector)

MGSSHHHHHHSSGLVPRGSHAETATEENAALEKTVTLHKSDGTELPEDYRNPQRPATMAVDGIIDDTGEYNYCDFGKDGDKAALYMQVDLGGLYDLSRVNMWRYWKDSRTYDATVITTSESGDFTDEAVIYNSDRSNVHGFGAGGDERYAETASGHEFPVPDGTKAQAVRVYVFGSQNGTTNHINELQVWGTPHTENPDVNSYQVTIPQGNGYQVIPYENDPTTVEEGGSFRFQVLIDSDNGYSATSAVKANGVSLEAVDSVYTIENITEDQVITIEGVHKAQYEVKFPENPQGYSVEIQNEGSTTVDYNGSVSFKLIIDEAYNESVPVVKANGGAALGKDELGVYTIANIQDDITVTVEGIQENTVVKTKTMYLSDMDWKSAANAVGATGEKDTPTKDLNHLQQQMKLLVNGAEKSFDKGIGVQTDSSIVYDLEDKGYTSFHTLAGVDYSAMEYVDGEGCDIQFKVYLDDVVVFDSGVVDASDEAQEVNVAITSENKELKLEAKMVKEPYNDWGNWADASFEMAYPEPSNVALNKTVTVKKTADNSDSEVNSSRPGSMAVDGIIGPTSDSNYCDFGQDGDNTSRYLQVDLGDVYELTQINMFRYWADGRVYNGTVIAVSENADFSNPTFIYNSDKADKHGLGAGSDDTYGETQSGKLFEVPAGTMGQYVRVYMAGSNKGTTNHIAELQVMGYNFNTEPKPYEANAFENAEVYLDMPTHFQDLDSNKNDDGSLKHIGGQVTHPDIQVFDQPWNGYKYWMIYTPNTMITSQYENPYIVASEDGQTWVEPEGISNPIEPEPPSTRFHNCDADLLYDSVNDRLLAYWNWADDGGGIDDELKDQNCQIRLRISYDGINWGVPYDKDGNIATTADTVVRMETGDKDFIPAISEKDRYGMLSPTFTYDDFRGIYTMWAQNSGDAGYNQSGKFIEMRWSEDGINWSEPQKVNNFLGKDENGRQLWPWHQDIQYIPELQEYWGLSQCFSTSNPDGSVLYLTKSRDGVNWEQAGTQPVLRAGKSGTWDDFQIYRSTFYYDNQSDSPTGGKFRIWYSALQANTSGKTVLAPDGTVSLQVGSQDTRIWRIGYTENDYMEVMKALTQNKNYEE

SEQ ID NO:32

Description: Clostridium tertium 5757 (Ct5757) GalNAc deacetylase protein sequence

HSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEKPLQNAETYLNIPTYDGLNQSTHPDVKYFKNGWNGYKYWMIMTPNRTGSSVAENPSILASDDGINWEVPAGVTNPIAPMPQVGHNCDVDMIYNEATDELWVYWVESDDITKGWVKLIKSKDGVNWSSQQVVVDDNRAKYSTLSPSIIFKDNKYYMWSVNTGNSGWNNQSNKVELRESSDGVNWSNPTVVNTLAQDGSQIWHVNVEYIPSKNEYWAIYPAYKNGTGSDKTELYYAKSSDGVNWTTYKNPILSKGTSGKWDDMEIYRSCFVYDEDTNMIKVWYGAVSQNPQIWKIGFTENDYDKFIEGLTQ

SEQ ID NO:33

Description: Ruthenibacterium lactatiformans Rl8755 GalNAc deacetylase protein sequence

HEETDLLVNGGFETGDSTGWNWFNNAVVDSAAPHSGNYCAKVAKNSSYEQVVTVSPDTKYVLTGWAKSEGSSVMTLGVKNYGGQETFSATLSADYQQLAVTFTTGPNAQTATIYGYRQNSGSGAGYFDDVELTAVQDFAPYQPLANAIAPQAIPTYDGANQPTHPSVVKFEQPWNGYLYWMAMTPYPFNDGSYENPSIVASNDGENWIVPEGVSNPLAGTPSPGHNCDVDLVYVPASDELRMYYVEADDIISSRVKMISSRDGVHWSEPQVVMQDLVRKYSILSPSIEILPDGTYMMWYVDTGNAGWNSQNNQVKYRTSADGIKWSGAVTCTDFVQPGYQIWHIDVHYDTSSGAYYAVYPAYPNGTDCDHCNLFFAVNRTGKQWETFSRPILKPSTEGGWDDFCIYRSSMLIDDGMLKVWYGAKKQEDSSWHTGLTMRDFSEFMKILER

SEQ ID NO:34

Description: Robinsoniella peoriensis Rp3671 GalNAc deacetylase protein

HSPLSAAAESGTGTRLVKGQTGYLTEEQAIRNQEQTTEEREQKLTGEETAEVLMEGTKDSGIVQTEEVQTKEMQTEDAQTEEVQTEEMQTEDAQTKEVQTEEMQTEDAQTEEVQTKEEPAEETHMKEIQTQGTKKASDRNGKARVTEILEDAQDPANRIVYLSDLQWKSENHTVDSELPTRKDKSFGGGKITLKVDGTVTEFDKGIGTQTDSTIVYDLEGKGYTKFETYVGVDYSQKENIPGEVCDVKFRVKIDDKIVSETGVLDPLSNAVKISVNIPDTAKTLTLYADKVTETWSDHANWADAKFYQALPEPENVAFKKTVVTRKTSDNSEAPVNPDSAVNSSKAVDGVIDSSSYFDFGDQANSGAVRESLYMEVDLKGSYLLSDIQLWRYWKDGRTYAATAIVVAEDENFENAAVIYNSDTTGEIHHLGAGSDMLYAETESGKTFPVPENTKARYIRVYTYGVNGTSGVTNHIVELKVNAYVFGDEILPEKPDDSKIFPNAVNPLKLQGPGTNDQVTHPDVTVFDEPWNGYKYWMAYTPNKPGSSYFENPCIAASNDGVNWEFPAQNPVQPRYDSEIENQNEHNCDTDIVYDPVNDRLIMYWEWAQDEAVNGKTHRSEIRYRVSYDGINWGVEDKTGVLMTGPTDHGCAIATEGERYSDLSPTVVYDKTEKIYKMWANDAGDVGYENKQNNKVWYRTSQDGISNWSDKTYVENFLGVNEDGLQMYPWHQDIQWVEEFQEYWALQQAFPAGSGPDNSSLRFSKSKDGLHWEPVSEKALITVGAPGTWDAGQIYRSTFWYEPGGAKGNGTFHIWYAALAEGQSHWDIGYTSANYADAMYKLTGSR

SEQ ID NO:35

Description: Robinsoniella peoriensis Rp3672 GalNAc deacetylase protein

HAETATEENAALEKTVTLHKSDGTELPEDYRNPQRPATMAVDGIIDDTGEYNYCDFGKDGDKAALYMQVDLGGLYDLSRVNMWRYWKDSRTYDATVITTSESGDFTDEAVIYNSDRSNVHGFGAGGDERYAETASGHEFPVPDGTKAQAVRVYVFGSQNGTTNHINELQVWGTPHTENPDVNSYQVTIPQGNGYQVIPYENDPTTVEEGGSFRFQVLIDSDNGYSATSAVKANGVSLEAVDSVYTIENITEDQVITIEGVHKAQYEVKFPENPQGYSVEIQNEGSTTVDYNGSVSFKLIIDEAYNESVPVVKANGGAALGKDELGVYTIANIQDDITVTVEGIQENTVVKTKTMYLSDMDWKSAANAVGATGEKDTPTKDLNHLQQQMKLLVNGAEKSFDKGIGVQTDSSIVYDLEDKGYTSFHTLAGVDYSAMEYVDGEGCDIQFKVYLDDVVVFDSGVVDASDEAQEVNVAITSENKELKLEAKMVKEPYNDWGNWADASFEMAYPEPSNVALNKTVTVKKTADNSDSEVNSSRPGSMAVDGIIGPTSDSNYCDFGQDGDNTSRYLQVDLGDVYELTQINMFRYWADGRVYNGTVIAVSENADFSNPTFIYNSDKADKHGLGAGSDDTYGETQSGKLFEVPAGTMGQYVRVYMAGSNKGTTNHIAELQVMGYNFNTEPKPYEANAFENAEVYLDMPTHFQDLDSNKNDDGSLKHIGGQVTHPDIQVFDQPWNGYKYWMIYTPNTMITSQYENPYIVASEDGQTWVEPEGISNPIEPEPPSTRFHNCDADLLYDSVNDRLLAYWNWADDGGGIDDELKDQNCQIRLRISYDGINWGVPYDKDGNIATTADTVVRMETGDKDFIPAISEKDRYGMLSPTFTYDDFRGIYTMWAQNSGDAGYNQSGKFIEMRWSEDGINWSEPQKVNNFLGKDENGRQLWPWHQDIQYIPELQEYWGLSQCFSTSNPDGSVLYLTKSRDGVNWEQAGTQPVLRAGKSGTWDDFQIYRSTFYYDNQSDSPTGGKFRIWYSALQANTSGKTVLAPDGTVSLQVGSQDTRIWRIGYTENDYMEVMKALTQNKNYEE

SEQ ID NO:36

Description: Clostridium tertium 5757 (Ct5757) galactosaminidase protein sequence

HYNLIDNISVEKLDTDISQANENVFLNGNGIALEVDNRGATCIYLVDENGVKTKATTSLDTADFSGYPIIGGQKIRDFVIISKNLEENINSILGVGNRLTIISKSSSTNLIRKIVFETSNSNPGAIYSTVSYKAESNDLLVDSFHENEYTMSLGQGPFLAYQGCADQQGANTIVNVTNGYNHNSGQNNYSVGVPFSYVYNSVGGIGIGDASTSRREFKLPIIGKDNTVSLGMEWNGQTLKKGAETAIGTSVITTTNGDYYSGLKSYAEVMKDKGISAPASIPDIAYDSRWESWGFEFDFTIEKIVNKLDELKAMGIKQITLDDGWYTYAGDWKLSPQKFPNGNADMKYLTDEIHKRGMTAILWWRPVDGGINSKLVSEHPEWFIKNSQGNMVRLPGPGGGNGGTAGYALCPNSEGSIQHHKDFVTVALEEWGFDGFKEDYVWGIPKCYDSSHKHSSLSDTLENQYKFYEAIYEQSIAINPDTFIELCNCGTPQDFYSTPYVNHAPTADPISRVQTRTRVKAFKAIFGDDFPVTTDHNSVWLPSALGTGSVMITKHTTLSSSDREQYNKYFGLARDLELAKGEFIGNLYKYGIDPLESYVIRKGEDIYYSFYKDNSSYSGNIEIKGLDSNATYRIEDYVNNRVIARGVKGPTATINTSFTDNLLVRAIPDDTPAEVTTFDVGNNTILSSTDSGNSKYLNAVSTTLEKTATIDSLSIYIGNNSENGKLQIAIYDDNNGKPGTKKAYVEEFVPTKNSWNTKKVVNSVTLPSGQYWLVFQPDNDVLQTKTNPSSMKQSANNNPYNYNILPNSFPIGTGYNAYKGDVSFYATFKEASSQAIPQNSWALKYVDSEETTGENGRATNAFDGNNNTIWHTKYSGGNAAPMPHEIQIDLRGVYNINQINYLPRQDGGTNGTIKDYEVYLSLDGVNWGQPISKGTFESNSTEKIVKFNETKSRYVKLKALSEINNKQFTTVADLKVFGWEISKIEK

SEQ ID NO:37

Description: Robinsoniella peoriensis Rp1021 galactosaminidase protein sequence

HGNGLEVKASPREVAQITGNGVSVTFFQEDGTVQLSCIEDDGNTAFMTRNSEVSYPVVGGEEVTDFSDFQCEVQENVTGAAGAGSRMTITSISSGRGIQRSVVIETVDEVKGLLHISSSYRAEEEVDADEFIDSRFSLDNPSDTVWSYNGGGEGAQSRYDTLQKIDLSDGESFYRENLQNQTAAGIPVADIYGKDGGITVGDASVTRRQLSTPVNERNGTAYVSVKHPGAVITQRETEISQSFVNVHRGDYYSGLRGYADGMKQIGFTTLSREQIPESSYDLRWESWGWEFDWTVELIINKLDELKEMGIKQITLDDGWYNAAGEWGLNNWKLPNGALDMRHLTDAIHERGMTAVLWWRPCDGGREDSALFKEHPEYFIKNQDGSFGKLAGPGQWNSFLGSCGYALCPLSEGAVQSQVDFINRAMNEWGFDGFKSDYVWSLPKCYSQDHHHEYPEESTEQQAVFYRAVYEAMTDNDPNAFHLLCNCGTPQDYYSLPYVTQVPTADPTSVDQTRRRVKAYKALCGDYFPVTTDHNEVWYPSTIGTGAILIEKRDLSGWEEEEYAKWLKIAQENQLHKGTFIGDLYSYGYDPYETYTVYKDGIMYYAFYKDGNRYRPSGNPDIELKGLEDGKLYRIVDYVNNQVVATNVTSSNAVFSYPFSDYLLVKAVEISEPDTDGPGPVPDPEGAVTVEENDPELVYTGDWVREENDGYHGGGARYTKEAEASVELAFYGTGAAWYGQHDVNFGSARIYIDGTYVKTVSCMGEPGINIKLFEISGLDLASHRIKIECETPVIDIDRLTYIKGEEVPAKVMTADLRALTVIANQYDMNSFADGNYKDQLGVSLVRANQLLAADDVTQGAVNEEQKYLLNAMLKIRKKVDKSWIGLPGPIPQDIQTENISRDNLAKVISYTGQLDRDEIIPAIKEQLNDSYDKAVSIAERQDASQPEIDRAWAELMNAVQYSSYIRGSKEELLSLLDEYGKVDTTVYKDAALFIESLEAAKKVYQDENAMDGEISDCIKQLRDAKDQLQLKDPVDPPKPDPDPDPKPDPTPDPGPDPKPDPTPDPTPDPKPNPTPTPDPTPEPALKKPEQVSGLKSKAETDYLTVSWKKLNNAESYKVYIYKSGKWRLAGKTTKTSIKIKKLVSGTKYTVKVAAVNKAGQGKYSSQVYTAAKPKKVKLKSVSRYRTSKVKLNYGKVKAGGYEIWMKNGKGSYKKAATSTKTTAIKSGLKKGKTYYFKVRAYVKNKNQVIYGSFSNIKKYKMVL
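
The expression constructs and the untagged sequences listed above appear to differ only in an N-terminal leader. As a minimal sketch, the following Python snippet checks that relationship on the first residues of SEQ ID NO:23 and SEQ ID NO:33. The 19-residue `LEADER` value is inferred here by comparing the listed sequences and is an assumption, not a statement from the specification; the function name `strip_leader` is hypothetical.

```python
# Leader inferred by comparing the tagged constructs with the untagged
# sequences in this listing (an assumption made for illustration only).
LEADER = "MGSSHHHHHHSSGLVPRGS"

HIS_TAG = "HHHHHH"         # hexahistidine tag
THROMBIN_SITE = "LVPRGS"   # thrombin recognition motif


def strip_leader(construct, leader=LEADER):
    """Return the construct without its N-terminal leader.

    Raises ValueError if the construct does not begin with the leader.
    """
    if not construct.startswith(leader):
        raise ValueError("construct does not start with the expected leader")
    return construct[len(leader):]


# First residues only, copied from SEQ ID NO:23 and SEQ ID NO:33.
tagged_start = "MGSSHHHHHHSSGLVPRGSHEETDLLVNGGFETGD"
untagged_start = "HEETDLLVNGGFETGD"

stripped = strip_leader(tagged_start)
print(stripped == untagged_start)
print(HIS_TAG in LEADER, THROMBIN_SITE in LEADER)
```

The same function could be applied to the full-length sequences to confirm that each tagged construct and its untagged counterpart agree over their entire length.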

References

Kuznetsova,I.M.et al.Int J Mol Sci.(2014)“What Macromolecular Crowding Can Do to a Protein”15(12):23090–23140.

Marcus,D.M.et al.Biochem(1964)“Immunochemical Studies on Blood Groups.XXXI.Destruction of Blood Group A Activity by an Enzyme from Clostridium tertium Which Deacetylates N-Acetylgalactosamine in Intact Blood Group Substances”(4)437-443.

Daniels,G.and Reid,M.E.Transfusion(2010)“Blood groups:the past 50 years.”50(2):281-9.doi:10.1111/j.1537-2995.2009.02456.x.Epub 2009 Nov 9.

Vox Sang.2011 Nov;101(4):327-32.doi:10.1111/j.1423-0410.2011.01540.x.Epub 2011 Sep 6.

Garratty,G.Vox Sang.(2008)“Modulating the red cell membrane to produce universal/stealth donor red cells suitable for transfusion.”94(2):87-95.Epub 2007 Nov 22.

Goldstein et al.Science(1982)“Group B erythrocytes enzymatically converted to group O survive normally in A,B,and O individuals.”215(4529):168-70.

US4609627;and CA2272925

Kruskall M.S.et al.Transfusion(2000)“Transfusion to blood group A and O patients of group B RBCs that have been enzymatically converted to group O.”40(11):1290-8.

Clausen,H and Hakomori,S.Vox Sang.(1989)“ABH and related histo-blood group antigens;immunochemical differences in carrier isotypes and their distribution.”56(1):1-20.

EP2243793

Liu,Q.P.et al.J Biol Chem.(2008)“Identification of a GH110 subfamily of alpha 1,3-galactosidases:novel enzymes for removal of the alpha 3Gal xenotransplantation antigen.”283(13):8545-54.doi:10.1074/jbc.M709020200.Epub 2008 Jan 28.

PCT/US1992/010113;and PCT/SE2015/050108

US4088538;US4141857;US4206259;US4218363;US4229536;US4239854;US4619897;US4748121;US4749653;US4897352;US4954444;US4978619;US5154808;US5914367;US5962279;US6030933;US6291582;US6254645;US10,016,490;and US10,041,055

Jeong,J.K.et al.J Bacteriol.(2009)“Characterization of the Streptococcus pneumoniae BgaC protein as a novel surface beta-galactosidase with specific hydrolysis activity for the Galbeta1-3GlcNAc moiety of oligosaccharides.”191(9):3011-23.doi:10.1128/JB.01601-08.Epub 2009 Mar 6.

Singh,A.K.et al.PLoS Pathog.(2014)“Unravelling the multiple functions of the architecturally intricate Streptococcus pneumoniae β-galactosidase,BgaA.”10(9):e1004364.doi:10.1371/journal.ppat.1004364.eCollection 2014 Sep.

Katayama,T.et al.J Bacteriol.(2004)“Molecular cloning and characterization of Bifidobacterium bifidum 1,2-alpha-L-fucosidase(AfcA),a novel inverting glycosidase(glycoside hydrolase family 95).”186(15):4885-93.

Williams,S.J.et al.J Biol Chem.(2002)“Aspartate 313 in the Streptomyces plicatus hexosaminidase plays a critical role in substrate-assisted catalysis by orienting the 2-acetamido group and stabilizing the transition state.”277(42):40055-65.Epub 2002 Aug 8.

Bolger,A.M.et al.Bioinformatics.(2014)“Trimmomatic:a flexible trimmer for Illumina sequence data.”30(15):2114-20.doi:10.1093/bioinformatics/btu170.Epub 2014 Apr 1.

Li 2013

Treangen,T.J.et al.Curr Protoc Bioinformatics(2011)“Next generation sequence assembly with AMOS.”Chapter 11:Unit 11.8.doi:10.1002/0471250953.bi1108s33.

Hyatt,D.et al.BMC Bioinformatics.(2010)“Prodigal:prokaryotic gene recognition and translation initiation site identification.”11:119.doi:10.1186/1471-2105-11-119.

Konwar,K.M.et al.Bioinformatics.(2015)“MetaPathways v2.5:quantitative functional,taxonomic and usability improvements.”31(20):3345-7.doi:10.1093/bioinformatics/btv361.Epub 2015 Jun 15.

Studier,F.W.Protein Expr Purif.(2005)“Protein production by auto-induction in high density shaking cultures.”41(1):207-34.

Palmier M.O.and Van Doren S.R.Anal Biochem.(2007)“Rapid determination of enzyme kinetics from fluorescence:overcoming the inner filter effect.”371(1):43-51.Epub 2007 Jul 18.

Kabsch,W.Acta Crystallogr D Biol Crystallogr.(2010)“XDS”66(Pt 2):125-32.doi:10.1107/S0907444909047337.Epub 2010 Jan 22.

Evans,P.R.and Murshudov,G.N.Acta Crystallogr D Biol Crystallogr.(2013)“How good are my data and what is the resolution?”69(Pt 7):1204-14.doi:10.1107/S0907444913000061.Epub 2013 Jun 13.

Skubák,P.and Pannu,N.S.Nat Commun.(2013)“Automatic protein structure solution from weak X-ray data.”4:2777.doi:10.1038/ncomms3777.

Potterton,L.et al.Acta Crystallogr D Struct Biol.(2018)“CCP4i2:the new graphical user interface to the CCP4 program suite.”74(Pt 2):68-84.doi:10.1107/S2059798317016035.Epub 2018 Feb 1.

Emsley,P.and Cowtan,K.Acta Crystallogr D Biol Crystallogr.(2004)“Coot:model-building tools for molecular graphics.”60(Pt 12 Pt 1):2126-32.Epub 2004 Nov 26.

Vagin,A.A.et al.Acta Crystallogr D Biol Crystallogr.(2004)“REFMAC5 dictionary:organization of prior chemical knowledge and guidelines for its use.”60(Pt 12 Pt 1):2184-95.Epub 2004 Nov 26.

Chen,V.B.et al.Acta Crystallogr D Biol Crystallogr.(2010)“MolProbity:all-atom structure validation for macromolecular crystallography.”66(Pt 1):12-21.doi:10.1107/S0907444909042073.Epub 2009 Dec 21.

Zhang 2004

Vocadlo,D.J.et al.Biochemistry.(2002)“A case for reverse protonation:identification of Glu160 as an acid/base catalyst in Thermoanaerobacterium saccharolyticum beta-xylosidase and detailed kinetic analysis of a site-directed mutant.”41(31):9736-46.

Jones,D.R.et al.Biotechnol Biofuels.(2018)“SACCHARIS:an automated pipeline to streamline discovery of carbohydrate active enzyme activities within polyspecific families and de novo sequence datasets.”11:27.doi:10.1186/s13068-018-1027-x.eCollection 2018.

Yin,Y.et al.Nucleic Acids Res.(2012)“dbCAN:a web resource for automated carbohydrate-active enzyme annotation.”40(Web Server issue):W445-51.doi:10.1093/nar/gks479.Epub 2012 May 29.

Edgar,R.C.Bioinformatics.(2010)“Search and clustering orders of magnitude faster than BLAST.”26(19):2460-1.doi:10.1093/bioinformatics/btq461.Epub 2010 Aug 12.

Stamatakis,A.Bioinformatics.(2006)“RAxML-VI-HPC:maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.”22:2688–2690.doi:10.1093/bioinformatics/btl446.

Stamatakis,A.and Ott,M.Philos Trans R Soc Lond B Biol Sci.(2008)“Efficient computation of the phylogenetic likelihood function on multi-gene alignments and multi-core architectures.”363(1512):3977-84.doi:10.1098/rstb.2008.0163.

Eddy,S.R.Bioinformatics.(1998)“Profile hidden Markov models.”14(9):755-63.Review.

Capella-Gutiérrez,S.et al.Bioinformatics.(2009)“trimAl:a tool for automated alignment trimming in large-scale phylogenetic analyses.”25(15):1972-3.doi:10.1093/bioinformatics/btp348.Epub 2009 Jun 8.

Matsen,F.A.et al.PLoS One.(2012)“A format for phylogenetic placements.”7(2):e31009.doi:10.1371/journal.pone.0031009.Epub 2012 Feb 22.

Letunic,I.and Bork,P.Nucleic Acids Res.(2016)“Interactive tree of life(iTOL)v3:an online tool for the display and annotation of phylogenetic and other trees.”44(W1):W242-5.doi:10.1093/nar/gkw290.Epub 2016 Apr 19.

Engler,C.et al.PLoS One.(2008)“A one pot,one step,precision cloning method with high throughput capability.”3(11):e3647.doi:10.1371/journal.pone.0003647.Epub 2008 Nov 5.

Kwan,D.H.et al.J Am Chem Soc.(2015)“Toward Efficient Enzymes for the Generation of Universal Blood through Structure-Guided Directed Evolution.”137(17):5695-705.doi:10.1021/ja5116088.Epub 2015 Apr 24.

Li,D.et al.Bioinformatics.(2015)“MEGAHIT:an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.”31(10):1674-6.doi:10.1093/bioinformatics/btv033.Epub 2015 Jan 20.

Yip,V.L.and Withers,S.G.J.Amer.Chem.Soc.(2006)“Mechanistic analysis of the unusual redox-elimination sequence employed by Thermotoga maritima BglT:a 6-phospho-beta-glucosidase from glycoside hydrolase family 4.”126,8354-8355

Chapanian,R.et al.Nat Commun.(2014)“Enhancement of biological reactions on cell surfaces via macromolecular crowding.”5:4683.doi:10.1038/ncomms5683.

Varrot,A.et al.J Mol Biol.(2005)“NAD+ and metal-ion dependent hydrolysis by family 4 glycosidases:structural insight into specificity for phospho-beta-D-glucosides.”346(2):423-35.Epub 2005 Jan 7.

Liu,Q.P.et al.Nat Biotechnol.(2007)“Bacterial glycosidases for the production of universal red blood cells.”25(4):454-64.Epub 2007 Apr 1.

Comfort,D.A.et al.Biochemistry(2007)“Biochemical analysis of Thermotoga maritima GH36 alpha-galactosidase(TmGalA)confirms the mechanistic commonality of clan GH-D glycoside hydrolases.”46(11):3319-30.Epub 2007 Feb 27.

Calcutt,M.J.et al.FEMS Microbiol Lett.(2002)“Identification,molecular cloning and expression of an alpha-N-acetylgalactosaminidase gene from Clostridium perfringens.”214(1):77-80.

Gerbal,A.,Maslet,C.and Salmon,C.Vox Sang.(1975)“Immunological aspects of the acquired B antigen.”28(5):398-403.

Judd,W.J.and Annesley,T.M.Transfusion Medicine Reviews(1996)“The acquired-B phenomenon.”10,111-117.

Kelley,L.A.et al.Nat Protoc.(2015)“The Phyre2 web portal for protein modeling,prediction and analysis.”10(6):845-58.doi:10.1038/nprot.2015.053.Epub 2015 May 7.

Ficko-Blean,E.and Boraston,AB.J Biol Chem.(2006)“The interaction of a carbohydrate-binding module from a Clostridium perfringens N-acetyl-beta-hexosaminidase with its carbohydrate receptor.”281(49):37748-57.Epub 2006 Sep 21.

Cohen,M.et al.Blood.(2009)“ABO blood group glycans modulate sialic acid recognition on erythrocytes.”114(17):3668-76.doi:10.1182/blood-2009-06-227041.Epub 2009 Aug 24.

Fredslund,F.et al.J Mol Biol.(2011)“Crystal structure of α-galactosidase from Lactobacillus acidophilus NCFM:insight into tetramer formation and substrate binding.”412(3):466-80.doi:10.1016/j.jmb.2011.07.057.Epub 2011 Jul 30.

Guibert,E.E.et al.Transfus Med Hemother.(2011)“Organ Preservation:Current Concepts and New Strategies for the Next Decade”38(2):125–142.

Sequence listing

<110> THE UNIVERSITY OF BRITISH COLUMBIA

CYPEL, Marcelo

KESHAVJEE, Shafique

WANG, Aizhou

<120> Enzymatic compositions for carbohydrate antigen cleavage on donor organs, methods and uses related thereto

<130> P1600PC02

<140> NOT YET ASSIGNED

<141> 2019-08-16

<150> US 62/719,272

<151> 2018-08-17

<160> 108

<170> PatentIn version 3.5

<210> 1

<211> 2319

<212> DNA

<213> Flavonifractor plautii

<400> 1

atgagaaatc gaaggaaagc tgtttcgctt ctaacgggcc tactcgtgac ggcccagtta 60

tttccaaccg cggcgcttgc ggcagactcc agcgagtccg cattgaacaa ggcccccgga 120

tatcaggatt ttcccgccta ttacagcgac agtgcgcatg ccgatgacca ggtgactcac 180

ccggacgtag ttgtcctgga agaaccgtgg aacggctatc gctattgggc cgtttatacg 240

cccaacgtga tgcggatctc catctacgaa aacccgtcca tcgttgcctc cagcgacgga 300

gtgcattggg tagaaccgga ggggctttcc aatcccattg agccgcagcc gcccagcacc 360

cgctaccaca actgcgacgc tgatatggtc tataacgcgg aatacgatgc catgatggcc 420

tattggaact gggcggatga ccagggcgga ggcgttgggg ccgaagtccg gctgcggatt 480

tcctatgacg gcgtacattg gggcgtcccc gtgacttatg atgagatgac ccgcgtatgg 540

tcgaagccca cctccgacgc ggagcgtcag gttgcggatg gagaggatga cttcattacc 600

gccattgctt ctccagaccg ctacgatatg ctctctccca ctattgtcta cgatgacttc 660

cgggatgtgt tcatcctgtg ggccaacaat accggcgacg tggggtatca gaatggtcag 720

gcgaacttcg tggaaatgcg ttattcggac gacgggatca cctggggtga gccagtccgc 780

gtcaacggct tcctggggct tgacgagaat gggcagcagt tggccccctg gcatcaggat 840

gtccagtatg ttccagattt gaaggagttt gtttgtattt cccagtgctt tgccggccga 900

aatccggatg gctctgtcct gcacctgacc acatcaaagg atggagtcaa ctgggagcag 960

gtgggcacca agcccctgct gtcccccggg ccagacggca gttgggatga tttccagatc 1020

tatcgctcca gtttttacta tgagccaggc agttccgccg gagatggtac catgcgcgtc 1080

tggtacagtg ccctgcagaa ggacaccaat aacaagatgg tcgcggattc ctccgggaat 1140

ctgaccattc aggccaaaag tgaggatgac cgcatctgga ggatcggcta tgcggaaaac 1200

agttttgttg agatgatgcg cgtgctgctg gatgaccccg gctacacgac gcccgccctg 1260

gtttccggca attcccttat gctgagtgct gagaccactt cccttcccac aggggatgtc 1320

atgaagctgg aaaccagttt cgcgcctgtg gacacctctg atcaggtcgt gaaatatacc 1380

tccagtgatc cggatgtggc gacggtggat gagtttggaa ccattacagg cgtttctgtc 1440

ggttcagcgc gcatcatggc ggagacccgg gagggcctgt ccgacgacct tgaaattgca 1500

gtggtggaga atccgtacac gctgattccc cagtccaata tgacggcaac cgccaccagc 1560

gtctacggcg ggacgacgga gggccccgcc tccaatgtcc tcgatggaaa cgtccgcaca 1620

atatggcata ccaactatgc tcccaaagat gaactgccgc agagtatcac cgtttccttt 1680

gaccagccct ataccgtcgg ccgcttcgtc tataccccac gtcaaaacgg gacaaatggc 1740

ataatttcgg agtatgagct atacgccatc caccaggacg gcagcaagga cctagtcgcc 1800

tccggctcag actgggcgct cgatgccaag gataaaaccg tgagctttgc accggtagaa 1860

gccgtcggcc tggagctcaa ggcgattgcc ggcgcaggtg ggttcggtac tgccgccgaa 1920

ctcaatgtgt atgcgtatgg tccaatcgag cctgcgcccg tatatgtccc ggtggatgac 1980

cgggatgctt ccctggtgtt tacgggtgca tggaatagcg acagcaacgg aagcttttat 2040

gaagggacgg cccgttatac caacgagatc ggcgcgtccg tggagttcac atttgtgggg 2100

acggccattc ggtggtatgg tcaaaatgat gtaaatttcg gcgctgcgga ggtatacgtg 2160

gacggcgttc tggcagggga ggtaaatgtg tatgggccgg cggcggctca gcagcttcta 2220

tttgaggcgg acggtctggc ctatgggaag cataccatcc gcatcgtctg tgtgtctccg 2280

gtggttgact tcgactattt ttcgtatgtg ggagaataa 2319

<210> 2

<211> 772

<212> PRT

<213> Flavonifractor plautii

<400> 2

Met Arg Asn Arg Arg Lys Ala Val Ser Leu Leu Thr Gly Leu Leu Val

1 5 10 15

Thr Ala Gln Leu Phe Pro Thr Ala Ala Leu Ala Ala Asp Ser Ser Glu

20 25 30

Ser Ala Leu Asn Lys Ala Pro Gly Tyr Gln Asp Phe Pro Ala Tyr Tyr

35 40 45

Ser Asp Ser Ala His Ala Asp Asp Gln Val Thr His Pro Asp Val Val

50 55 60

Val Leu Glu Glu Pro Trp Asn Gly Tyr Arg Tyr Trp Ala Val Tyr Thr

65 70 75 80

Pro Asn Val Met Arg Ile Ser Ile Tyr Glu Asn Pro Ser Ile Val Ala

85 90 95

Ser Ser Asp Gly Val His Trp Val Glu Pro Glu Gly Leu Ser Asn Pro

100 105 110

Ile Glu Pro Gln Pro Pro Ser Thr Arg Tyr His Asn Cys Asp Ala Asp

115 120 125

Met Val Tyr Asn Ala Glu Tyr Asp Ala Met Met Ala Tyr Trp Asn Trp

130 135 140

Ala Asp Asp Gln Gly Gly Gly Val Gly Ala Glu Val Arg Leu Arg Ile

145 150 155 160

Ser Tyr Asp Gly Val His Trp Gly Val Pro Val Thr Tyr Asp Glu Met

165 170 175

Thr Arg Val Trp Ser Lys Pro Thr Ser Asp Ala Glu Arg Gln Val Ala

180 185 190

Asp Gly Glu Asp Asp Phe Ile Thr Ala Ile Ala Ser Pro Asp Arg Tyr

195 200 205

Asp Met Leu Ser Pro Thr Ile Val Tyr Asp Asp Phe Arg Asp Val Phe

210 215 220

Ile Leu Trp Ala Asn Asn Thr Gly Asp Val Gly Tyr Gln Asn Gly Gln

225 230 235 240

Ala Asn Phe Val Glu Met Arg Tyr Ser Asp Asp Gly Ile Thr Trp Gly

245 250 255

Glu Pro Val Arg Val Asn Gly Phe Leu Gly Leu Asp Glu Asn Gly Gln

260 265 270

Gln Leu Ala Pro Trp His Gln Asp Val Gln Tyr Val Pro Asp Leu Lys

275 280 285

Glu Phe Val Cys Ile Ser Gln Cys Phe Ala Gly Arg Asn Pro Asp Gly

290 295 300

Ser Val Leu His Leu Thr Thr Ser Lys Asp Gly Val Asn Trp Glu Gln

305 310 315 320

Val Gly Thr Lys Pro Leu Leu Ser Pro Gly Pro Asp Gly Ser Trp Asp

325 330 335

Asp Phe Gln Ile Tyr Arg Ser Ser Phe Tyr Tyr Glu Pro Gly Ser Ser

340 345 350

Ala Gly Asp Gly Thr Met Arg Val Trp Tyr Ser Ala Leu Gln Lys Asp

355 360 365

Thr Asn Asn Lys Met Val Ala Asp Ser Ser Gly Asn Leu Thr Ile Gln

370 375 380

Ala Lys Ser Glu Asp Asp Arg Ile Trp Arg Ile Gly Tyr Ala Glu Asn

385 390 395 400

Ser Phe Val Glu Met Met Arg Val Leu Leu Asp Asp Pro Gly Tyr Thr

405 410 415

Thr Pro Ala Leu Val Ser Gly Asn Ser Leu Met Leu Ser Ala Glu Thr

420 425 430

Thr Ser Leu Pro Thr Gly Asp Val Met Lys Leu Glu Thr Ser Phe Ala

435 440 445

Pro Val Asp Thr Ser Asp Gln Val Val Lys Tyr Thr Ser Ser Asp Pro

450 455 460

Asp Val Ala Thr Val Asp Glu Phe Gly Thr Ile Thr Gly Val Ser Val

465 470 475 480

Gly Ser Ala Arg Ile Met Ala Glu Thr Arg Glu Gly Leu Ser Asp Asp

485 490 495

Leu Glu Ile Ala Val Val Glu Asn Pro Tyr Thr Leu Ile Pro Gln Ser

500 505 510

Asn Met Thr Ala Thr Ala Thr Ser Val Tyr Gly Gly Thr Thr Glu Gly

515 520 525

Pro Ala Ser Asn Val Leu Asp Gly Asn Val Arg Thr Ile Trp His Thr

530 535 540

Asn Tyr Ala Pro Lys Asp Glu Leu Pro Gln Ser Ile Thr Val Ser Phe

545 550 555 560

Asp Gln Pro Tyr Thr Val Gly Arg Phe Val Tyr Thr Pro Arg Gln Asn

565 570 575

Gly Thr Asn Gly Ile Ile Ser Glu Tyr Glu Leu Tyr Ala Ile His Gln

580 585 590

Asp Gly Ser Lys Asp Leu Val Ala Ser Gly Ser Asp Trp Ala Leu Asp

595 600 605

Ala Lys Asp Lys Thr Val Ser Phe Ala Pro Val Glu Ala Val Gly Leu

610 615 620

Glu Leu Lys Ala Ile Ala Gly Ala Gly Gly Phe Gly Thr Ala Ala Glu

625 630 635 640

Leu Asn Val Tyr Ala Tyr Gly Pro Ile Glu Pro Ala Pro Val Tyr Val

645 650 655

Pro Val Asp Asp Arg Asp Ala Ser Leu Val Phe Thr Gly Ala Trp Asn

660 665 670

Ser Asp Ser Asn Gly Ser Phe Tyr Glu Gly Thr Ala Arg Tyr Thr Asn

675 680 685

Glu Ile Gly Ala Ser Val Glu Phe Thr Phe Val Gly Thr Ala Ile Arg

690 695 700

Trp Tyr Gly Gln Asn Asp Val Asn Phe Gly Ala Ala Glu Val Tyr Val

705 710 715 720

Asp Gly Val Leu Ala Gly Glu Val Asn Val Tyr Gly Pro Ala Ala Ala

725 730 735

Gln Gln Leu Leu Phe Glu Ala Asp Gly Leu Ala Tyr Gly Lys His Thr

740 745 750

Ile Arg Ile Val Cys Val Ser Pro Val Val Asp Phe Asp Tyr Phe Ser

755 760 765

Tyr Val Gly Glu

770
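
The nucleotide entry SEQ ID NO:1 (2319 nt) and the protein entry SEQ ID NO:2 (772 residues) can be cross-checked by translation. The following Python sketch assumes the standard genetic code and reading frame 1; the helper names `clean_listing_line` and `translate` are hypothetical and introduced only for illustration. It translates the first listed line of SEQ ID NO:1 and verifies the length bookkeeping.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def clean_listing_line(line):
    """Keep only the letters of a listing line (drops spaces and the counter)."""
    return "".join(ch for ch in line if ch.isalpha()).upper()


def translate(dna):
    """Translate DNA in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of SEQ ID NO:1 as printed in the listing.
first_line = "atgagaaatc gaaggaaagc tgtttcgctt ctaacgggcc tactcgtgac ggcccagtta 60"
dna = clean_listing_line(first_line)
peptide = translate(dna)
print(peptide)

# 2319 nt correspond to 773 codons; one of them is the stop codon.
expected_residues = 2319 // 3 - 1
print(expected_residues)
```

The translated peptide corresponds to the first 20 residues of SEQ ID NO:2 (Met Arg Asn Arg Arg Lys Ala Val Ser Leu Leu Thr Gly Leu Leu Val Thr Ala Gln Leu), and the residue count agrees with the `<211> 772` field.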

<210> 3

<211> 2238

<212> DNA

<213> Flavonifractor plautii

<400> 3

gcagactcca gcgagtccgc attgaacaag gcccccggat atcaggattt tcccgcctat 60

tacagcgaca gtgcgcatgc cgatgaccag gtgactcacc cggacgtagt tgtcctggaa 120

gaaccgtgga acggctatcg ctattgggcc gtttatacgc ccaacgtgat gcggatctcc 180

atctacgaaa acccgtccat cgttgcctcc agcgacggag tgcattgggt agaaccggag 240

gggctttcca atcccattga gccgcagccg cccagcaccc gctaccacaa ctgcgacgct 300

gatatggtct ataacgcgga atacgatgcc atgatggcct attggaactg ggcggatgac 360

cagggcggag gcgttggggc cgaagtccgg ctgcggattt cctatgacgg cgtacattgg 420

ggcgtccccg tgacttatga tgagatgacc cgcgtatggt cgaagcccac ctccgacgcg 480

gagcgtcagg ttgcggatgg agaggatgac ttcattaccg ccattgcttc tccagaccgc 540

tacgatatgc tctctcccac tattgtctac gatgacttcc gggatgtgtt catcctgtgg 600

gccaacaata ccggcgacgt ggggtatcag aatggtcagg cgaacttcgt ggaaatgcgt 660

tattcggacg acgggatcac ctggggtgag ccagtccgcg tcaacggctt cctggggctt 720

gacgagaatg ggcagcagtt ggccccctgg catcaggatg tccagtatgt tccagatttg 780

aaggagtttg tttgtatttc ccagtgcttt gccggccgaa atccggatgg ctctgtcctg 840

cacctgacca catcaaagga tggagtcaac tgggagcagg tgggcaccaa gcccctgctg 900

tcccccgggc cagacggcag ttgggatgat ttccagatct atcgctccag tttttactat 960

gagccaggca gttccgccgg agatggtacc atgcgcgtct ggtacagtgc cctgcagaag 1020

gacaccaata acaagatggt cgcggattcc tccgggaatc tgaccattca ggccaaaagt 1080

gaggatgacc gcatctggag gatcggctat gcggaaaaca gttttgttga gatgatgcgc 1140

gtgctgctgg atgaccccgg ctacacgacg cccgccctgg tttccggcaa ttcccttatg 1200

ctgagtgctg agaccacttc ccttcccaca ggggatgtca tgaagctgga aaccagtttc 1260

gcgcctgtgg acacctctga tcaggtcgtg aaatatacct ccagtgatcc ggatgtggcg 1320

acggtggatg agtttggaac cattacaggc gtttctgtcg gttcagcgcg catcatggcg 1380

gagacccggg agggcctgtc cgacgacctt gaaattgcag tggtggagaa tccgtacacg 1440

ctgattcccc agtccaatat gacggcaacc gccaccagcg tctacggcgg gacgacggag 1500

ggccccgcct ccaatgtcct cgatggaaac gtccgcacaa tatggcatac caactatgct 1560

cccaaagatg aactgccgca gagtatcacc gtttcctttg accagcccta taccgtcggc 1620

cgcttcgtct ataccccacg tcaaaacggg acaaatggca taatttcgga gtatgagcta 1680

tacgccatcc accaggacgg cagcaaggac ctagtcgcct ccggctcaga ctgggcgctc 1740

gatgccaagg ataaaaccgt gagctttgca ccggtagaag ccgtcggcct ggagctcaag 1800

gcgattgccg gcgcaggtgg gttcggtact gccgccgaac tcaatgtgta tgcgtatggt 1860

ccaatcgagc ctgcgcccgt atatgtcccg gtggatgacc gggatgcttc cctggtgttt 1920

acgggtgcat ggaatagcga cagcaacgga agcttttatg aagggacggc ccgttatacc 1980

aacgagatcg gcgcgtccgt ggagttcaca tttgtgggga cggccattcg gtggtatggt 2040

caaaatgatg taaatttcgg cgctgcggag gtatacgtgg acggcgttct ggcaggggag 2100

gtaaatgtgt atgggccggc ggcggctcag cagcttctat ttgaggcgga cggtctggcc 2160

tatgggaagc ataccatccg catcgtctgt gtgtctccgg tggttgactt cgactatttt 2220

tcgtatgtgg gagaataa 2238

<210> 4

<211> 745

<212> PRT

<213> Flavonifractor plautii

<400> 4

Ala Asp Ser Ser Glu Ser Ala Leu Asn Lys Ala Pro Gly Tyr Gln Asp

1 5 10 15

Phe Pro Ala Tyr Tyr Ser Asp Ser Ala His Ala Asp Asp Gln Val Thr

20 25 30

His Pro Asp Val Val Val Leu Glu Glu Pro Trp Asn Gly Tyr Arg Tyr

35 40 45

Trp Ala Val Tyr Thr Pro Asn Val Met Arg Ile Ser Ile Tyr Glu Asn

50 55 60

Pro Ser Ile Val Ala Ser Ser Asp Gly Val His Trp Val Glu Pro Glu

65 70 75 80

Gly Leu Ser Asn Pro Ile Glu Pro Gln Pro Pro Ser Thr Arg Tyr His

85 90 95

Asn Cys Asp Ala Asp Met Val Tyr Asn Ala Glu Tyr Asp Ala Met Met

100 105 110

Ala Tyr Trp Asn Trp Ala Asp Asp Gln Gly Gly Gly Val Gly Ala Glu

115 120 125

Val Arg Leu Arg Ile Ser Tyr Asp Gly Val His Trp Gly Val Pro Val

130 135 140

Thr Tyr Asp Glu Met Thr Arg Val Trp Ser Lys Pro Thr Ser Asp Ala

145 150 155 160

Glu Arg Gln Val Ala Asp Gly Glu Asp Asp Phe Ile Thr Ala Ile Ala

165 170 175

Ser Pro Asp Arg Tyr Asp Met Leu Ser Pro Thr Ile Val Tyr Asp Asp

180 185 190

Phe Arg Asp Val Phe Ile Leu Trp Ala Asn Asn Thr Gly Asp Val Gly

195 200 205

Tyr Gln Asn Gly Gln Ala Asn Phe Val Glu Met Arg Tyr Ser Asp Asp

210 215 220

Gly Ile Thr Trp Gly Glu Pro Val Arg Val Asn Gly Phe Leu Gly Leu

225 230 235 240

Asp Glu Asn Gly Gln Gln Leu Ala Pro Trp His Gln Asp Val Gln Tyr

245 250 255

Val Pro Asp Leu Lys Glu Phe Val Cys Ile Ser Gln Cys Phe Ala Gly

260 265 270

Arg Asn Pro Asp Gly Ser Val Leu His Leu Thr Thr Ser Lys Asp Gly

275 280 285

Val Asn Trp Glu Gln Val Gly Thr Lys Pro Leu Leu Ser Pro Gly Pro

290 295 300

Asp Gly Ser Trp Asp Asp Phe Gln Ile Tyr Arg Ser Ser Phe Tyr Tyr

305 310 315 320

Glu Pro Gly Ser Ser Ala Gly Asp Gly Thr Met Arg Val Trp Tyr Ser

325 330 335

Ala Leu Gln Lys Asp Thr Asn Asn Lys Met Val Ala Asp Ser Ser Gly

340 345 350

Asn Leu Thr Ile Gln Ala Lys Ser Glu Asp Asp Arg Ile Trp Arg Ile

355 360 365

Gly Tyr Ala Glu Asn Ser Phe Val Glu Met Met Arg Val Leu Leu Asp

370 375 380

Asp Pro Gly Tyr Thr Thr Pro Ala Leu Val Ser Gly Asn Ser Leu Met

385 390 395 400

Leu Ser Ala Glu Thr Thr Ser Leu Pro Thr Gly Asp Val Met Lys Leu

405 410 415

Glu Thr Ser Phe Ala Pro Val Asp Thr Ser Asp Gln Val Val Lys Tyr

420 425 430

Thr Ser Ser Asp Pro Asp Val Ala Thr Val Asp Glu Phe Gly Thr Ile

435 440 445

Thr Gly Val Ser Val Gly Ser Ala Arg Ile Met Ala Glu Thr Arg Glu

450 455 460

Gly Leu Ser Asp Asp Leu Glu Ile Ala Val Val Glu Asn Pro Tyr Thr

465 470 475 480

Leu Ile Pro Gln Ser Asn Met Thr Ala Thr Ala Thr Ser Val Tyr Gly

485 490 495

Gly Thr Thr Glu Gly Pro Ala Ser Asn Val Leu Asp Gly Asn Val Arg

500 505 510

Thr Ile Trp His Thr Asn Tyr Ala Pro Lys Asp Glu Leu Pro Gln Ser

515 520 525

Ile Thr Val Ser Phe Asp Gln Pro Tyr Thr Val Gly Arg Phe Val Tyr

530 535 540

Thr Pro Arg Gln Asn Gly Thr Asn Gly Ile Ile Ser Glu Tyr Glu Leu

545 550 555 560

Tyr Ala Ile His Gln Asp Gly Ser Lys Asp Leu Val Ala Ser Gly Ser

565 570 575

Asp Trp Ala Leu Asp Ala Lys Asp Lys Thr Val Ser Phe Ala Pro Val

580 585 590

Glu Ala Val Gly Leu Glu Leu Lys Ala Ile Ala Gly Ala Gly Gly Phe

595 600 605

Gly Thr Ala Ala Glu Leu Asn Val Tyr Ala Tyr Gly Pro Ile Glu Pro

610 615 620

Ala Pro Val Tyr Val Pro Val Asp Asp Arg Asp Ala Ser Leu Val Phe

625 630 635 640

Thr Gly Ala Trp Asn Ser Asp Ser Asn Gly Ser Phe Tyr Glu Gly Thr

645 650 655

Ala Arg Tyr Thr Asn Glu Ile Gly Ala Ser Val Glu Phe Thr Phe Val

660 665 670

Gly Thr Ala Ile Arg Trp Tyr Gly Gln Asn Asp Val Asn Phe Gly Ala

675 680 685

Ala Glu Val Tyr Val Asp Gly Val Leu Ala Gly Glu Val Asn Val Tyr

690 695 700

Gly Pro Ala Ala Ala Gln Gln Leu Leu Phe Glu Ala Asp Gly Leu Ala

705 710 715 720

Tyr Gly Lys His Thr Ile Arg Ile Val Cys Val Ser Pro Val Val Asp

725 730 735

Phe Asp Tyr Phe Ser Tyr Val Gly Glu

740 745

<210> 5

<211> 760

<212> PRT

<213> Flavonifractor plautii

<400> 5

Met Gly His His His His His His His His His His Ser Ser Gly Ala

1 5 10 15

Asp Ser Ser Glu Ser Ala Leu Asn Lys Ala Pro Gly Tyr Gln Asp Phe

20 25 30

Pro Ala Tyr Tyr Ser Asp Ser Ala His Ala Asp Asp Gln Val Thr His

35 40 45

Pro Asp Val Val Val Leu Glu Glu Pro Trp Asn Gly Tyr Arg Tyr Trp

50 55 60

Ala Val Tyr Thr Pro Asn Val Met Arg Ile Ser Ile Tyr Glu Asn Pro

65 70 75 80

Ser Ile Val Ala Ser Ser Asp Gly Val His Trp Val Glu Pro Glu Gly

85 90 95

Leu Ser Asn Pro Ile Glu Pro Gln Pro Pro Ser Thr Arg Tyr His Asn

100 105 110

Cys Asp Ala Asp Met Val Tyr Asn Ala Glu Tyr Asp Ala Met Met Ala

115 120 125

Tyr Trp Asn Trp Ala Asp Asp Gln Gly Gly Gly Val Gly Ala Glu Val

130 135 140

Arg Leu Arg Ile Ser Tyr Asp Gly Val His Trp Gly Val Pro Val Thr

145 150 155 160

Tyr Asp Glu Met Thr Arg Val Trp Ser Lys Pro Thr Ser Asp Ala Glu

165 170 175

Arg Gln Val Ala Asp Gly Glu Asp Asp Phe Ile Thr Ala Ile Ala Ser

180 185 190

Pro Asp Arg Tyr Asp Met Leu Ser Pro Thr Ile Val Tyr Asp Asp Phe

195 200 205

Arg Asp Val Phe Ile Leu Trp Ala Asn Asn Thr Gly Asp Val Gly Tyr

210 215 220

Gln Asn Gly Gln Ala Asn Phe Val Glu Met Arg Tyr Ser Asp Asp Gly

225 230 235 240

Ile Thr Trp Gly Glu Pro Val Arg Val Asn Gly Phe Leu Gly Leu Asp

245 250 255

Glu Asn Gly Gln Gln Leu Ala Pro Trp His Gln Asp Val Gln Tyr Val

260 265 270

Pro Asp Leu Lys Glu Phe Val Cys Ile Ser Gln Cys Phe Ala Gly Arg

275 280 285

Asn Pro Asp Gly Ser Val Leu His Leu Thr Thr Ser Lys Asp Gly Val

290 295 300

Asn Trp Glu Gln Val Gly Thr Lys Pro Leu Leu Ser Pro Gly Pro Asp

305 310 315 320

Gly Ser Trp Asp Asp Phe Gln Ile Tyr Arg Ser Ser Phe Tyr Tyr Glu

325 330 335

Pro Gly Ser Ser Ala Gly Asp Gly Thr Met Arg Val Trp Tyr Ser Ala

340 345 350

Leu Gln Lys Asp Thr Asn Asn Lys Met Val Ala Asp Ser Ser Gly Asn

355 360 365

Leu Thr Ile Gln Ala Lys Ser Glu Asp Asp Arg Ile Trp Arg Ile Gly

370 375 380

Tyr Ala Glu Asn Ser Phe Val Glu Met Met Arg Val Leu Leu Asp Asp

385 390 395 400

Pro Gly Tyr Thr Thr Pro Ala Leu Val Ser Gly Asn Ser Leu Met Leu

405 410 415

Ser Ala Glu Thr Thr Ser Leu Pro Thr Gly Asp Val Met Lys Leu Glu

420 425 430

Thr Ser Phe Ala Pro Val Asp Thr Ser Asp Gln Val Val Lys Tyr Thr

435 440 445

Ser Ser Asp Pro Asp Val Ala Thr Val Asp Glu Phe Gly Thr Ile Thr

450 455 460

Gly Val Ser Val Gly Ser Ala Arg Ile Met Ala Glu Thr Arg Glu Gly

465 470 475 480

Leu Ser Asp Asp Leu Glu Ile Ala Val Val Glu Asn Pro Tyr Thr Leu

485 490 495

Ile Pro Gln Ser Asn Met Thr Ala Thr Ala Thr Ser Val Tyr Gly Gly

500 505 510

Thr Thr Glu Gly Pro Ala Ser Asn Val Leu Asp Gly Asn Val Arg Thr

515 520 525

Ile Trp His Thr Asn Tyr Ala Pro Lys Asp Glu Leu Pro Gln Ser Ile

530 535 540

Thr Val Ser Phe Asp Gln Pro Tyr Thr Val Gly Arg Phe Val Tyr Thr

545 550 555 560

Pro Arg Gln Asn Gly Thr Asn Gly Ile Ile Ser Glu Tyr Glu Leu Tyr

565 570 575

Ala Ile His Gln Asp Gly Ser Lys Asp Leu Val Ala Ser Gly Ser Asp

580 585 590

Trp Ala Leu Asp Ala Lys Asp Lys Thr Val Ser Phe Ala Pro Val Glu

595 600 605

Ala Val Gly Leu Glu Leu Lys Ala Ile Ala Gly Ala Gly Gly Phe Gly

610 615 620

Thr Ala Ala Glu Leu Asn Val Tyr Ala Tyr Gly Pro Ile Glu Pro Ala

625 630 635 640

Pro Val Tyr Val Pro Val Asp Asp Arg Asp Ala Ser Leu Val Phe Thr

645 650 655

Gly Ala Trp Asn Ser Asp Ser Asn Gly Ser Phe Tyr Glu Gly Thr Ala

660 665 670

Arg Tyr Thr Asn Glu Ile Gly Ala Ser Val Glu Phe Thr Phe Val Gly

675 680 685

Thr Ala Ile Arg Trp Tyr Gly Gln Asn Asp Val Asn Phe Gly Ala Ala

690 695 700

Glu Val Tyr Val Asp Gly Val Leu Ala Gly Glu Val Asn Val Tyr Gly

705 710 715 720

Pro Ala Ala Ala Gln Gln Leu Leu Phe Glu Ala Asp Gly Leu Ala Tyr

725 730 735

Gly Lys His Thr Ile Arg Ile Val Cys Val Ser Pro Val Val Asp Phe

740 745 750

Asp Tyr Phe Ser Tyr Val Gly Glu

755 760

<210> 6

<211> 3159

<212> DNA

<213> Flavonifractor plautii

<400> 6

gcggcgcctg caacggacac cggcaacgca ggactgattg cagaaggtga ttatgccatt 60

gccggcaatg gcgtccgcgt cacttatgac gcggacgggc agacaatcac tctgtaccgc 120

acagagggat ctgggcttat ccagatgagc aagccttctc cattgggagg gccagtgatt 180

ggagggcagg aggttcagga cttcagccat atttcatgtg atgtggagca gagcaccagc 240

ggagtgatgg gcagcggtca gagaatgacc attacctctc agagcatgag cacgggccta 300

attcgtacct atgtgctgga gacctctgat atcgaggagg gtgtggtata tactgcaaca 360

tcctatgagg caggagcttc tgatgtggaa gtgtcttggt tcattggcag tgtgtatgag 420

ctttatggtg cggaagatcg tatctggagt tataacggcg gcggtgaggg gccgatgcac 480

tactatgata cgcttcaaaa gattgacctg accgactctg gcaagttcag tagggagaat 540

aaacaggatg acacggctgc aagtattcct gtgtcagata tttacattgc tgatggaggg 600

attaccgttg gcgatgcttc tgcaaccaga agggaggtac atactccggt tcaggaaacc 660

agtgattcag ctcaagtttc tatcgggtgg ccaggcaaag tcattgccgc cggaagcgtg 720

atcgaaattg gtgagagctt tgctgtagtc catccgggtg actattataa cggcttgaga 780

ggttacaaaa atgcaatgga tcacttgggc gtgattatgc ctgcacctgg ggatattcct 840

gatagcagct atgatctccg atgggaaagc tggggctggg ggtttaactg gacgatcgat 900

ttaataatcg gcaaattgga tgaacttcag gcagccggag tcaagcagat cactttggat 960

gatggttggt ataccaatgc aggagactgg gccttaaatc cagaaaagtt tccaaatgga 1020

gcctccgatg cgttgcggct gacagatgca attcatgagc atggtatgac tgcactcctt 1080

tggtggagac cttgtgacgg cgggatcgat agtatactct atcagcaaca ccctgaatat 1140

ttcgttatgg atgcagatgg aagacctgca aggcttccta ctcctggtgg tgggaccaat 1200

cccagcttgg gatatgcact ttgccctatg gcggatggtg cgattgcaag ccaagttgac 1260

tttgtaaacc gtgcaatgaa tgattggggg ttcgatggct tcaagggaga ttatgtgtgg 1320

agtatgcctg aatgctacaa tcctgcacat aaccacgcct cgccagaaga atccactgaa 1380

aagcaatccg agatataccg cgtctcttat gaggctatgg tggccaacga ccccaatgtg 1440

ttcaatttgt tgtgcaactg cggtacgccc caggactact atagtttacc atatatgaca 1500

cagattgcta cggctgaccc cacttctgtg gatcaaacaa ggagacgcgt gaaagcctac 1560

aaggcactga tgggagatta tttccctgtt acagccgacc acaataacat ctggtatcca 1620

agtgccgtcg gtacgggctc tgttctcatt gaaaaacgtg accttagcgg tactgccaag 1680

gaagaatatg aaaaatggct tgggattgcg gatacagttc agttgcagaa aggccggttt 1740

attggcgatc tttacagtta tggttttgac ccttacgaaa cctatgtggt ggagaaagac 1800

ggggttatgt actatgcctt ctacaaagat gggagcaaat atagccccac tggctatcca 1860

gatattgagt tgaaggggct agatccaaat aaaatgtata ggattgttga ctatgtcaat 1920

gatcgtgtcg tggcaacaaa cctgatgggt gataacgctg tattcaatac acgtttttcc 1980

gactatctac tggttaaagc ggtggaaatt tcggaaccgg atccagaacc tgttgaccct 2040

gattatggtt tcacctctgt tgatgacaga gacgaggctc ttatttacac agggacatgg 2100

catgatgaca ataacgcatc tttcagcgaa gggactgcac gttataccaa cagtacggat 2160

gcttcggttg tattctcctt tactggaact tccattcgct ggtatggcca gagggatacc 2220

aattttggca cggcagaagt ttatttggac gatgaactga aaacaacagt tgatgcgaat 2280

ggggccgcag aagcaggcgt atgtcttttt gaggcgcttg atcttccggc tgccgagcat 2340

accattaaaa ttgtgtgcaa gagcggagtg attgatattg accgctttgc atatgaagct 2400

gctacccttg aacccatcta tgaaaaggtc gatgcgctct cggatcggat cacttatgtt 2460

gggaattggg aagagtatca caacagcgag ttctacatgg gaaacgcaat gcgcacagac 2520

gaagccggcg cttatgctga actgactttc cgtggtacag ccgtacgcct gtatgcagag 2580

atgagcttca attttggcac tgcagatgtc tatttagacg gagagttagt ggaaaacata 2640

atcctatacg gccaggaagc aactgggcag ctaatgtttg agcgtacggg actggaggaa 2700

ggagaacata ccattcgcct tgtacaaaac gcctggaaca tcaatttgga ctatatttct 2760

tatctaccag agcaagatca accaacgccg ccggagacga cggttactgt tgatgcaatg 2820

gacgcccaac tggtgtatac aggcgtatgg aatgatgact atcatgacgt ctttcaggaa 2880

ggaaccgccc gttatgccag tagtgccggc gcctcggtcg agttcgaatt tactggaagc 2940

gaaatccgtt ggtatggaca aaatgattcc aacttcggtg ttgccagcgt ttatatcgat 3000

aatgagtttg tgcagcaggt aaatgttaac ggagctgcgg ctgtgggaaa gcttttgttt 3060

caaaaggctg atctaccagc cggttcgcac acgatccgca ttgtgtgcga tactccggtt 3120

attgatttgg actatttgac ttataccact aacgcataa 3159

<210> 7

<211> 1078

<212> PRT

<213> Flavonifractor plautii

<400> 7

Met Arg Gly Lys Lys Phe Ile Ser Leu Thr Leu Ser Thr Met Leu Cys

1 5 10 15

Leu Gln Leu Leu Pro Thr Ala Ser Phe Ala Ala Ala Pro Ala Thr Asp

20 25 30

Thr Gly Asn Ala Gly Leu Ile Ala Glu Gly Asp Tyr Ala Ile Ala Gly

35 40 45

Asn Gly Val Arg Val Thr Tyr Asp Ala Asp Gly Gln Thr Ile Thr Leu

50 55 60

Tyr Arg Thr Glu Gly Ser Gly Leu Ile Gln Met Ser Lys Pro Ser Pro

65 70 75 80

Leu Gly Gly Pro Val Ile Gly Gly Gln Glu Val Gln Asp Phe Ser His

85 90 95

Ile Ser Cys Asp Val Glu Gln Ser Thr Ser Gly Val Met Gly Ser Gly

100 105 110

Gln Arg Met Thr Ile Thr Ser Gln Ser Met Ser Thr Gly Leu Ile Arg

115 120 125

Thr Tyr Val Leu Glu Thr Ser Asp Ile Glu Glu Gly Val Val Tyr Thr

130 135 140

Ala Thr Ser Tyr Glu Ala Gly Ala Ser Asp Val Glu Val Ser Trp Phe

145 150 155 160

Ile Gly Ser Val Tyr Glu Leu Tyr Gly Ala Glu Asp Arg Ile Trp Ser

165 170 175

Tyr Asn Gly Gly Gly Glu Gly Pro Met His Tyr Tyr Asp Thr Leu Gln

180 185 190

Lys Ile Asp Leu Thr Asp Ser Gly Lys Phe Ser Arg Glu Asn Lys Gln

195 200 205

Asp Asp Thr Ala Ala Ser Ile Pro Val Ser Asp Ile Tyr Ile Ala Asp

210 215 220

Gly Gly Ile Thr Val Gly Asp Ala Ser Ala Thr Arg Arg Glu Val His

225 230 235 240

Thr Pro Val Gln Glu Thr Ser Asp Ser Ala Gln Val Ser Ile Gly Trp

245 250 255

Pro Gly Lys Val Ile Ala Ala Gly Ser Val Ile Glu Ile Gly Glu Ser

260 265 270

Phe Ala Val Val His Pro Gly Asp Tyr Tyr Asn Gly Leu Arg Gly Tyr

275 280 285

Lys Asn Ala Met Asp His Leu Gly Val Ile Met Pro Ala Pro Gly Asp

290 295 300

Ile Pro Asp Ser Ser Tyr Asp Leu Arg Trp Glu Ser Trp Gly Trp Gly

305 310 315 320

Phe Asn Trp Thr Ile Asp Leu Ile Ile Gly Lys Leu Asp Glu Leu Gln

325 330 335

Ala Ala Gly Val Lys Gln Ile Thr Leu Asp Asp Gly Trp Tyr Thr Asn

340 345 350

Ala Gly Asp Trp Ala Leu Asn Pro Glu Lys Phe Pro Asn Gly Ala Ser

355 360 365

Asp Ala Leu Arg Leu Thr Asp Ala Ile His Glu His Gly Met Thr Ala

370 375 380

Leu Leu Trp Trp Arg Pro Cys Asp Gly Gly Ile Asp Ser Ile Leu Tyr

385 390 395 400

Gln Gln His Pro Glu Tyr Phe Val Met Asp Ala Asp Gly Arg Pro Ala

405 410 415

Arg Leu Pro Thr Pro Gly Gly Gly Thr Asn Pro Ser Leu Gly Tyr Ala

420 425 430

Leu Cys Pro Met Ala Asp Gly Ala Ile Ala Ser Gln Val Asp Phe Val

435 440 445

Asn Arg Ala Met Asn Asp Trp Gly Phe Asp Gly Phe Lys Gly Asp Tyr

450 455 460

Val Trp Ser Met Pro Glu Cys Tyr Asn Pro Ala His Asn His Ala Ser

465 470 475 480

Pro Glu Glu Ser Thr Glu Lys Gln Ser Glu Ile Tyr Arg Val Ser Tyr

485 490 495

Glu Ala Met Val Ala Asn Asp Pro Asn Val Phe Asn Leu Leu Cys Asn

500 505 510

Cys Gly Thr Pro Gln Asp Tyr Tyr Ser Leu Pro Tyr Met Thr Gln Ile

515 520 525

Ala Thr Ala Asp Pro Thr Ser Val Asp Gln Thr Arg Arg Arg Val Lys

530 535 540

Ala Tyr Lys Ala Leu Met Gly Asp Tyr Phe Pro Val Thr Ala Asp His

545 550 555 560

Asn Asn Ile Trp Tyr Pro Ser Ala Val Gly Thr Gly Ser Val Leu Ile

565 570 575

Glu Lys Arg Asp Leu Ser Gly Thr Ala Lys Glu Glu Tyr Glu Lys Trp

580 585 590

Leu Gly Ile Ala Asp Thr Val Gln Leu Gln Lys Gly Arg Phe Ile Gly

595 600 605

Asp Leu Tyr Ser Tyr Gly Phe Asp Pro Tyr Glu Thr Tyr Val Val Glu

610 615 620

Lys Asp Gly Val Met Tyr Tyr Ala Phe Tyr Lys Asp Gly Ser Lys Tyr

625 630 635 640

Ser Pro Thr Gly Tyr Pro Asp Ile Glu Leu Lys Gly Leu Asp Pro Asn

645 650 655

Lys Met Tyr Arg Ile Val Asp Tyr Val Asn Asp Arg Val Val Ala Thr

660 665 670

Asn Leu Met Gly Asp Asn Ala Val Phe Asn Thr Arg Phe Ser Asp Tyr

675 680 685

Leu Leu Val Lys Ala Val Glu Ile Ser Glu Pro Asp Pro Glu Pro Val

690 695 700

Asp Pro Asp Tyr Gly Phe Thr Ser Val Asp Asp Arg Asp Glu Ala Leu

705 710 715 720

Ile Tyr Thr Gly Thr Trp His Asp Asp Asn Asn Ala Ser Phe Ser Glu

725 730 735

Gly Thr Ala Arg Tyr Thr Asn Ser Thr Asp Ala Ser Val Val Phe Ser

740 745 750

Phe Thr Gly Thr Ser Ile Arg Trp Tyr Gly Gln Arg Asp Thr Asn Phe

755 760 765

Gly Thr Ala Glu Val Tyr Leu Asp Asp Glu Leu Lys Thr Thr Val Asp

770 775 780

Ala Asn Gly Ala Ala Glu Ala Gly Val Cys Leu Phe Glu Ala Leu Asp

785 790 795 800

Leu Pro Ala Ala Glu His Thr Ile Lys Ile Val Cys Lys Ser Gly Val

805 810 815

Ile Asp Ile Asp Arg Phe Ala Tyr Glu Ala Ala Thr Leu Glu Pro Ile

820 825 830

Tyr Glu Lys Val Asp Ala Leu Ser Asp Arg Ile Thr Tyr Val Gly Asn

835 840 845

Trp Glu Glu Tyr His Asn Ser Glu Phe Tyr Met Gly Asn Ala Met Arg

850 855 860

Thr Asp Glu Ala Gly Ala Tyr Ala Glu Leu Thr Phe Arg Gly Thr Ala

865 870 875 880

Val Arg Leu Tyr Ala Glu Met Ser Phe Asn Phe Gly Thr Ala Asp Val

885 890 895

Tyr Leu Asp Gly Glu Leu Val Glu Asn Ile Ile Leu Tyr Gly Gln Glu

900 905 910

Ala Thr Gly Gln Leu Met Phe Glu Arg Thr Gly Leu Glu Glu Gly Glu

915 920 925

His Thr Ile Arg Leu Val Gln Asn Ala Trp Asn Ile Asn Leu Asp Tyr

930 935 940

Ile Ser Tyr Leu Pro Glu Gln Asp Gln Pro Thr Pro Pro Glu Thr Thr

945 950 955 960

Val Thr Val Asp Ala Met Asp Ala Gln Leu Val Tyr Thr Gly Val Trp

965 970 975

Asn Asp Asp Tyr His Asp Val Phe Gln Glu Gly Thr Ala Arg Tyr Ala

980 985 990

Ser Ser Ala Gly Ala Ser Val Glu Phe Glu Phe Thr Gly Ser Glu Ile

995 1000 1005

Arg Trp Tyr Gly Gln Asn Asp Ser Asn Phe Gly Val Ala Ser Val

1010 1015 1020

Tyr Ile Asp Asn Glu Phe Val Gln Gln Val Asn Val Asn Gly Ala

1025 1030 1035

Ala Ala Val Gly Lys Leu Leu Phe Gln Lys Ala Asp Leu Pro Ala

1040 1045 1050

Gly Ser His Thr Ile Arg Ile Val Cys Asp Thr Pro Val Ile Asp

1055 1060 1065

Leu Asp Tyr Leu Thr Tyr Thr Thr Asn Ala

1070 1075

<210> 8

<211> 3159

<212> DNA

<213> Flavonifractor plautii

<400> 8

gcggcgcctg caacggacac cggcaacgca ggactgattg cagaaggtga ttatgccatt 60

gccggcaatg gcgtccgcgt cacttatgac gcggacgggc agacaatcac tctgtaccgc 120

acagagggat ctgggcttat ccagatgagc aagccttctc cattgggagg gccagtgatt 180

ggagggcagg aggttcagga cttcagccat atttcatgtg atgtggagca gagcaccagc 240

ggagtgatgg gcagcggtca gagaatgacc attacctctc agagcatgag cacgggccta 300

attcgtacct atgtgctgga gacctctgat atcgaggagg gtgtggtata tactgcaaca 360

tcctatgagg caggagcttc tgatgtggaa gtgtcttggt tcattggcag tgtgtatgag 420

ctttatggtg cggaagatcg tatctggagt tataacggcg gcggtgaggg gccgatgcac 480

tactatgata cgcttcaaaa gattgacctg accgactctg gcaagttcag tagggagaat 540

aaacaggatg acacggctgc aagtattcct gtgtcagata tttacattgc tgatggaggg 600

attaccgttg gcgatgcttc tgcaaccaga agggaggtac atactccggt tcaggaaacc 660

agtgattcag ctcaagtttc tatcgggtgg ccaggcaaag tcattgccgc cggaagcgtg 720

atcgaaattg gtgagagctt tgctgtagtc catccgggtg actattataa cggcttgaga 780

ggttacaaaa atgcaatgga tcacttgggc gtgattatgc ctgcacctgg ggatattcct 840

gatagcagct atgatctccg atgggaaagc tggggctggg ggtttaactg gacgatcgat 900

ttaataatcg gcaaattgga tgaacttcag gcagccggag tcaagcagat cactttggat 960

gatggttggt ataccaatgc aggagactgg gccttaaatc cagaaaagtt tccaaatgga 1020

gcctccgatg cgttgcggct gacagatgca attcatgagc atggtatgac tgcactcctt 1080

tggtggagac cttgtgacgg cgggatcgat agtatactct atcagcaaca ccctgaatat 1140

ttcgttatgg atgcagatgg aagacctgca aggcttccta ctcctggtgg tgggaccaat 1200

cccagcttgg gatatgcact ttgccctatg gcggatggtg cgattgcaag ccaagttgac 1260

tttgtaaacc gtgcaatgaa tgattggggg ttcgatggct tcaagggaga ttatgtgtgg 1320

agtatgcctg aatgctacaa tcctgcacat aaccacgcct cgccagaaga atccactgaa 1380

aagcaatccg agatataccg cgtctcttat gaggctatgg tggccaacga ccccaatgtg 1440

ttcaatttgt tgtgcaactg cggtacgccc caggactact atagtttacc atatatgaca 1500

cagattgcta cggctgaccc cacttctgtg gatcaaacaa ggagacgcgt gaaagcctac 1560

aaggcactga tgggagatta tttccctgtt acagccgacc acaataacat ctggtatcca 1620

agtgccgtcg gtacgggctc tgttctcatt gaaaaacgtg accttagcgg tactgccaag 1680

gaagaatatg aaaaatggct tgggattgcg gatacagttc agttgcagaa aggccggttt 1740

attggcgatc tttacagtta tggttttgac ccttacgaaa cctatgtggt ggagaaagac 1800

ggggttatgt actatgcctt ctacaaagat gggagcaaat atagccccac tggctatcca 1860

gatattgagt tgaaggggct agatccaaat aaaatgtata ggattgttga ctatgtcaat 1920

gatcgtgtcg tggcaacaaa cctgatgggt gataacgctg tattcaatac acgtttttcc 1980

gactatctac tggttaaagc ggtggaaatt tcggaaccgg atccagaacc tgttgaccct 2040

gattatggtt tcacctctgt tgatgacaga gacgaggctc ttatttacac agggacatgg 2100

catgatgaca ataacgcatc tttcagcgaa gggactgcac gttataccaa cagtacggat 2160

gcttcggttg tattctcctt tactggaact tccattcgct ggtatggcca gagggatacc 2220

aattttggca cggcagaagt ttatttggac gatgaactga aaacaacagt tgatgcgaat 2280

ggggccgcag aagcaggcgt atgtcttttt gaggcgcttg atcttccggc tgccgagcat 2340

accattaaaa ttgtgtgcaa gagcggagtg attgatattg accgctttgc atatgaagct 2400

gctacccttg aacccatcta tgaaaaggtc gatgcgctct cggatcggat cacttatgtt 2460

gggaattggg aagagtatca caacagcgag ttctacatgg gaaacgcaat gcgcacagac 2520

gaagccggcg cttatgctga actgactttc cgtggtacag ccgtacgcct gtatgcagag 2580

atgagcttca attttggcac tgcagatgtc tatttagacg gagagttagt ggaaaacata 2640

atcctatacg gccaggaagc aactgggcag ctaatgtttg agcgtacggg actggaggaa 2700

ggagaacata ccattcgcct tgtacaaaac gcctggaaca tcaatttgga ctatatttct 2760

tatctaccag agcaagatca accaacgccg ccggagacga cggttactgt tgatgcaatg 2820

gacgcccaac tggtgtatac aggcgtatgg aatgatgact atcatgacgt ctttcaggaa 2880

ggaaccgccc gttatgccag tagtgccggc gcctcggtcg agttcgaatt tactggaagc 2940

gaaatccgtt ggtatggaca aaatgattcc aacttcggtg ttgccagcgt ttatatcgat 3000

aatgagtttg tgcagcaggt aaatgttaac ggagctgcgg ctgtgggaaa gcttttgttt 3060

caaaaggctg atctaccagc cggttcgcac acgatccgca ttgtgtgcga tactccggtt 3120

attgatttgg actatttgac ttataccact aacgcataa 3159

<210> 9

<211> 1052

<212> PRT

<213> Flavonifractor plautii

<400> 9

Ala Ala Pro Ala Thr Asp Thr Gly Asn Ala Gly Leu Ile Ala Glu Gly

1 5 10 15

Asp Tyr Ala Ile Ala Gly Asn Gly Val Arg Val Thr Tyr Asp Ala Asp

20 25 30

Gly Gln Thr Ile Thr Leu Tyr Arg Thr Glu Gly Ser Gly Leu Ile Gln

35 40 45

Met Ser Lys Pro Ser Pro Leu Gly Gly Pro Val Ile Gly Gly Gln Glu

50 55 60

Val Gln Asp Phe Ser His Ile Ser Cys Asp Val Glu Gln Ser Thr Ser

65 70 75 80

Gly Val Met Gly Ser Gly Gln Arg Met Thr Ile Thr Ser Gln Ser Met

85 90 95

Ser Thr Gly Leu Ile Arg Thr Tyr Val Leu Glu Thr Ser Asp Ile Glu

100 105 110

Glu Gly Val Val Tyr Thr Ala Thr Ser Tyr Glu Ala Gly Ala Ser Asp

115 120 125

Val Glu Val Ser Trp Phe Ile Gly Ser Val Tyr Glu Leu Tyr Gly Ala

130 135 140

Glu Asp Arg Ile Trp Ser Tyr Asn Gly Gly Gly Glu Gly Pro Met His

145 150 155 160

Tyr Tyr Asp Thr Leu Gln Lys Ile Asp Leu Thr Asp Ser Gly Lys Phe

165 170 175

Ser Arg Glu Asn Lys Gln Asp Asp Thr Ala Ala Ser Ile Pro Val Ser

180 185 190

Asp Ile Tyr Ile Ala Asp Gly Gly Ile Thr Val Gly Asp Ala Ser Ala

195 200 205

Thr Arg Arg Glu Val His Thr Pro Val Gln Glu Thr Ser Asp Ser Ala

210 215 220

Gln Val Ser Ile Gly Trp Pro Gly Lys Val Ile Ala Ala Gly Ser Val

225 230 235 240

Ile Glu Ile Gly Glu Ser Phe Ala Val Val His Pro Gly Asp Tyr Tyr

245 250 255

Asn Gly Leu Arg Gly Tyr Lys Asn Ala Met Asp His Leu Gly Val Ile

260 265 270

Met Pro Ala Pro Gly Asp Ile Pro Asp Ser Ser Tyr Asp Leu Arg Trp

275 280 285

Glu Ser Trp Gly Trp Gly Phe Asn Trp Thr Ile Asp Leu Ile Ile Gly

290 295 300

Lys Leu Asp Glu Leu Gln Ala Ala Gly Val Lys Gln Ile Thr Leu Asp

305 310 315 320

Asp Gly Trp Tyr Thr Asn Ala Gly Asp Trp Ala Leu Asn Pro Glu Lys

325 330 335

Phe Pro Asn Gly Ala Ser Asp Ala Leu Arg Leu Thr Asp Ala Ile His

340 345 350

Glu His Gly Met Thr Ala Leu Leu Trp Trp Arg Pro Cys Asp Gly Gly

355 360 365

Ile Asp Ser Ile Leu Tyr Gln Gln His Pro Glu Tyr Phe Val Met Asp

370 375 380

Ala Asp Gly Arg Pro Ala Arg Leu Pro Thr Pro Gly Gly Gly Thr Asn

385 390 395 400

Pro Ser Leu Gly Tyr Ala Leu Cys Pro Met Ala Asp Gly Ala Ile Ala

405 410 415

Ser Gln Val Asp Phe Val Asn Arg Ala Met Asn Asp Trp Gly Phe Asp

420 425 430

Gly Phe Lys Gly Asp Tyr Val Trp Ser Met Pro Glu Cys Tyr Asn Pro

435 440 445

Ala His Asn His Ala Ser Pro Glu Glu Ser Thr Glu Lys Gln Ser Glu

450 455 460

Ile Tyr Arg Val Ser Tyr Glu Ala Met Val Ala Asn Asp Pro Asn Val

465 470 475 480

Phe Asn Leu Leu Cys Asn Cys Gly Thr Pro Gln Asp Tyr Tyr Ser Leu

485 490 495

Pro Tyr Met Thr Gln Ile Ala Thr Ala Asp Pro Thr Ser Val Asp Gln

500 505 510

Thr Arg Arg Arg Val Lys Ala Tyr Lys Ala Leu Met Gly Asp Tyr Phe

515 520 525

Pro Val Thr Ala Asp His Asn Asn Ile Trp Tyr Pro Ser Ala Val Gly

530 535 540

Thr Gly Ser Val Leu Ile Glu Lys Arg Asp Leu Ser Gly Thr Ala Lys

545 550 555 560

Glu Glu Tyr Glu Lys Trp Leu Gly Ile Ala Asp Thr Val Gln Leu Gln

565 570 575

Lys Gly Arg Phe Ile Gly Asp Leu Tyr Ser Tyr Gly Phe Asp Pro Tyr

580 585 590

Glu Thr Tyr Val Val Glu Lys Asp Gly Val Met Tyr Tyr Ala Phe Tyr

595 600 605

Lys Asp Gly Ser Lys Tyr Ser Pro Thr Gly Tyr Pro Asp Ile Glu Leu

610 615 620

Lys Gly Leu Asp Pro Asn Lys Met Tyr Arg Ile Val Asp Tyr Val Asn

625 630 635 640

Asp Arg Val Val Ala Thr Asn Leu Met Gly Asp Asn Ala Val Phe Asn

645 650 655

Thr Arg Phe Ser Asp Tyr Leu Leu Val Lys Ala Val Glu Ile Ser Glu

660 665 670

Pro Asp Pro Glu Pro Val Asp Pro Asp Tyr Gly Phe Thr Ser Val Asp

675 680 685

Asp Arg Asp Glu Ala Leu Ile Tyr Thr Gly Thr Trp His Asp Asp Asn

690 695 700

Asn Ala Ser Phe Ser Glu Gly Thr Ala Arg Tyr Thr Asn Ser Thr Asp

705 710 715 720

Ala Ser Val Val Phe Ser Phe Thr Gly Thr Ser Ile Arg Trp Tyr Gly

725 730 735

Gln Arg Asp Thr Asn Phe Gly Thr Ala Glu Val Tyr Leu Asp Asp Glu

740 745 750

Leu Lys Thr Thr Val Asp Ala Asn Gly Ala Ala Glu Ala Gly Val Cys

755 760 765

Leu Phe Glu Ala Leu Asp Leu Pro Ala Ala Glu His Thr Ile Lys Ile

770 775 780

Val Cys Lys Ser Gly Val Ile Asp Ile Asp Arg Phe Ala Tyr Glu Ala

785 790 795 800

Ala Thr Leu Glu Pro Ile Tyr Glu Lys Val Asp Ala Leu Ser Asp Arg

805 810 815

Ile Thr Tyr Val Gly Asn Trp Glu Glu Tyr His Asn Ser Glu Phe Tyr

820 825 830

Met Gly Asn Ala Met Arg Thr Asp Glu Ala Gly Ala Tyr Ala Glu Leu

835 840 845

Thr Phe Arg Gly Thr Ala Val Arg Leu Tyr Ala Glu Met Ser Phe Asn

850 855 860

Phe Gly Thr Ala Asp Val Tyr Leu Asp Gly Glu Leu Val Glu Asn Ile

865 870 875 880

Ile Leu Tyr Gly Gln Glu Ala Thr Gly Gln Leu Met Phe Glu Arg Thr

885 890 895

Gly Leu Glu Glu Gly Glu His Thr Ile Arg Leu Val Gln Asn Ala Trp

900 905 910

Asn Ile Asn Leu Asp Tyr Ile Ser Tyr Leu Pro Glu Gln Asp Gln Pro

915 920 925

Thr Pro Pro Glu Thr Thr Val Thr Val Asp Ala Met Asp Ala Gln Leu

930 935 940

Val Tyr Thr Gly Val Trp Asn Asp Asp Tyr His Asp Val Phe Gln Glu

945 950 955 960

Gly Thr Ala Arg Tyr Ala Ser Ser Ala Gly Ala Ser Val Glu Phe Glu

965 970 975

Phe Thr Gly Ser Glu Ile Arg Trp Tyr Gly Gln Asn Asp Ser Asn Phe

980 985 990

Gly Val Ala Ser Val Tyr Ile Asp Asn Glu Phe Val Gln Gln Val Asn

995 1000 1005

Val Asn Gly Ala Ala Ala Val Gly Lys Leu Leu Phe Gln Lys Ala

1010 1015 1020

Asp Leu Pro Ala Gly Ser His Thr Ile Arg Ile Val Cys Asp Thr

1025 1030 1035

Pro Val Ile Asp Leu Asp Tyr Leu Thr Tyr Thr Thr Asn Ala

1040 1045 1050

<210> 10

<211> 1067

<212> PRT

<213> Flavonifractor plautii

<400> 10

Met Gly His His His His His His His His His His Ser Ser Gly Ala

1 5 10 15

Ala Pro Ala Thr Asp Thr Gly Asn Ala Gly Leu Ile Ala Glu Gly Asp

20 25 30

Tyr Ala Ile Ala Gly Asn Gly Val Arg Val Thr Tyr Asp Ala Asp Gly

35 40 45

Gln Thr Ile Thr Leu Tyr Arg Thr Glu Gly Ser Gly Leu Ile Gln Met

50 55 60

Ser Lys Pro Ser Pro Leu Gly Gly Pro Val Ile Gly Gly Gln Glu Val

65 70 75 80

Gln Asp Phe Ser His Ile Ser Cys Asp Val Glu Gln Ser Thr Ser Gly

85 90 95

Val Met Gly Ser Gly Gln Arg Met Thr Ile Thr Ser Gln Ser Met Ser

100 105 110

Thr Gly Leu Ile Arg Thr Tyr Val Leu Glu Thr Ser Asp Ile Glu Glu

115 120 125

Gly Val Val Tyr Thr Ala Thr Ser Tyr Glu Ala Gly Ala Ser Asp Val

130 135 140

Glu Val Ser Trp Phe Ile Gly Ser Val Tyr Glu Leu Tyr Gly Ala Glu

145 150 155 160

Asp Arg Ile Trp Ser Tyr Asn Gly Gly Gly Glu Gly Pro Met His Tyr

165 170 175

Tyr Asp Thr Leu Gln Lys Ile Asp Leu Thr Asp Ser Gly Lys Phe Ser

180 185 190

Arg Glu Asn Lys Gln Asp Asp Thr Ala Ala Ser Ile Pro Val Ser Asp

195 200 205

Ile Tyr Ile Ala Asp Gly Gly Ile Thr Val Gly Asp Ala Ser Ala Thr

210 215 220

Arg Arg Glu Val His Thr Pro Val Gln Glu Thr Ser Asp Ser Ala Gln

225 230 235 240

Val Ser Ile Gly Trp Pro Gly Lys Val Ile Ala Ala Gly Ser Val Ile

245 250 255

Glu Ile Gly Glu Ser Phe Ala Val Val His Pro Gly Asp Tyr Tyr Asn

260 265 270

Gly Leu Arg Gly Tyr Lys Asn Ala Met Asp His Leu Gly Val Ile Met

275 280 285

Pro Ala Pro Gly Asp Ile Pro Asp Ser Ser Tyr Asp Leu Arg Trp Glu

290 295 300

Ser Trp Gly Trp Gly Phe Asn Trp Thr Ile Asp Leu Ile Ile Gly Lys

305 310 315 320

Leu Asp Glu Leu Gln Ala Ala Gly Val Lys Gln Ile Thr Leu Asp Asp

325 330 335

Gly Trp Tyr Thr Asn Ala Gly Asp Trp Ala Leu Asn Pro Glu Lys Phe

340 345 350

Pro Asn Gly Ala Ser Asp Ala Leu Arg Leu Thr Asp Ala Ile His Glu

355 360 365

His Gly Met Thr Ala Leu Leu Trp Trp Arg Pro Cys Asp Gly Gly Ile

370 375 380

Asp Ser Ile Leu Tyr Gln Gln His Pro Glu Tyr Phe Val Met Asp Ala

385 390 395 400

Asp Gly Arg Pro Ala Arg Leu Pro Thr Pro Gly Gly Gly Thr Asn Pro

405 410 415

Ser Leu Gly Tyr Ala Leu Cys Pro Met Ala Asp Gly Ala Ile Ala Ser

420 425 430

Gln Val Asp Phe Val Asn Arg Ala Met Asn Asp Trp Gly Phe Asp Gly

435 440 445

Phe Lys Gly Asp Tyr Val Trp Ser Met Pro Glu Cys Tyr Asn Pro Ala

450 455 460

His Asn His Ala Ser Pro Glu Glu Ser Thr Glu Lys Gln Ser Glu Ile

465 470 475 480

Tyr Arg Val Ser Tyr Glu Ala Met Val Ala Asn Asp Pro Asn Val Phe

485 490 495

Asn Leu Leu Cys Asn Cys Gly Thr Pro Gln Asp Tyr Tyr Ser Leu Pro

500 505 510

Tyr Met Thr Gln Ile Ala Thr Ala Asp Pro Thr Ser Val Asp Gln Thr

515 520 525

Arg Arg Arg Val Lys Ala Tyr Lys Ala Leu Met Gly Asp Tyr Phe Pro

530 535 540

Val Thr Ala Asp His Asn Asn Ile Trp Tyr Pro Ser Ala Val Gly Thr

545 550 555 560

Gly Ser Val Leu Ile Glu Lys Arg Asp Leu Ser Gly Thr Ala Lys Glu

565 570 575

Glu Tyr Glu Lys Trp Leu Gly Ile Ala Asp Thr Val Gln Leu Gln Lys

580 585 590

Gly Arg Phe Ile Gly Asp Leu Tyr Ser Tyr Gly Phe Asp Pro Tyr Glu

595 600 605

Thr Tyr Val Val Glu Lys Asp Gly Val Met Tyr Tyr Ala Phe Tyr Lys

610 615 620

Asp Gly Ser Lys Tyr Ser Pro Thr Gly Tyr Pro Asp Ile Glu Leu Lys

625 630 635 640

Gly Leu Asp Pro Asn Lys Met Tyr Arg Ile Val Asp Tyr Val Asn Asp

645 650 655

Arg Val Val Ala Thr Asn Leu Met Gly Asp Asn Ala Val Phe Asn Thr

660 665 670

Arg Phe Ser Asp Tyr Leu Leu Val Lys Ala Val Glu Ile Ser Glu Pro

675 680 685

Asp Pro Glu Pro Val Asp Pro Asp Tyr Gly Phe Thr Ser Val Asp Asp

690 695 700

Arg Asp Glu Ala Leu Ile Tyr Thr Gly Thr Trp His Asp Asp Asn Asn

705 710 715 720

Ala Ser Phe Ser Glu Gly Thr Ala Arg Tyr Thr Asn Ser Thr Asp Ala

725 730 735

Ser Val Val Phe Ser Phe Thr Gly Thr Ser Ile Arg Trp Tyr Gly Gln

740 745 750

Arg Asp Thr Asn Phe Gly Thr Ala Glu Val Tyr Leu Asp Asp Glu Leu

755 760 765

Lys Thr Thr Val Asp Ala Asn Gly Ala Ala Glu Ala Gly Val Cys Leu

770 775 780

Phe Glu Ala Leu Asp Leu Pro Ala Ala Glu His Thr Ile Lys Ile Val

785 790 795 800

Cys Lys Ser Gly Val Ile Asp Ile Asp Arg Phe Ala Tyr Glu Ala Ala

805 810 815

Thr Leu Glu Pro Ile Tyr Glu Lys Val Asp Ala Leu Ser Asp Arg Ile

820 825 830

Thr Tyr Val Gly Asn Trp Glu Glu Tyr His Asn Ser Glu Phe Tyr Met

835 840 845

Gly Asn Ala Met Arg Thr Asp Glu Ala Gly Ala Tyr Ala Glu Leu Thr

850 855 860

Phe Arg Gly Thr Ala Val Arg Leu Tyr Ala Glu Met Ser Phe Asn Phe

865 870 875 880

Gly Thr Ala Asp Val Tyr Leu Asp Gly Glu Leu Val Glu Asn Ile Ile

885 890 895

Leu Tyr Gly Gln Glu Ala Thr Gly Gln Leu Met Phe Glu Arg Thr Gly

900 905 910

Leu Glu Glu Gly Glu His Thr Ile Arg Leu Val Gln Asn Ala Trp Asn

915 920 925

Ile Asn Leu Asp Tyr Ile Ser Tyr Leu Pro Glu Gln Asp Gln Pro Thr

930 935 940

Pro Pro Glu Thr Thr Val Thr Val Asp Ala Met Asp Ala Gln Leu Val

945 950 955 960

Tyr Thr Gly Val Trp Asn Asp Asp Tyr His Asp Val Phe Gln Glu Gly

965 970 975

Thr Ala Arg Tyr Ala Ser Ser Ala Gly Ala Ser Val Glu Phe Glu Phe

980 985 990

Thr Gly Ser Glu Ile Arg Trp Tyr Gly Gln Asn Asp Ser Asn Phe Gly

995 1000 1005

Val Ala Ser Val Tyr Ile Asp Asn Glu Phe Val Gln Gln Val Asn

1010 1015 1020

Val Asn Gly Ala Ala Ala Val Gly Lys Leu Leu Phe Gln Lys Ala

1025 1030 1035

Asp Leu Pro Ala Gly Ser His Thr Ile Arg Ile Val Cys Asp Thr

1040 1045 1050

Pro Val Ile Asp Leu Asp Tyr Leu Thr Tyr Thr Thr Asn Ala

1055 1060 1065

<210> 11

<211> 3963

<212> DNA

<213> Clostridium tertium

<400> 11

atgaaaaaaa gaattttagc tacttttatt acagctatgt gtggactggg atttttttca 60

aactggactt caagtaatgc ttataattta attgataata ttagtgttga aaaattagat 120

actgatattt cacaagcaaa tgaaaatgtt tttttgaatg gaaatggaat tgctttagaa 180

gtagataata gaggcgctac atgtatttat ctagtagatg aaaatggagt taaaacaaaa 240

gctacgactt ctttagatac agcagatttt tcaggttatc caataatagg tggacaaaag 300

ataagagatt ttgtaattat atcaaaaaat ctagaagaaa acataaactc gatattaggt 360

gttggaaata gacttactat tatatctaaa agttcatcta ctaatctgat aagaaagata 420

gtatttgaaa catctaacag caatccagga gcaatatatt caacagtaag ttataaagca 480

gaaagtaacg atttattagt agatagcttt catgaaaatg agtatacaat gagtttaggg 540

caaggacctt ttcttgcata tcaagggtgt gcagatcaac aaggagcaaa tactatcgtt 600

aatgttacta atggatataa ccataatagt ggacaaaata attattctgt aggagttcca 660

tttagttatg tttataactc tgtgggggga attggaatag gtgatgcatc aacttcaaga 720

agagaattta agttgcctat tataggaaaa gataatacag tttcattagg aatggagtgg 780

aatggacaaa ctttaaaaaa aggtgctgaa actgctatag gtacaagtgt tataactaca 840

acaaatggtg attattattc tgggctaaag agttacgcag aagttatgaa agataaggga 900

atatctgcac cagcttcaat acctgatata gcatatgatt ctagatggga aagttgggga 960

ttcgaatttg attttacaat agaaaaaata gttaataaat tagatgaact taaagcgatg 1020

gggataaaac aaattactct agatgatggg tggtacactt atgctggtga ttggaaatta 1080

agtcctcaaa agtttccaaa tggaaatgca gacatgaaat atcttacaga tgaaatccat 1140

aaaagaggaa tgacagctat tttatggtgg agaccagtag acggagggat aaatagcaaa 1200

ttagtatctg aacatccaga gtggtttatt aagaactcac aagggaatat ggttaggtta 1260

ccagggcctg gaggtggaaa tggaggaaca gcaggatatg cattatgtcc aaattcagaa 1320

ggttcaattc aacatcataa agattttgta actgtggcat tagaagaatg gggatttgat 1380

ggattcaaag aagattatgt atggggaata cctaaatgct atgatagttc tcataaacac 1440

tcaagtttat cagatacatt agaaaatcaa tataaattct atgaagccat atatgaacag 1500

tccatagcga taaatccaga tacttttata gaattatgta attgcggaac acctcaggat 1560

ttttattcaa caccatatgt gaaccatgca ccaacagcag atccaatttc gagagtacaa 1620

acaagaacaa gagtgaaagc atttaaagct atatttggag atgattttcc agtaacaaca 1680

gatcataatt cagtttggtt accgtcagca ttaggtacag gatcagttat gattactaaa 1740

catacaacat taagtagttc agatagagaa caatataata aatacttcgg acttgcaaga 1800

gatttagaat tagcaaaggg agaatttata ggaaacttat ataaatacgg aatagatcca 1860

ttagagtcat atgttataag aaaaggagaa gatatttatt attcattcta caaagataat 1920

tctagttatt caggaaatat agaaataaag gggttagaca gtaacgccac atatagaatt 1980

gaagattatg ttaacaatag agttattgct agaggagtaa agggaccaac agcgactata 2040

aatacaagct ttactgataa tttattagtt agagcaatac cagatgatac accagcagag 2100

gttactacat ttgatgttgg aaataataca atattatcat caacagatag tggaaattct 2160

aaatatttaa atgctgtttc tactacatta gaaaagacag caacaataga tagtttaagt 2220

atttatatag gaaataattc agaaaatggc aaactacaaa ttgctattta tgacgataat 2280

aacgggaaac ctggtactaa aaaagcttac gtagaagagt ttgttcctac taaaaatagt 2340

tggaatacaa agaaggttgt aaattctgtt acattacctt cagggcaata ttggttagtt 2400

ttccaacctg ataacgatgt actacaaaca aaaactaatc catcatccat gaaacaaagt 2460

gctaacaata atccatataa ttataatata ttaccaaatt catttcctat tggaacagga 2520

tataatgctt ataaaggcga tgtatctttc tatgcaacct ttaaagaagc aagcagtcaa 2580

gcaattcctc aaaattcttg ggctctaaaa tatgtagata gtgaagaaac tacaggcgaa 2640

aatggaagag ctacaaatgc ttttgatggt aataataata ctatttggca cacaaaatat 2700

agtggcggaa acgctgcacc aatgccgcat gagattcaaa ttgatttaag aggagtatat 2760

aatataaatc aaattaatta tctaccaaga caagatggag gaaccaatgg tacaataaag 2820

gactatgaag tttatttaag tttagatgga gtgaactggg gacaacctat atcaaaagga 2880

acctttgaat caaactctac agaaaaaata gtaaaattca acgaaacaaa atctaggtat 2940

gtaaaactta aagctctgtc agaaattaat aataaacaat ttactacagt agctgattta 3000

aaggtatttg gatgggagat atccaaaata gaaaaaccat tacaaaatgc tgaaacttat 3060

ttgaatatac caacttatga tggattaaat caaagtactc atccagatgt caaatatttt 3120

aaaaatggtt ggaatggata taaatattgg atgataatga ctccaaatag aacaggtagc 3180

tcagttgctg aaaatccttc aatactagca tctgatgatg gaataaattg ggaggttcct 3240

gcaggtgtta caaatcctat agctccaatg ccacaagtag gacataattg tgatgttgat 3300

atgatatata atgaagcaac tgatgagtta tgggtgtact gggtagaatc agatgatata 3360

acaaaaggat gggttaaatt aataaaatca aaggatggag taaattggag ttctcagcaa 3420

gtggtagttg atgataatag ggcaaaatat agtactttat caccatctat aatattcaaa 3480

gataataaat actatatgtg gtcagttaat acaggaaata gtggttggaa caatcaaagt 3540

aataaagttg aattaagaga atcaagtgac ggagtaaatt ggtcaaatcc aacagttgta 3600

aacacattag ctcaagatgg ttctcaaata tggcatgtaa atgtagaata tataccatca 3660

aaaaacgaat attgggctat atatccagca tataaaaatg gaacaggtag cgataaaaca 3720

gaattgtatt atgcgaaatc aagtgatgga gtaaattgga caacttataa gaatcctata 3780

ttatcaaaag gaacatctgg taaatgggat gatatggaga tatatagaag ttgttttgtg 3840

tacgatgaag atacaaatat gataaaggtt tggtatggag ctgtgagtca aaatccacaa 3900

atatggaaaa taggttttac tgaaaatgat tatgataagt ttattgaggg tttaacacaa 3960

taa 3963

<210> 12

<211> 1320

<212> PRT

<213> Clostridium tertium

<400> 12

Met Lys Lys Arg Ile Leu Ala Thr Phe Ile Thr Ala Met Cys Gly Leu

1 5 10 15

Gly Phe Phe Ser Asn Trp Thr Ser Ser Asn Ala Tyr Asn Leu Ile Asp

20 25 30

Asn Ile Ser Val Glu Lys Leu Asp Thr Asp Ile Ser Gln Ala Asn Glu

35 40 45

Asn Val Phe Leu Asn Gly Asn Gly Ile Ala Leu Glu Val Asp Asn Arg

50 55 60

Gly Ala Thr Cys Ile Tyr Leu Val Asp Glu Asn Gly Val Lys Thr Lys

65 70 75 80

Ala Thr Thr Ser Leu Asp Thr Ala Asp Phe Ser Gly Tyr Pro Ile Ile

85 90 95

Gly Gly Gln Lys Ile Arg Asp Phe Val Ile Ile Ser Lys Asn Leu Glu

100 105 110

Glu Asn Ile Asn Ser Ile Leu Gly Val Gly Asn Arg Leu Thr Ile Ile

115 120 125

Ser Lys Ser Ser Ser Thr Asn Leu Ile Arg Lys Ile Val Phe Glu Thr

130 135 140

Ser Asn Ser Asn Pro Gly Ala Ile Tyr Ser Thr Val Ser Tyr Lys Ala

145 150 155 160

Glu Ser Asn Asp Leu Leu Val Asp Ser Phe His Glu Asn Glu Tyr Thr

165 170 175

Met Ser Leu Gly Gln Gly Pro Phe Leu Ala Tyr Gln Gly Cys Ala Asp

180 185 190

Gln Gln Gly Ala Asn Thr Ile Val Asn Val Thr Asn Gly Tyr Asn His

195 200 205

Asn Ser Gly Gln Asn Asn Tyr Ser Val Gly Val Pro Phe Ser Tyr Val

210 215 220

Tyr Asn Ser Val Gly Gly Ile Gly Ile Gly Asp Ala Ser Thr Ser Arg

225 230 235 240

Arg Glu Phe Lys Leu Pro Ile Ile Gly Lys Asp Asn Thr Val Ser Leu

245 250 255

Gly Met Glu Trp Asn Gly Gln Thr Leu Lys Lys Gly Ala Glu Thr Ala

260 265 270

Ile Gly Thr Ser Val Ile Thr Thr Thr Asn Gly Asp Tyr Tyr Ser Gly

275 280 285

Leu Lys Ser Tyr Ala Glu Val Met Lys Asp Lys Gly Ile Ser Ala Pro

290 295 300

Ala Ser Ile Pro Asp Ile Ala Tyr Asp Ser Arg Trp Glu Ser Trp Gly

305 310 315 320

Phe Glu Phe Asp Phe Thr Ile Glu Lys Ile Val Asn Lys Leu Asp Glu

325 330 335

Leu Lys Ala Met Gly Ile Lys Gln Ile Thr Leu Asp Asp Gly Trp Tyr

340 345 350

Thr Tyr Ala Gly Asp Trp Lys Leu Ser Pro Gln Lys Phe Pro Asn Gly

355 360 365

Asn Ala Asp Met Lys Tyr Leu Thr Asp Glu Ile His Lys Arg Gly Met

370 375 380

Thr Ala Ile Leu Trp Trp Arg Pro Val Asp Gly Gly Ile Asn Ser Lys

385 390 395 400

Leu Val Ser Glu His Pro Glu Trp Phe Ile Lys Asn Ser Gln Gly Asn

405 410 415

Met Val Arg Leu Pro Gly Pro Gly Gly Gly Asn Gly Gly Thr Ala Gly

420 425 430

Tyr Ala Leu Cys Pro Asn Ser Glu Gly Ser Ile Gln His His Lys Asp

435 440 445

Phe Val Thr Val Ala Leu Glu Glu Trp Gly Phe Asp Gly Phe Lys Glu

450 455 460

Asp Tyr Val Trp Gly Ile Pro Lys Cys Tyr Asp Ser Ser His Lys His

465 470 475 480

Ser Ser Leu Ser Asp Thr Leu Glu Asn Gln Tyr Lys Phe Tyr Glu Ala

485 490 495

Ile Tyr Glu Gln Ser Ile Ala Ile Asn Pro Asp Thr Phe Ile Glu Leu

500 505 510

Cys Asn Cys Gly Thr Pro Gln Asp Phe Tyr Ser Thr Pro Tyr Val Asn

515 520 525

His Ala Pro Thr Ala Asp Pro Ile Ser Arg Val Gln Thr Arg Thr Arg

530 535 540

Val Lys Ala Phe Lys Ala Ile Phe Gly Asp Asp Phe Pro Val Thr Thr

545 550 555 560

Asp His Asn Ser Val Trp Leu Pro Ser Ala Leu Gly Thr Gly Ser Val

565 570 575

Met Ile Thr Lys His Thr Thr Leu Ser Ser Ser Asp Arg Glu Gln Tyr

580 585 590

Asn Lys Tyr Phe Gly Leu Ala Arg Asp Leu Glu Leu Ala Lys Gly Glu

595 600 605

Phe Ile Gly Asn Leu Tyr Lys Tyr Gly Ile Asp Pro Leu Glu Ser Tyr

610 615 620

Val Ile Arg Lys Gly Glu Asp Ile Tyr Tyr Ser Phe Tyr Lys Asp Asn

625 630 635 640

Ser Ser Tyr Ser Gly Asn Ile Glu Ile Lys Gly Leu Asp Ser Asn Ala

645 650 655

Thr Tyr Arg Ile Glu Asp Tyr Val Asn Asn Arg Val Ile Ala Arg Gly

660 665 670

Val Lys Gly Pro Thr Ala Thr Ile Asn Thr Ser Phe Thr Asp Asn Leu

675 680 685

Leu Val Arg Ala Ile Pro Asp Asp Thr Pro Ala Glu Val Thr Thr Phe

690 695 700

Asp Val Gly Asn Asn Thr Ile Leu Ser Ser Thr Asp Ser Gly Asn Ser

705 710 715 720

Lys Tyr Leu Asn Ala Val Ser Thr Thr Leu Glu Lys Thr Ala Thr Ile

725 730 735

Asp Ser Leu Ser Ile Tyr Ile Gly Asn Asn Ser Glu Asn Gly Lys Leu

740 745 750

Gln Ile Ala Ile Tyr Asp Asp Asn Asn Gly Lys Pro Gly Thr Lys Lys

755 760 765

Ala Tyr Val Glu Glu Phe Val Pro Thr Lys Asn Ser Trp Asn Thr Lys

770 775 780

Lys Val Val Asn Ser Val Thr Leu Pro Ser Gly Gln Tyr Trp Leu Val

785 790 795 800

Phe Gln Pro Asp Asn Asp Val Leu Gln Thr Lys Thr Asn Pro Ser Ser

805 810 815

Met Lys Gln Ser Ala Asn Asn Asn Pro Tyr Asn Tyr Asn Ile Leu Pro

820 825 830

Asn Ser Phe Pro Ile Gly Thr Gly Tyr Asn Ala Tyr Lys Gly Asp Val

835 840 845

Ser Phe Tyr Ala Thr Phe Lys Glu Ala Ser Ser Gln Ala Ile Pro Gln

850 855 860

Asn Ser Trp Ala Leu Lys Tyr Val Asp Ser Glu Glu Thr Thr Gly Glu

865 870 875 880

Asn Gly Arg Ala Thr Asn Ala Phe Asp Gly Asn Asn Asn Thr Ile Trp

885 890 895

His Thr Lys Tyr Ser Gly Gly Asn Ala Ala Pro Met Pro His Glu Ile

900 905 910

Gln Ile Asp Leu Arg Gly Val Tyr Asn Ile Asn Gln Ile Asn Tyr Leu

915 920 925

Pro Arg Gln Asp Gly Gly Thr Asn Gly Thr Ile Lys Asp Tyr Glu Val

930 935 940

Tyr Leu Ser Leu Asp Gly Val Asn Trp Gly Gln Pro Ile Ser Lys Gly

945 950 955 960

Thr Phe Glu Ser Asn Ser Thr Glu Lys Ile Val Lys Phe Asn Glu Thr

965 970 975

Lys Ser Arg Tyr Val Lys Leu Lys Ala Leu Ser Glu Ile Asn Asn Lys

980 985 990

Gln Phe Thr Thr Val Ala Asp Leu Lys Val Phe Gly Trp Glu Ile Ser

995 1000 1005

Lys Ile Glu Lys Pro Leu Gln Asn Ala Glu Thr Tyr Leu Asn Ile

1010 1015 1020

Pro Thr Tyr Asp Gly Leu Asn Gln Ser Thr His Pro Asp Val Lys

1025 1030 1035

Tyr Phe Lys Asn Gly Trp Asn Gly Tyr Lys Tyr Trp Met Ile Met

1040 1045 1050

Thr Pro Asn Arg Thr Gly Ser Ser Val Ala Glu Asn Pro Ser Ile

1055 1060 1065

Leu Ala Ser Asp Asp Gly Ile Asn Trp Glu Val Pro Ala Gly Val

1070 1075 1080

Thr Asn Pro Ile Ala Pro Met Pro Gln Val Gly His Asn Cys Asp

1085 1090 1095

Val Asp Met Ile Tyr Asn Glu Ala Thr Asp Glu Leu Trp Val Tyr

1100 1105 1110

Trp Val Glu Ser Asp Asp Ile Thr Lys Gly Trp Val Lys Leu Ile

1115 1120 1125

Lys Ser Lys Asp Gly Val Asn Trp Ser Ser Gln Gln Val Val Val

1130 1135 1140

Asp Asp Asn Arg Ala Lys Tyr Ser Thr Leu Ser Pro Ser Ile Ile

1145 1150 1155

Phe Lys Asp Asn Lys Tyr Tyr Met Trp Ser Val Asn Thr Gly Asn

1160 1165 1170

Ser Gly Trp Asn Asn Gln Ser Asn Lys Val Glu Leu Arg Glu Ser

1175 1180 1185

Ser Asp Gly Val Asn Trp Ser Asn Pro Thr Val Val Asn Thr Leu

1190 1195 1200

Ala Gln Asp Gly Ser Gln Ile Trp His Val Asn Val Glu Tyr Ile

1205 1210 1215

Pro Ser Lys Asn Glu Tyr Trp Ala Ile Tyr Pro Ala Tyr Lys Asn

1220 1225 1230

Gly Thr Gly Ser Asp Lys Thr Glu Leu Tyr Tyr Ala Lys Ser Ser

1235 1240 1245

Asp Gly Val Asn Trp Thr Thr Tyr Lys Asn Pro Ile Leu Ser Lys

1250 1255 1260

Gly Thr Ser Gly Lys Trp Asp Asp Met Glu Ile Tyr Arg Ser Cys

1265 1270 1275

Phe Val Tyr Asp Glu Asp Thr Asn Met Ile Lys Val Trp Tyr Gly

1280 1285 1290

Ala Val Ser Gln Asn Pro Gln Ile Trp Lys Ile Gly Phe Thr Glu

1295 1300 1305

Asn Asp Tyr Asp Lys Phe Ile Glu Gly Leu Thr Gln

1310 1315 1320

<210> 13

<211> 3882

<212> DNA

<213> Clostridium tertium

<400> 13

tataatttaa ttgataatat tagtgttgaa aaattagata ctgatatttc acaagcaaat 60

gaaaatgttt ttttgaatgg aaatggaatt gctttagaag tagataatag aggcgctaca 120

tgtatttatc tagtagatga aaatggagtt aaaacaaaag ctacgacttc tttagataca 180

gcagattttt caggttatcc aataataggt ggacaaaaga taagagattt tgtaattata 240

tcaaaaaatc tagaagaaaa cataaactcg atattaggtg ttggaaatag acttactatt 300

atatctaaaa gttcatctac taatctgata agaaagatag tatttgaaac atctaacagc 360

aatccaggag caatatattc aacagtaagt tataaagcag aaagtaacga tttattagta 420

gatagctttc atgaaaatga gtatacaatg agtttagggc aaggaccttt tcttgcatat 480

caagggtgtg cagatcaaca aggagcaaat actatcgtta atgttactaa tggatataac 540

cataatagtg gacaaaataa ttattctgta ggagttccat ttagttatgt ttataactct 600

gtggggggaa ttggaatagg tgatgcatca acttcaagaa gagaatttaa gttgcctatt 660

ataggaaaag ataatacagt ttcattagga atggagtgga atggacaaac tttaaaaaaa 720

ggtgctgaaa ctgctatagg tacaagtgtt ataactacaa caaatggtga ttattattct 780

gggctaaaga gttacgcaga agttatgaaa gataagggaa tatctgcacc agcttcaata 840

cctgatatag catatgattc tagatgggaa agttggggat tcgaatttga ttttacaata 900

gaaaaaatag ttaataaatt agatgaactt aaagcgatgg ggataaaaca aattactcta 960

gatgatgggt ggtacactta tgctggtgat tggaaattaa gtcctcaaaa gtttccaaat 1020

ggaaatgcag acatgaaata tcttacagat gaaatccata aaagaggaat gacagctatt 1080

ttatggtgga gaccagtaga cggagggata aatagcaaat tagtatctga acatccagag 1140

tggtttatta agaactcaca agggaatatg gttaggttac cagggcctgg aggtggaaat 1200

ggaggaacag caggatatgc attatgtcca aattcagaag gttcaattca acatcataaa 1260

gattttgtaa ctgtggcatt agaagaatgg ggatttgatg gattcaaaga agattatgta 1320

tggggaatac ctaaatgcta tgatagttct cataaacact caagtttatc agatacatta 1380

gaaaatcaat ataaattcta tgaagccata tatgaacagt ccatagcgat aaatccagat 1440

acttttatag aattatgtaa ttgcggaaca cctcaggatt tttattcaac accatatgtg 1500

aaccatgcac caacagcaga tccaatttcg agagtacaaa caagaacaag agtgaaagca 1560

tttaaagcta tatttggaga tgattttcca gtaacaacag atcataattc agtttggtta 1620

ccgtcagcat taggtacagg atcagttatg attactaaac atacaacatt aagtagttca 1680

gatagagaac aatataataa atacttcgga cttgcaagag atttagaatt agcaaaggga 1740

gaatttatag gaaacttata taaatacgga atagatccat tagagtcata tgttataaga 1800

aaaggagaag atatttatta ttcattctac aaagataatt ctagttattc aggaaatata 1860

gaaataaagg ggttagacag taacgccaca tatagaattg aagattatgt taacaataga 1920

gttattgcta gaggagtaaa gggaccaaca gcgactataa atacaagctt tactgataat 1980

ttattagtta gagcaatacc agatgataca ccagcagagg ttactacatt tgatgttgga 2040

aataatacaa tattatcatc aacagatagt ggaaattcta aatatttaaa tgctgtttct 2100

actacattag aaaagacagc aacaatagat agtttaagta tttatatagg aaataattca 2160

gaaaatggca aactacaaat tgctatttat gacgataata acgggaaacc tggtactaaa 2220

aaagcttacg tagaagagtt tgttcctact aaaaatagtt ggaatacaaa gaaggttgta 2280

aattctgtta cattaccttc agggcaatat tggttagttt tccaacctga taacgatgta 2340

ctacaaacaa aaactaatcc atcatccatg aaacaaagtg ctaacaataa tccatataat 2400

tataatatat taccaaattc atttcctatt ggaacaggat ataatgctta taaaggcgat 2460

gtatctttct atgcaacctt taaagaagca agcagtcaag caattcctca aaattcttgg 2520

gctctaaaat atgtagatag tgaagaaact acaggcgaaa atggaagagc tacaaatgct 2580

tttgatggta ataataatac tatttggcac acaaaatata gtggcggaaa cgctgcacca 2640

atgccgcatg agattcaaat tgatttaaga ggagtatata atataaatca aattaattat 2700

ctaccaagac aagatggagg aaccaatggt acaataaagg actatgaagt ttatttaagt 2760

ttagatggag tgaactgggg acaacctata tcaaaaggaa cctttgaatc aaactctaca 2820

gaaaaaatag taaaattcaa cgaaacaaaa tctaggtatg taaaacttaa agctctgtca 2880

gaaattaata ataaacaatt tactacagta gctgatttaa aggtatttgg atgggagata 2940

tccaaaatag aaaaaccatt acaaaatgct gaaacttatt tgaatatacc aacttatgat 3000

ggattaaatc aaagtactca tccagatgtc aaatatttta aaaatggttg gaatggatat 3060

aaatattgga tgataatgac tccaaataga acaggtagct cagttgctga aaatccttca 3120

atactagcat ctgatgatgg aataaattgg gaggttcctg caggtgttac aaatcctata 3180

gctccaatgc cacaagtagg acataattgt gatgttgata tgatatataa tgaagcaact 3240

gatgagttat gggtgtactg ggtagaatca gatgatataa caaaaggatg ggttaaatta 3300

ataaaatcaa aggatggagt aaattggagt tctcagcaag tggtagttga tgataatagg 3360

gcaaaatata gtactttatc accatctata atattcaaag ataataaata ctatatgtgg 3420

tcagttaata caggaaatag tggttggaac aatcaaagta ataaagttga attaagagaa 3480

tcaagtgacg gagtaaattg gtcaaatcca acagttgtaa acacattagc tcaagatggt 3540

tctcaaatat ggcatgtaaa tgtagaatat ataccatcaa aaaacgaata ttgggctata 3600

tatccagcat ataaaaatgg aacaggtagc gataaaacag aattgtatta tgcgaaatca 3660

agtgatggag taaattggac aacttataag aatcctatat tatcaaaagg aacatctggt 3720

aaatgggatg atatggagat atatagaagt tgttttgtgt acgatgaaga tacaaatatg 3780

ataaaggttt ggtatggagc tgtgagtcaa aatccacaaa tatggaaaat aggttttact 3840

gaaaatgatt atgataagtt tattgagggt ttaacacaat aa 3882

<210> 14

<211> 1293

<212> PRT

<213> Clostridium tertium

<400> 14

Tyr Asn Leu Ile Asp Asn Ile Ser Val Glu Lys Leu Asp Thr Asp Ile

1 5 10 15

Ser Gln Ala Asn Glu Asn Val Phe Leu Asn Gly Asn Gly Ile Ala Leu

20 25 30

Glu Val Asp Asn Arg Gly Ala Thr Cys Ile Tyr Leu Val Asp Glu Asn

35 40 45

Gly Val Lys Thr Lys Ala Thr Thr Ser Leu Asp Thr Ala Asp Phe Ser

50 55 60

Gly Tyr Pro Ile Ile Gly Gly Gln Lys Ile Arg Asp Phe Val Ile Ile

65 70 75 80

Ser Lys Asn Leu Glu Glu Asn Ile Asn Ser Ile Leu Gly Val Gly Asn

85 90 95

Arg Leu Thr Ile Ile Ser Lys Ser Ser Ser Thr Asn Leu Ile Arg Lys

100 105 110

Ile Val Phe Glu Thr Ser Asn Ser Asn Pro Gly Ala Ile Tyr Ser Thr

115 120 125

Val Ser Tyr Lys Ala Glu Ser Asn Asp Leu Leu Val Asp Ser Phe His

130 135 140

Glu Asn Glu Tyr Thr Met Ser Leu Gly Gln Gly Pro Phe Leu Ala Tyr

145 150 155 160

Gln Gly Cys Ala Asp Gln Gln Gly Ala Asn Thr Ile Val Asn Val Thr

165 170 175

Asn Gly Tyr Asn His Asn Ser Gly Gln Asn Asn Tyr Ser Val Gly Val

180 185 190

Pro Phe Ser Tyr Val Tyr Asn Ser Val Gly Gly Ile Gly Ile Gly Asp

195 200 205

Ala Ser Thr Ser Arg Arg Glu Phe Lys Leu Pro Ile Ile Gly Lys Asp

210 215 220

Asn Thr Val Ser Leu Gly Met Glu Trp Asn Gly Gln Thr Leu Lys Lys

225 230 235 240

Gly Ala Glu Thr Ala Ile Gly Thr Ser Val Ile Thr Thr Thr Asn Gly

245 250 255

Asp Tyr Tyr Ser Gly Leu Lys Ser Tyr Ala Glu Val Met Lys Asp Lys

260 265 270

Gly Ile Ser Ala Pro Ala Ser Ile Pro Asp Ile Ala Tyr Asp Ser Arg

275 280 285

Trp Glu Ser Trp Gly Phe Glu Phe Asp Phe Thr Ile Glu Lys Ile Val

290 295 300

Asn Lys Leu Asp Glu Leu Lys Ala Met Gly Ile Lys Gln Ile Thr Leu

305 310 315 320

Asp Asp Gly Trp Tyr Thr Tyr Ala Gly Asp Trp Lys Leu Ser Pro Gln

325 330 335

Lys Phe Pro Asn Gly Asn Ala Asp Met Lys Tyr Leu Thr Asp Glu Ile

340 345 350

His Lys Arg Gly Met Thr Ala Ile Leu Trp Trp Arg Pro Val Asp Gly

355 360 365

Gly Ile Asn Ser Lys Leu Val Ser Glu His Pro Glu Trp Phe Ile Lys

370 375 380

Asn Ser Gln Gly Asn Met Val Arg Leu Pro Gly Pro Gly Gly Gly Asn

385 390 395 400

Gly Gly Thr Ala Gly Tyr Ala Leu Cys Pro Asn Ser Glu Gly Ser Ile

405 410 415

Gln His His Lys Asp Phe Val Thr Val Ala Leu Glu Glu Trp Gly Phe

420 425 430

Asp Gly Phe Lys Glu Asp Tyr Val Trp Gly Ile Pro Lys Cys Tyr Asp

435 440 445

Ser Ser His Lys His Ser Ser Leu Ser Asp Thr Leu Glu Asn Gln Tyr

450 455 460

Lys Phe Tyr Glu Ala Ile Tyr Glu Gln Ser Ile Ala Ile Asn Pro Asp

465 470 475 480

Thr Phe Ile Glu Leu Cys Asn Cys Gly Thr Pro Gln Asp Phe Tyr Ser

485 490 495

Thr Pro Tyr Val Asn His Ala Pro Thr Ala Asp Pro Ile Ser Arg Val

500 505 510

Gln Thr Arg Thr Arg Val Lys Ala Phe Lys Ala Ile Phe Gly Asp Asp

515 520 525

Phe Pro Val Thr Thr Asp His Asn Ser Val Trp Leu Pro Ser Ala Leu

530 535 540

Gly Thr Gly Ser Val Met Ile Thr Lys His Thr Thr Leu Ser Ser Ser

545 550 555 560

Asp Arg Glu Gln Tyr Asn Lys Tyr Phe Gly Leu Ala Arg Asp Leu Glu

565 570 575

Leu Ala Lys Gly Glu Phe Ile Gly Asn Leu Tyr Lys Tyr Gly Ile Asp

580 585 590

Pro Leu Glu Ser Tyr Val Ile Arg Lys Gly Glu Asp Ile Tyr Tyr Ser

595 600 605

Phe Tyr Lys Asp Asn Ser Ser Tyr Ser Gly Asn Ile Glu Ile Lys Gly

610 615 620

Leu Asp Ser Asn Ala Thr Tyr Arg Ile Glu Asp Tyr Val Asn Asn Arg

625 630 635 640

Val Ile Ala Arg Gly Val Lys Gly Pro Thr Ala Thr Ile Asn Thr Ser

645 650 655

Phe Thr Asp Asn Leu Leu Val Arg Ala Ile Pro Asp Asp Thr Pro Ala

660 665 670

Glu Val Thr Thr Phe Asp Val Gly Asn Asn Thr Ile Leu Ser Ser Thr

675 680 685

Asp Ser Gly Asn Ser Lys Tyr Leu Asn Ala Val Ser Thr Thr Leu Glu

690 695 700

Lys Thr Ala Thr Ile Asp Ser Leu Ser Ile Tyr Ile Gly Asn Asn Ser

705 710 715 720

Glu Asn Gly Lys Leu Gln Ile Ala Ile Tyr Asp Asp Asn Asn Gly Lys

725 730 735

Pro Gly Thr Lys Lys Ala Tyr Val Glu Glu Phe Val Pro Thr Lys Asn

740 745 750

Ser Trp Asn Thr Lys Lys Val Val Asn Ser Val Thr Leu Pro Ser Gly

755 760 765

Gln Tyr Trp Leu Val Phe Gln Pro Asp Asn Asp Val Leu Gln Thr Lys

770 775 780

Thr Asn Pro Ser Ser Met Lys Gln Ser Ala Asn Asn Asn Pro Tyr Asn

785 790 795 800

Tyr Asn Ile Leu Pro Asn Ser Phe Pro Ile Gly Thr Gly Tyr Asn Ala

805 810 815

Tyr Lys Gly Asp Val Ser Phe Tyr Ala Thr Phe Lys Glu Ala Ser Ser

820 825 830

Gln Ala Ile Pro Gln Asn Ser Trp Ala Leu Lys Tyr Val Asp Ser Glu

835 840 845

Glu Thr Thr Gly Glu Asn Gly Arg Ala Thr Asn Ala Phe Asp Gly Asn

850 855 860

Asn Asn Thr Ile Trp His Thr Lys Tyr Ser Gly Gly Asn Ala Ala Pro

865 870 875 880

Met Pro His Glu Ile Gln Ile Asp Leu Arg Gly Val Tyr Asn Ile Asn

885 890 895

Gln Ile Asn Tyr Leu Pro Arg Gln Asp Gly Gly Thr Asn Gly Thr Ile

900 905 910

Lys Asp Tyr Glu Val Tyr Leu Ser Leu Asp Gly Val Asn Trp Gly Gln

915 920 925

Pro Ile Ser Lys Gly Thr Phe Glu Ser Asn Ser Thr Glu Lys Ile Val

930 935 940

Lys Phe Asn Glu Thr Lys Ser Arg Tyr Val Lys Leu Lys Ala Leu Ser

945 950 955 960

Glu Ile Asn Asn Lys Gln Phe Thr Thr Val Ala Asp Leu Lys Val Phe

965 970 975

Gly Trp Glu Ile Ser Lys Ile Glu Lys Pro Leu Gln Asn Ala Glu Thr

980 985 990

Tyr Leu Asn Ile Pro Thr Tyr Asp Gly Leu Asn Gln Ser Thr His Pro

995 1000 1005

Asp Val Lys Tyr Phe Lys Asn Gly Trp Asn Gly Tyr Lys Tyr Trp

1010 1015 1020

Met Ile Met Thr Pro Asn Arg Thr Gly Ser Ser Val Ala Glu Asn

1025 1030 1035

Pro Ser Ile Leu Ala Ser Asp Asp Gly Ile Asn Trp Glu Val Pro

1040 1045 1050

Ala Gly Val Thr Asn Pro Ile Ala Pro Met Pro Gln Val Gly His

1055 1060 1065

Asn Cys Asp Val Asp Met Ile Tyr Asn Glu Ala Thr Asp Glu Leu

1070 1075 1080

Trp Val Tyr Trp Val Glu Ser Asp Asp Ile Thr Lys Gly Trp Val

1085 1090 1095

Lys Leu Ile Lys Ser Lys Asp Gly Val Asn Trp Ser Ser Gln Gln

1100 1105 1110

Val Val Val Asp Asp Asn Arg Ala Lys Tyr Ser Thr Leu Ser Pro

1115 1120 1125

Ser Ile Ile Phe Lys Asp Asn Lys Tyr Tyr Met Trp Ser Val Asn

1130 1135 1140

Thr Gly Asn Ser Gly Trp Asn Asn Gln Ser Asn Lys Val Glu Leu

1145 1150 1155

Arg Glu Ser Ser Asp Gly Val Asn Trp Ser Asn Pro Thr Val Val

1160 1165 1170

Asn Thr Leu Ala Gln Asp Gly Ser Gln Ile Trp His Val Asn Val

1175 1180 1185

Glu Tyr Ile Pro Ser Lys Asn Glu Tyr Trp Ala Ile Tyr Pro Ala

1190 1195 1200

Tyr Lys Asn Gly Thr Gly Ser Asp Lys Thr Glu Leu Tyr Tyr Ala

1205 1210 1215

Lys Ser Ser Asp Gly Val Asn Trp Thr Thr Tyr Lys Asn Pro Ile

1220 1225 1230

Leu Ser Lys Gly Thr Ser Gly Lys Trp Asp Asp Met Glu Ile Tyr

1235 1240 1245

Arg Ser Cys Phe Val Tyr Asp Glu Asp Thr Asn Met Ile Lys Val

1250 1255 1260

Trp Tyr Gly Ala Val Ser Gln Asn Pro Gln Ile Trp Lys Ile Gly

1265 1270 1275

Phe Thr Glu Asn Asp Tyr Asp Lys Phe Ile Glu Gly Leu Thr Gln

1280 1285 1290

<210> 15

<211> 1313

<212> PRT

<213> Clostridium tertium

<400> 15

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Tyr Asn Leu Ile Asp Asn Ile Ser Val Glu Lys Leu

20 25 30

Asp Thr Asp Ile Ser Gln Ala Asn Glu Asn Val Phe Leu Asn Gly Asn

35 40 45

Gly Ile Ala Leu Glu Val Asp Asn Arg Gly Ala Thr Cys Ile Tyr Leu

50 55 60

Val Asp Glu Asn Gly Val Lys Thr Lys Ala Thr Thr Ser Leu Asp Thr

65 70 75 80

Ala Asp Phe Ser Gly Tyr Pro Ile Ile Gly Gly Gln Lys Ile Arg Asp

85 90 95

Phe Val Ile Ile Ser Lys Asn Leu Glu Glu Asn Ile Asn Ser Ile Leu

100 105 110

Gly Val Gly Asn Arg Leu Thr Ile Ile Ser Lys Ser Ser Ser Thr Asn

115 120 125

Leu Ile Arg Lys Ile Val Phe Glu Thr Ser Asn Ser Asn Pro Gly Ala

130 135 140

Ile Tyr Ser Thr Val Ser Tyr Lys Ala Glu Ser Asn Asp Leu Leu Val

145 150 155 160

Asp Ser Phe His Glu Asn Glu Tyr Thr Met Ser Leu Gly Gln Gly Pro

165 170 175

Phe Leu Ala Tyr Gln Gly Cys Ala Asp Gln Gln Gly Ala Asn Thr Ile

180 185 190

Val Asn Val Thr Asn Gly Tyr Asn His Asn Ser Gly Gln Asn Asn Tyr

195 200 205

Ser Val Gly Val Pro Phe Ser Tyr Val Tyr Asn Ser Val Gly Gly Ile

210 215 220

Gly Ile Gly Asp Ala Ser Thr Ser Arg Arg Glu Phe Lys Leu Pro Ile

225 230 235 240

Ile Gly Lys Asp Asn Thr Val Ser Leu Gly Met Glu Trp Asn Gly Gln

245 250 255

Thr Leu Lys Lys Gly Ala Glu Thr Ala Ile Gly Thr Ser Val Ile Thr

260 265 270

Thr Thr Asn Gly Asp Tyr Tyr Ser Gly Leu Lys Ser Tyr Ala Glu Val

275 280 285

Met Lys Asp Lys Gly Ile Ser Ala Pro Ala Ser Ile Pro Asp Ile Ala

290 295 300

Tyr Asp Ser Arg Trp Glu Ser Trp Gly Phe Glu Phe Asp Phe Thr Ile

305 310 315 320

Glu Lys Ile Val Asn Lys Leu Asp Glu Leu Lys Ala Met Gly Ile Lys

325 330 335

Gln Ile Thr Leu Asp Asp Gly Trp Tyr Thr Tyr Ala Gly Asp Trp Lys

340 345 350

Leu Ser Pro Gln Lys Phe Pro Asn Gly Asn Ala Asp Met Lys Tyr Leu

355 360 365

Thr Asp Glu Ile His Lys Arg Gly Met Thr Ala Ile Leu Trp Trp Arg

370 375 380

Pro Val Asp Gly Gly Ile Asn Ser Lys Leu Val Ser Glu His Pro Glu

385 390 395 400

Trp Phe Ile Lys Asn Ser Gln Gly Asn Met Val Arg Leu Pro Gly Pro

405 410 415

Gly Gly Gly Asn Gly Gly Thr Ala Gly Tyr Ala Leu Cys Pro Asn Ser

420 425 430

Glu Gly Ser Ile Gln His His Lys Asp Phe Val Thr Val Ala Leu Glu

435 440 445

Glu Trp Gly Phe Asp Gly Phe Lys Glu Asp Tyr Val Trp Gly Ile Pro

450 455 460

Lys Cys Tyr Asp Ser Ser His Lys His Ser Ser Leu Ser Asp Thr Leu

465 470 475 480

Glu Asn Gln Tyr Lys Phe Tyr Glu Ala Ile Tyr Glu Gln Ser Ile Ala

485 490 495

Ile Asn Pro Asp Thr Phe Ile Glu Leu Cys Asn Cys Gly Thr Pro Gln

500 505 510

Asp Phe Tyr Ser Thr Pro Tyr Val Asn His Ala Pro Thr Ala Asp Pro

515 520 525

Ile Ser Arg Val Gln Thr Arg Thr Arg Val Lys Ala Phe Lys Ala Ile

530 535 540

Phe Gly Asp Asp Phe Pro Val Thr Thr Asp His Asn Ser Val Trp Leu

545 550 555 560

Pro Ser Ala Leu Gly Thr Gly Ser Val Met Ile Thr Lys His Thr Thr

565 570 575

Leu Ser Ser Ser Asp Arg Glu Gln Tyr Asn Lys Tyr Phe Gly Leu Ala

580 585 590

Arg Asp Leu Glu Leu Ala Lys Gly Glu Phe Ile Gly Asn Leu Tyr Lys

595 600 605

Tyr Gly Ile Asp Pro Leu Glu Ser Tyr Val Ile Arg Lys Gly Glu Asp

610 615 620

Ile Tyr Tyr Ser Phe Tyr Lys Asp Asn Ser Ser Tyr Ser Gly Asn Ile

625 630 635 640

Glu Ile Lys Gly Leu Asp Ser Asn Ala Thr Tyr Arg Ile Glu Asp Tyr

645 650 655

Val Asn Asn Arg Val Ile Ala Arg Gly Val Lys Gly Pro Thr Ala Thr

660 665 670

Ile Asn Thr Ser Phe Thr Asp Asn Leu Leu Val Arg Ala Ile Pro Asp

675 680 685

Asp Thr Pro Ala Glu Val Thr Thr Phe Asp Val Gly Asn Asn Thr Ile

690 695 700

Leu Ser Ser Thr Asp Ser Gly Asn Ser Lys Tyr Leu Asn Ala Val Ser

705 710 715 720

Thr Thr Leu Glu Lys Thr Ala Thr Ile Asp Ser Leu Ser Ile Tyr Ile

725 730 735

Gly Asn Asn Ser Glu Asn Gly Lys Leu Gln Ile Ala Ile Tyr Asp Asp

740 745 750

Asn Asn Gly Lys Pro Gly Thr Lys Lys Ala Tyr Val Glu Glu Phe Val

755 760 765

Pro Thr Lys Asn Ser Trp Asn Thr Lys Lys Val Val Asn Ser Val Thr

770 775 780

Leu Pro Ser Gly Gln Tyr Trp Leu Val Phe Gln Pro Asp Asn Asp Val

785 790 795 800

Leu Gln Thr Lys Thr Asn Pro Ser Ser Met Lys Gln Ser Ala Asn Asn

805 810 815

Asn Pro Tyr Asn Tyr Asn Ile Leu Pro Asn Ser Phe Pro Ile Gly Thr

820 825 830

Gly Tyr Asn Ala Tyr Lys Gly Asp Val Ser Phe Tyr Ala Thr Phe Lys

835 840 845

Glu Ala Ser Ser Gln Ala Ile Pro Gln Asn Ser Trp Ala Leu Lys Tyr

850 855 860

Val Asp Ser Glu Glu Thr Thr Gly Glu Asn Gly Arg Ala Thr Asn Ala

865 870 875 880

Phe Asp Gly Asn Asn Asn Thr Ile Trp His Thr Lys Tyr Ser Gly Gly

885 890 895

Asn Ala Ala Pro Met Pro His Glu Ile Gln Ile Asp Leu Arg Gly Val

900 905 910

Tyr Asn Ile Asn Gln Ile Asn Tyr Leu Pro Arg Gln Asp Gly Gly Thr

915 920 925

Asn Gly Thr Ile Lys Asp Tyr Glu Val Tyr Leu Ser Leu Asp Gly Val

930 935 940

Asn Trp Gly Gln Pro Ile Ser Lys Gly Thr Phe Glu Ser Asn Ser Thr

945 950 955 960

Glu Lys Ile Val Lys Phe Asn Glu Thr Lys Ser Arg Tyr Val Lys Leu

965 970 975

Lys Ala Leu Ser Glu Ile Asn Asn Lys Gln Phe Thr Thr Val Ala Asp

980 985 990

Leu Lys Val Phe Gly Trp Glu Ile Ser Lys Ile Glu Lys Pro Leu Gln

995 1000 1005

Asn Ala Glu Thr Tyr Leu Asn Ile Pro Thr Tyr Asp Gly Leu Asn

1010 1015 1020

Gln Ser Thr His Pro Asp Val Lys Tyr Phe Lys Asn Gly Trp Asn

1025 1030 1035

Gly Tyr Lys Tyr Trp Met Ile Met Thr Pro Asn Arg Thr Gly Ser

1040 1045 1050

Ser Val Ala Glu Asn Pro Ser Ile Leu Ala Ser Asp Asp Gly Ile

1055 1060 1065

Asn Trp Glu Val Pro Ala Gly Val Thr Asn Pro Ile Ala Pro Met

1070 1075 1080

Pro Gln Val Gly His Asn Cys Asp Val Asp Met Ile Tyr Asn Glu

1085 1090 1095

Ala Thr Asp Glu Leu Trp Val Tyr Trp Val Glu Ser Asp Asp Ile

1100 1105 1110

Thr Lys Gly Trp Val Lys Leu Ile Lys Ser Lys Asp Gly Val Asn

1115 1120 1125

Trp Ser Ser Gln Gln Val Val Val Asp Asp Asn Arg Ala Lys Tyr

1130 1135 1140

Ser Thr Leu Ser Pro Ser Ile Ile Phe Lys Asp Asn Lys Tyr Tyr

1145 1150 1155

Met Trp Ser Val Asn Thr Gly Asn Ser Gly Trp Asn Asn Gln Ser

1160 1165 1170

Asn Lys Val Glu Leu Arg Glu Ser Ser Asp Gly Val Asn Trp Ser

1175 1180 1185

Asn Pro Thr Val Val Asn Thr Leu Ala Gln Asp Gly Ser Gln Ile

1190 1195 1200

Trp His Val Asn Val Glu Tyr Ile Pro Ser Lys Asn Glu Tyr Trp

1205 1210 1215

Ala Ile Tyr Pro Ala Tyr Lys Asn Gly Thr Gly Ser Asp Lys Thr

1220 1225 1230

Glu Leu Tyr Tyr Ala Lys Ser Ser Asp Gly Val Asn Trp Thr Thr

1235 1240 1245

Tyr Lys Asn Pro Ile Leu Ser Lys Gly Thr Ser Gly Lys Trp Asp

1250 1255 1260

Asp Met Glu Ile Tyr Arg Ser Cys Phe Val Tyr Asp Glu Asp Thr

1265 1270 1275

Asn Met Ile Lys Val Trp Tyr Gly Ala Val Ser Gln Asn Pro Gln

1280 1285 1290

Ile Trp Lys Ile Gly Phe Thr Glu Asn Asp Tyr Asp Lys Phe Ile

1295 1300 1305

Glu Gly Leu Thr Gln

1310

<210> 16

<211> 1584

<212> DNA

<213> Clostridium tertium

<400> 16

tcagggcaat attggttagt tttccaacct gataacgatg tactacaaac aaaaactaat 60

ccatcatcca tgaaacaaag tgctaacaat aatccatata attataatat attaccaaat 120

tcatttccta ttggaacagg atataatgct tataaaggcg atgtatcttt ctatgcaacc 180

tttaaagaag caagcagtca agcaattcct caaaattctt gggctctaaa atatgtagat 240

agtgaagaaa ctacaggcga aaatggaaga gctacaaatg cttttgatgg taataataat 300

actatttggc acacaaaata tagtggcgga aacgctgcac caatgccgca tgagattcaa 360

attgatttaa gaggagtata taatataaat caaattaatt atctaccaag acaagatgga 420

ggaaccaatg gtacaataaa ggactatgaa gtttatttaa gtttagatgg agtgaactgg 480

ggacaaccta tatcaaaagg aacctttgaa tcaaactcta cagaaaaaat agtaaaattc 540

aacgaaacaa aatctaggta tgtaaaactt aaagctctgt cagaaattaa taataaacaa 600

tttactacag tagctgattt aaaggtattt ggatgggaga tatccaaaat agaaaaacca 660

ttacaaaatg ctgaaactta tttgaatata ccaacttatg atggattaaa tcaaagtact 720

catccagatg tcaaatattt taaaaatggt tggaatggat ataaatattg gatgataatg 780

actccaaata gaacaggtag ctcagttgct gaaaatcctt caatactagc atctgatgat 840

ggaataaatt gggaggttcc tgcaggtgtt acaaatccta tagctccaat gccacaagta 900

ggacataatt gtgatgttga tatgatatat aatgaagcaa ctgatgagtt atgggtgtac 960

tgggtagaat cagatgatat aacaaaagga tgggttaaat taataaaatc aaaggatgga 1020

gtaaattgga gttctcagca agtggtagtt gatgataata gggcaaaata tagtacttta 1080

tcaccatcta taatattcaa agataataaa tactatatgt ggtcagttaa tacaggaaat 1140

agtggttgga acaatcaaag taataaagtt gaattaagag aatcaagtga cggagtaaat 1200

tggtcaaatc caacagttgt aaacacatta gctcaagatg gttctcaaat atggcatgta 1260

aatgtagaat atataccatc aaaaaacgaa tattgggcta tatatccagc atataaaaat 1320

ggaacaggta gcgataaaac agaattgtat tatgcgaaat caagtgatgg agtaaattgg 1380

acaacttata agaatcctat attatcaaaa ggaacatctg gtaaatggga tgatatggag 1440

atatatagaa gttgttttgt gtacgatgaa gatacaaata tgataaaggt ttggtatgga 1500

gctgtgagtc aaaatccaca aatatggaaa ataggtttta ctgaaaatga ttatgataag 1560

tttattgagg gtttaacaca ataa 1584

<210> 17

<211> 547

<212> PRT

<213> Clostridium tertium

<400> 17

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Ser Gly Gln Tyr Trp Leu Val Phe Gln Pro Asp Asn

20 25 30

Asp Val Leu Gln Thr Lys Thr Asn Pro Ser Ser Met Lys Gln Ser Ala

35 40 45

Asn Asn Asn Pro Tyr Asn Tyr Asn Ile Leu Pro Asn Ser Phe Pro Ile

50 55 60

Gly Thr Gly Tyr Asn Ala Tyr Lys Gly Asp Val Ser Phe Tyr Ala Thr

65 70 75 80

Phe Lys Glu Ala Ser Ser Gln Ala Ile Pro Gln Asn Ser Trp Ala Leu

85 90 95

Lys Tyr Val Asp Ser Glu Glu Thr Thr Gly Glu Asn Gly Arg Ala Thr

100 105 110

Asn Ala Phe Asp Gly Asn Asn Asn Thr Ile Trp His Thr Lys Tyr Ser

115 120 125

Gly Gly Asn Ala Ala Pro Met Pro His Glu Ile Gln Ile Asp Leu Arg

130 135 140

Gly Val Tyr Asn Ile Asn Gln Ile Asn Tyr Leu Pro Arg Gln Asp Gly

145 150 155 160

Gly Thr Asn Gly Thr Ile Lys Asp Tyr Glu Val Tyr Leu Ser Leu Asp

165 170 175

Gly Val Asn Trp Gly Gln Pro Ile Ser Lys Gly Thr Phe Glu Ser Asn

180 185 190

Ser Thr Glu Lys Ile Val Lys Phe Asn Glu Thr Lys Ser Arg Tyr Val

195 200 205

Lys Leu Lys Ala Leu Ser Glu Ile Asn Asn Lys Gln Phe Thr Thr Val

210 215 220

Ala Asp Leu Lys Val Phe Gly Trp Glu Ile Ser Lys Ile Glu Lys Pro

225 230 235 240

Leu Gln Asn Ala Glu Thr Tyr Leu Asn Ile Pro Thr Tyr Asp Gly Leu

245 250 255

Asn Gln Ser Thr His Pro Asp Val Lys Tyr Phe Lys Asn Gly Trp Asn

260 265 270

Gly Tyr Lys Tyr Trp Met Ile Met Thr Pro Asn Arg Thr Gly Ser Ser

275 280 285

Val Ala Glu Asn Pro Ser Ile Leu Ala Ser Asp Asp Gly Ile Asn Trp

290 295 300

Glu Val Pro Ala Gly Val Thr Asn Pro Ile Ala Pro Met Pro Gln Val

305 310 315 320

Gly His Asn Cys Asp Val Asp Met Ile Tyr Asn Glu Ala Thr Asp Glu

325 330 335

Leu Trp Val Tyr Trp Val Glu Ser Asp Asp Ile Thr Lys Gly Trp Val

340 345 350

Lys Leu Ile Lys Ser Lys Asp Gly Val Asn Trp Ser Ser Gln Gln Val

355 360 365

Val Val Asp Asp Asn Arg Ala Lys Tyr Ser Thr Leu Ser Pro Ser Ile

370 375 380

Ile Phe Lys Asp Asn Lys Tyr Tyr Met Trp Ser Val Asn Thr Gly Asn

385 390 395 400

Ser Gly Trp Asn Asn Gln Ser Asn Lys Val Glu Leu Arg Glu Ser Ser

405 410 415

Asp Gly Val Asn Trp Ser Asn Pro Thr Val Val Asn Thr Leu Ala Gln

420 425 430

Asp Gly Ser Gln Ile Trp His Val Asn Val Glu Tyr Ile Pro Ser Lys

435 440 445

Asn Glu Tyr Trp Ala Ile Tyr Pro Ala Tyr Lys Asn Gly Thr Gly Ser

450 455 460

Asp Lys Thr Glu Leu Tyr Tyr Ala Lys Ser Ser Asp Gly Val Asn Trp

465 470 475 480

Thr Thr Tyr Lys Asn Pro Ile Leu Ser Lys Gly Thr Ser Gly Lys Trp

485 490 495

Asp Asp Met Glu Ile Tyr Arg Ser Cys Phe Val Tyr Asp Glu Asp Thr

500 505 510

Asn Met Ile Lys Val Trp Tyr Gly Ala Val Ser Gln Asn Pro Gln Ile

515 520 525

Trp Lys Ile Gly Phe Thr Glu Asn Asp Tyr Asp Lys Phe Ile Glu Gly

530 535 540

Leu Thr Gln

545

<210> 18

<211> 2958

<212> DNA

<213> Clostridium tertium

<400> 18

tataatttaa ttgataatat tagtgttgaa aaattagata ctgatatttc acaagcaaat 60

gaaaatgttt ttttgaatgg aaatggaatt gctttagaag tagataatag aggcgctaca 120

tgtatttatc tagtagatga aaatggagtt aaaacaaaag ctacgacttc tttagataca 180

gcagattttt caggttatcc aataataggt ggacaaaaga taagagattt tgtaattata 240

tcaaaaaatc tagaagaaaa cataaactcg atattaggtg ttggaaatag acttactatt 300

atatctaaaa gttcatctac taatctgata agaaagatag tatttgaaac atctaacagc 360

aatccaggag caatatattc aacagtaagt tataaagcag aaagtaacga tttattagta 420

gatagctttc atgaaaatga gtatacaatg agtttagggc aaggaccttt tcttgcatat 480

caagggtgtg cagatcaaca aggagcaaat actatcgtta atgttactaa tggatataac 540

cataatagtg gacaaaataa ttattctgta ggagttccat ttagttatgt ttataactct 600

gtggggggaa ttggaatagg tgatgcatca acttcaagaa gagaatttaa gttgcctatt 660

ataggaaaag ataatacagt ttcattagga atggagtgga atggacaaac tttaaaaaaa 720

ggtgctgaaa ctgctatagg tacaagtgtt ataactacaa caaatggtga ttattattct 780

gggctaaaga gttacgcaga agttatgaaa gataagggaa tatctgcacc agcttcaata 840

cctgatatag catatgattc tagatgggaa agttggggat tcgaatttga ttttacaata 900

gaaaaaatag ttaataaatt agatgaactt aaagcgatgg ggataaaaca aattactcta 960

gatgatgggt ggtacactta tgctggtgat tggaaattaa gtcctcaaaa gtttccaaat 1020

ggaaatgcag acatgaaata tcttacagat gaaatccata aaagaggaat gacagctatt 1080

ttatggtgga gaccagtaga cggagggata aatagcaaat tagtatctga acatccagag 1140

tggtttatta agaactcaca agggaatatg gttaggttac cagggcctgg aggtggaaat 1200

ggaggaacag caggatatgc attatgtcca aattcagaag gttcaattca acatcataaa 1260

gattttgtaa ctgtggcatt agaagaatgg ggatttgatg gattcaaaga agattatgta 1320

tggggaatac ctaaatgcta tgatagttct cataaacact caagtttatc agatacatta 1380

gaaaatcaat ataaattcta tgaagccata tatgaacagt ccatagcgat aaatccagat 1440

acttttatag aattatgtaa ttgcggaaca cctcaggatt tttattcaac accatatgtg 1500

aaccatgcac caacagcaga tccaatttcg agagtacaaa caagaacaag agtgaaagca 1560

tttaaagcta tatttggaga tgattttcca gtaacaacag atcataattc agtttggtta 1620

ccgtcagcat taggtacagg atcagttatg attactaaac atacaacatt aagtagttca 1680

gatagagaac aatataataa atacttcgga cttgcaagag atttagaatt agcaaaggga 1740

gaatttatag gaaacttata taaatacgga atagatccat tagagtcata tgttataaga 1800

aaaggagaag atatttatta ttcattctac aaagataatt ctagttattc aggaaatata 1860

gaaataaagg ggttagacag taacgccaca tatagaattg aagattatgt taacaataga 1920

gttattgcta gaggagtaaa gggaccaaca gcgactataa atacaagctt tactgataat 1980

ttattagtta gagcaatacc agatgataca ccagcagagg ttactacatt tgatgttgga 2040

aataatacaa tattatcatc aacagatagt ggaaattcta aatatttaaa tgctgtttct 2100

actacattag aaaagacagc aacaatagat agtttaagta tttatatagg aaataattca 2160

gaaaatggca aactacaaat tgctatttat gacgataata acgggaaacc tggtactaaa 2220

aaagcttacg tagaagagtt tgttcctact aaaaatagtt ggaatacaaa gaaggttgta 2280

aattctgtta cattaccttc agggcaatat tggttagttt tccaacctga taacgatgta 2340

ctacaaacaa aaactaatcc atcatccatg aaacaaagtg ctaacaataa tccatataat 2400

tataatatat taccaaattc atttcctatt ggaacaggat ataatgctta taaaggcgat 2460

gtatctttct atgcaacctt taaagaagca agcagtcaag caattcctca aaattcttgg 2520

gctctaaaat atgtagatag tgaagaaact acaggcgaaa atggaagagc tacaaatgct 2580

tttgatggta ataataatac tatttggcac acaaaatata gtggcggaaa cgctgcacca 2640

atgccgcatg agattcaaat tgatttaaga ggagtatata atataaatca aattaattat 2700

ctaccaagac aagatggagg aaccaatggt acaataaagg actatgaagt ttatttaagt 2760

ttagatggag tgaactgggg acaacctata tcaaaaggaa cctttgaatc aaactctaca 2820

gaaaaaatag taaaattcaa cgaaacaaaa tctaggtatg taaaacttaa agctctgtca 2880

gaaattaata ataaacaatt tactacagta gctgatttaa aggtatttgg atgggagata 2940

tccaaaatag aaaaataa 2958

<210> 19

<211> 1005

<212> PRT

<213> Clostridium tertium

<400> 19

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Tyr Asn Leu Ile Asp Asn Ile Ser Val Glu Lys Leu

20 25 30

Asp Thr Asp Ile Ser Gln Ala Asn Glu Asn Val Phe Leu Asn Gly Asn

35 40 45

Gly Ile Ala Leu Glu Val Asp Asn Arg Gly Ala Thr Cys Ile Tyr Leu

50 55 60

Val Asp Glu Asn Gly Val Lys Thr Lys Ala Thr Thr Ser Leu Asp Thr

65 70 75 80

Ala Asp Phe Ser Gly Tyr Pro Ile Ile Gly Gly Gln Lys Ile Arg Asp

85 90 95

Phe Val Ile Ile Ser Lys Asn Leu Glu Glu Asn Ile Asn Ser Ile Leu

100 105 110

Gly Val Gly Asn Arg Leu Thr Ile Ile Ser Lys Ser Ser Ser Thr Asn

115 120 125

Leu Ile Arg Lys Ile Val Phe Glu Thr Ser Asn Ser Asn Pro Gly Ala

130 135 140

Ile Tyr Ser Thr Val Ser Tyr Lys Ala Glu Ser Asn Asp Leu Leu Val

145 150 155 160

Asp Ser Phe His Glu Asn Glu Tyr Thr Met Ser Leu Gly Gln Gly Pro

165 170 175

Phe Leu Ala Tyr Gln Gly Cys Ala Asp Gln Gln Gly Ala Asn Thr Ile

180 185 190

Val Asn Val Thr Asn Gly Tyr Asn His Asn Ser Gly Gln Asn Asn Tyr

195 200 205

Ser Val Gly Val Pro Phe Ser Tyr Val Tyr Asn Ser Val Gly Gly Ile

210 215 220

Gly Ile Gly Asp Ala Ser Thr Ser Arg Arg Glu Phe Lys Leu Pro Ile

225 230 235 240

Ile Gly Lys Asp Asn Thr Val Ser Leu Gly Met Glu Trp Asn Gly Gln

245 250 255

Thr Leu Lys Lys Gly Ala Glu Thr Ala Ile Gly Thr Ser Val Ile Thr

260 265 270

Thr Thr Asn Gly Asp Tyr Tyr Ser Gly Leu Lys Ser Tyr Ala Glu Val

275 280 285

Met Lys Asp Lys Gly Ile Ser Ala Pro Ala Ser Ile Pro Asp Ile Ala

290 295 300

Tyr Asp Ser Arg Trp Glu Ser Trp Gly Phe Glu Phe Asp Phe Thr Ile

305 310 315 320

Glu Lys Ile Val Asn Lys Leu Asp Glu Leu Lys Ala Met Gly Ile Lys

325 330 335

Gln Ile Thr Leu Asp Asp Gly Trp Tyr Thr Tyr Ala Gly Asp Trp Lys

340 345 350

Leu Ser Pro Gln Lys Phe Pro Asn Gly Asn Ala Asp Met Lys Tyr Leu

355 360 365

Thr Asp Glu Ile His Lys Arg Gly Met Thr Ala Ile Leu Trp Trp Arg

370 375 380

Pro Val Asp Gly Gly Ile Asn Ser Lys Leu Val Ser Glu His Pro Glu

385 390 395 400

Trp Phe Ile Lys Asn Ser Gln Gly Asn Met Val Arg Leu Pro Gly Pro

405 410 415

Gly Gly Gly Asn Gly Gly Thr Ala Gly Tyr Ala Leu Cys Pro Asn Ser

420 425 430

Glu Gly Ser Ile Gln His His Lys Asp Phe Val Thr Val Ala Leu Glu

435 440 445

Glu Trp Gly Phe Asp Gly Phe Lys Glu Asp Tyr Val Trp Gly Ile Pro

450 455 460

Lys Cys Tyr Asp Ser Ser His Lys His Ser Ser Leu Ser Asp Thr Leu

465 470 475 480

Glu Asn Gln Tyr Lys Phe Tyr Glu Ala Ile Tyr Glu Gln Ser Ile Ala

485 490 495

Ile Asn Pro Asp Thr Phe Ile Glu Leu Cys Asn Cys Gly Thr Pro Gln

500 505 510

Asp Phe Tyr Ser Thr Pro Tyr Val Asn His Ala Pro Thr Ala Asp Pro

515 520 525

Ile Ser Arg Val Gln Thr Arg Thr Arg Val Lys Ala Phe Lys Ala Ile

530 535 540

Phe Gly Asp Asp Phe Pro Val Thr Thr Asp His Asn Ser Val Trp Leu

545 550 555 560

Pro Ser Ala Leu Gly Thr Gly Ser Val Met Ile Thr Lys His Thr Thr

565 570 575

Leu Ser Ser Ser Asp Arg Glu Gln Tyr Asn Lys Tyr Phe Gly Leu Ala

580 585 590

Arg Asp Leu Glu Leu Ala Lys Gly Glu Phe Ile Gly Asn Leu Tyr Lys

595 600 605

Tyr Gly Ile Asp Pro Leu Glu Ser Tyr Val Ile Arg Lys Gly Glu Asp

610 615 620

Ile Tyr Tyr Ser Phe Tyr Lys Asp Asn Ser Ser Tyr Ser Gly Asn Ile

625 630 635 640

Glu Ile Lys Gly Leu Asp Ser Asn Ala Thr Tyr Arg Ile Glu Asp Tyr

645 650 655

Val Asn Asn Arg Val Ile Ala Arg Gly Val Lys Gly Pro Thr Ala Thr

660 665 670

Ile Asn Thr Ser Phe Thr Asp Asn Leu Leu Val Arg Ala Ile Pro Asp

675 680 685

Asp Thr Pro Ala Glu Val Thr Thr Phe Asp Val Gly Asn Asn Thr Ile

690 695 700

Leu Ser Ser Thr Asp Ser Gly Asn Ser Lys Tyr Leu Asn Ala Val Ser

705 710 715 720

Thr Thr Leu Glu Lys Thr Ala Thr Ile Asp Ser Leu Ser Ile Tyr Ile

725 730 735

Gly Asn Asn Ser Glu Asn Gly Lys Leu Gln Ile Ala Ile Tyr Asp Asp

740 745 750

Asn Asn Gly Lys Pro Gly Thr Lys Lys Ala Tyr Val Glu Glu Phe Val

755 760 765

Pro Thr Lys Asn Ser Trp Asn Thr Lys Lys Val Val Asn Ser Val Thr

770 775 780

Leu Pro Ser Gly Gln Tyr Trp Leu Val Phe Gln Pro Asp Asn Asp Val

785 790 795 800

Leu Gln Thr Lys Thr Asn Pro Ser Ser Met Lys Gln Ser Ala Asn Asn

805 810 815

Asn Pro Tyr Asn Tyr Asn Ile Leu Pro Asn Ser Phe Pro Ile Gly Thr

820 825 830

Gly Tyr Asn Ala Tyr Lys Gly Asp Val Ser Phe Tyr Ala Thr Phe Lys

835 840 845

Glu Ala Ser Ser Gln Ala Ile Pro Gln Asn Ser Trp Ala Leu Lys Tyr

850 855 860

Val Asp Ser Glu Glu Thr Thr Gly Glu Asn Gly Arg Ala Thr Asn Ala

865 870 875 880

Phe Asp Gly Asn Asn Asn Thr Ile Trp His Thr Lys Tyr Ser Gly Gly

885 890 895

Asn Ala Ala Pro Met Pro His Glu Ile Gln Ile Asp Leu Arg Gly Val

900 905 910

Tyr Asn Ile Asn Gln Ile Asn Tyr Leu Pro Arg Gln Asp Gly Gly Thr

915 920 925

Asn Gly Thr Ile Lys Asp Tyr Glu Val Tyr Leu Ser Leu Asp Gly Val

930 935 940

Asn Trp Gly Gln Pro Ile Ser Lys Gly Thr Phe Glu Ser Asn Ser Thr

945 950 955 960

Glu Lys Ile Val Lys Phe Asn Glu Thr Lys Ser Arg Tyr Val Lys Leu

965 970 975

Lys Ala Leu Ser Glu Ile Asn Asn Lys Gln Phe Thr Thr Val Ala Asp

980 985 990

Leu Lys Val Phe Gly Trp Glu Ile Ser Lys Ile Glu Lys

995 1000 1005

<210> 20

<211> 3786

<212> DNA

<213> Robinsoniella peoriensis

<400> 20

gggaacggat tagaggtgaa agcctcgcca agggaggtgg cacaaataac cggaaacggg 60

gtatcggtga cgttttttca ggaagatggc acggtgcagt tatcctgtat agaggatgat 120

ggcaatactg cttttatgac caggaactca gaggtctctt atccggtggt gggtggggag 180

gaagtaacag acttttcaga ctttcaatgt gaagtacagg aaaacgtaac cggagctgcg 240

ggagccggca gccggatgac aatcacctcc atttccagcg gcagggggat tcagcggtcg 300

gtagtcattg agacggtaga tgaggtaaaa ggcctgctcc atatcagcag ttcttatagg 360

gcagaagaag aggtagatgc agacgaattt attgacagca gattcagcct ggataatccc 420

tcagatacag tctggagtta caatggcggc ggtgaggggg cccagagccg atacgatact 480

ctacagaaaa tagatctgtc ggatggtgaa agcttctata gggagaactt acagaatcaa 540

actgcggcag gtattccggt ggcggatatc tacgggaaag acgggggtat tacggtgggt 600

gatgccagtg tgacccggcg acagctttcc actccggtaa acgagaggaa tggtaccgct 660

tatgtgtccg tgaaacatcc aggtgcagtt attacccaaa gggaaacaga aatcagccag 720

agctttgtca atgtacacag aggcgactat tattcggggc tgcggggtta tgccgatggt 780

atgaagcaga taggatttac cacactctcc cgggaacaga ttcctgaaag cagctatgat 840

ctccgctggg agagctgggg atgggaattt gactggacag tggaactgat tatcaataag 900

ctggacgagt taaaagagat gggaatcaaa cagattaccc tggatgacgg ctggtataat 960

gccgcaggag aatgggggct gaacaactgg aagcttccta atggtgcttt ggacatgcgg 1020

catctgactg atgcaattca tgaaaggggg atgactgcag tattgtggtg gcgtccctgt 1080

gacggtggaa gggaagacag cgcattattt aaagagcatc cagagtattt tataaaaaac 1140

caggacggaa gctttgggaa gctggcagga ccgggacagt ggaacagttt tctgggaagc 1200

tgcggttatg cgctgtgtcc tttgtcagaa ggggcagtac agagccaggt tgattttatt 1260

aaccgtgcta tgaatgaatg gggatttgat ggatttaaaa gtgattatgt atggagcctt 1320

ccaaagtgct acagtcagga ccatcaccat gaatacccgg aagaatccac agaacagcag 1380

gctgtgttct accgggcagt ttatgaggct atgacagaca atgacccgaa tgcatttcac 1440

cttctatgca actgcggaac gccacaggat tattattctc tgccctatgt aacccaggtg 1500

cctactgccg atcccacttc tgtggatcag acaaggagaa gggtaaaggc atataaagca 1560

ctatgcggtg attatttccc tgttacgaca gatcataatg aagtctggta tccttcaacc 1620

ataggaacgg gagccatact gattgaaaaa cgtgacttgt caggctggga agaggaggag 1680

tatgcaaaat ggcttaaaat tgctcaggaa aaccaattgc ataaagggac atttattggg 1740

gatttgtaca gttacggata tgacccttat gaaacctata cggtgtataa agacggaatc 1800

atgtattatg cattctataa agacggaaac cggtaccgtc cgtccggtaa cccggatatc 1860

gaattaaaag ggctggaaga cggaaagctg taccgcatcg tagattatgt aaataatcag 1920

gtagttgcca caaatgtaac cagtagcaat gctgtatttt cttacccttt cagcgattac 1980

ttgctggtaa aagcagtaga aatcagcgaa ccggatacgg atggacctgg acctgtaccg 2040

gatcctgagg gggcggtaac agtagaggaa aatgatcctg aactggtata tacaggggat 2100

tgggtaaggg aagaaaatga cggataccat ggaggaggag cccgttatac aaaagaagca 2160

gaagcttctg tagaattggc attctatgga acaggtgctg cctggtatgg acagcacgac 2220

gttaactttg gtagtgcacg gatatatata gacggaacct atgtcaagac cgtatcatgc 2280

atgggagaac ctggaataaa tattaaattg tttgaaatca gcggcttgga cttggcttcc 2340

cacaggatta aaatagaatg tgagacaccg gtaattgata ttgacaggct gacttacatc 2400

aaaggagaag aagttcctgc taaagtaatg acggcggacc tccgggcttt gactgttata 2460

gcaaaccaat acgatatgaa cagttttgca gatggcaatt acaaagacca gctgggggta 2520

tccttagttc gtgccaacca gcttctggca gcggatgatg taacccaggg ggctgtaaat 2580

gaagaacaga aataccttct gaatgccatg ctgaaaataa gaaaaaaagt tgataagagt 2640

tggatcgggc ttcccggacc aatcccgcag gatatacaga cagaaaatat cagcagagat 2700

aaccttgcta aagtaatatc ttatactggg cagttggaca gagatgagat tattcctgcc 2760

ataaaagaac agctgaacga ttcttatgat aaggctgtct ccatagcaga acgccaggat 2820

gcatcccagc cggaaataga cagagcgtgg gcagagttaa tgaatgcagt gcaatatagc 2880

agctatatca ggggatcaaa agaggaactg ttatcacttc tggatgaata cggaaaggta 2940

gataccaccg tttataaaga cgctgcttta tttatagaat ccttagaagc cgctaaaaag 3000

gtgtatcagg atgaaaatgc aatggatggg gagatcagtg attgtatcaa acaattgcga 3060

gatgcaaaag atcagctaca actaaaggat ccggtagatc cgccgaaacc cgatccggac 3120

cccgatccaa agcctgatcc aacaccagac ccgggaccag atccaaagcc cgatccaaca 3180

cctgacccga cgccagaccc aaagcccaat ccaacaccga cgcccgatcc aacaccagag 3240

ccagctctaa aaaagccgga acaggtatct ggtttgaagt cgaaagcgga gactgattat 3300

ctgacggttt cctggaagaa attgaataat gctgaatcct ataaggtgta tatttataaa 3360

agcggcaaat ggcgcctggc tggaaaaact acaaagacat ccataaagat aaaaaaactg 3420

gtttcgggaa cgaaatacac cgtaaaagtt gctgcggtca ataaagcagg gcaggggaaa 3480

tattcatcac aggtgtatac ggcagcaaag cccaaaaaag tcaaattaaa atccgtcagc 3540

aggtaccgca catcaaaagt aaagttaaac tatggaaaag taaaagcagg cggatatgaa 3600

atatggatga agaatggaaa gggttcttat aagaaggcag ccaccagtac gaagacaaca 3660

gccataaaga gcggattaaa aaaaggaaaa acatattact ttaaagtcag ggcttatgtt 3720

aaaaataaaa atcaggtgat ttacggcagc ttttccaata taaagaaata caaaatggta 3780

ttatga 3786

<210> 21

<211> 1281

<212> PRT

<213> Robinsoniella peoriensis

<400> 21

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Gly Asn Gly Leu Glu Val Lys Ala Ser Pro Arg Glu

20 25 30

Val Ala Gln Ile Thr Gly Asn Gly Val Ser Val Thr Phe Phe Gln Glu

35 40 45

Asp Gly Thr Val Gln Leu Ser Cys Ile Glu Asp Asp Gly Asn Thr Ala

50 55 60

Phe Met Thr Arg Asn Ser Glu Val Ser Tyr Pro Val Val Gly Gly Glu

65 70 75 80

Glu Val Thr Asp Phe Ser Asp Phe Gln Cys Glu Val Gln Glu Asn Val

85 90 95

Thr Gly Ala Ala Gly Ala Gly Ser Arg Met Thr Ile Thr Ser Ile Ser

100 105 110

Ser Gly Arg Gly Ile Gln Arg Ser Val Val Ile Glu Thr Val Asp Glu

115 120 125

Val Lys Gly Leu Leu His Ile Ser Ser Ser Tyr Arg Ala Glu Glu Glu

130 135 140

Val Asp Ala Asp Glu Phe Ile Asp Ser Arg Phe Ser Leu Asp Asn Pro

145 150 155 160

Ser Asp Thr Val Trp Ser Tyr Asn Gly Gly Gly Glu Gly Ala Gln Ser

165 170 175

Arg Tyr Asp Thr Leu Gln Lys Ile Asp Leu Ser Asp Gly Glu Ser Phe

180 185 190

Tyr Arg Glu Asn Leu Gln Asn Gln Thr Ala Ala Gly Ile Pro Val Ala

195 200 205

Asp Ile Tyr Gly Lys Asp Gly Gly Ile Thr Val Gly Asp Ala Ser Val

210 215 220

Thr Arg Arg Gln Leu Ser Thr Pro Val Asn Glu Arg Asn Gly Thr Ala

225 230 235 240

Tyr Val Ser Val Lys His Pro Gly Ala Val Ile Thr Gln Arg Glu Thr

245 250 255

Glu Ile Ser Gln Ser Phe Val Asn Val His Arg Gly Asp Tyr Tyr Ser

260 265 270

Gly Leu Arg Gly Tyr Ala Asp Gly Met Lys Gln Ile Gly Phe Thr Thr

275 280 285

Leu Ser Arg Glu Gln Ile Pro Glu Ser Ser Tyr Asp Leu Arg Trp Glu

290 295 300

Ser Trp Gly Trp Glu Phe Asp Trp Thr Val Glu Leu Ile Ile Asn Lys

305 310 315 320

Leu Asp Glu Leu Lys Glu Met Gly Ile Lys Gln Ile Thr Leu Asp Asp

325 330 335

Gly Trp Tyr Asn Ala Ala Gly Glu Trp Gly Leu Asn Asn Trp Lys Leu

340 345 350

Pro Asn Gly Ala Leu Asp Met Arg His Leu Thr Asp Ala Ile His Glu

355 360 365

Arg Gly Met Thr Ala Val Leu Trp Trp Arg Pro Cys Asp Gly Gly Arg

370 375 380

Glu Asp Ser Ala Leu Phe Lys Glu His Pro Glu Tyr Phe Ile Lys Asn

385 390 395 400

Gln Asp Gly Ser Phe Gly Lys Leu Ala Gly Pro Gly Gln Trp Asn Ser

405 410 415

Phe Leu Gly Ser Cys Gly Tyr Ala Leu Cys Pro Leu Ser Glu Gly Ala

420 425 430

Val Gln Ser Gln Val Asp Phe Ile Asn Arg Ala Met Asn Glu Trp Gly

435 440 445

Phe Asp Gly Phe Lys Ser Asp Tyr Val Trp Ser Leu Pro Lys Cys Tyr

450 455 460

Ser Gln Asp His His His Glu Tyr Pro Glu Glu Ser Thr Glu Gln Gln

465 470 475 480

Ala Val Phe Tyr Arg Ala Val Tyr Glu Ala Met Thr Asp Asn Asp Pro

485 490 495

Asn Ala Phe His Leu Leu Cys Asn Cys Gly Thr Pro Gln Asp Tyr Tyr

500 505 510

Ser Leu Pro Tyr Val Thr Gln Val Pro Thr Ala Asp Pro Thr Ser Val

515 520 525

Asp Gln Thr Arg Arg Arg Val Lys Ala Tyr Lys Ala Leu Cys Gly Asp

530 535 540

Tyr Phe Pro Val Thr Thr Asp His Asn Glu Val Trp Tyr Pro Ser Thr

545 550 555 560

Ile Gly Thr Gly Ala Ile Leu Ile Glu Lys Arg Asp Leu Ser Gly Trp

565 570 575

Glu Glu Glu Glu Tyr Ala Lys Trp Leu Lys Ile Ala Gln Glu Asn Gln

580 585 590

Leu His Lys Gly Thr Phe Ile Gly Asp Leu Tyr Ser Tyr Gly Tyr Asp

595 600 605

Pro Tyr Glu Thr Tyr Thr Val Tyr Lys Asp Gly Ile Met Tyr Tyr Ala

610 615 620

Phe Tyr Lys Asp Gly Asn Arg Tyr Arg Pro Ser Gly Asn Pro Asp Ile

625 630 635 640

Glu Leu Lys Gly Leu Glu Asp Gly Lys Leu Tyr Arg Ile Val Asp Tyr

645 650 655

Val Asn Asn Gln Val Val Ala Thr Asn Val Thr Ser Ser Asn Ala Val

660 665 670

Phe Ser Tyr Pro Phe Ser Asp Tyr Leu Leu Val Lys Ala Val Glu Ile

675 680 685

Ser Glu Pro Asp Thr Asp Gly Pro Gly Pro Val Pro Asp Pro Glu Gly

690 695 700

Ala Val Thr Val Glu Glu Asn Asp Pro Glu Leu Val Tyr Thr Gly Asp

705 710 715 720

Trp Val Arg Glu Glu Asn Asp Gly Tyr His Gly Gly Gly Ala Arg Tyr

725 730 735

Thr Lys Glu Ala Glu Ala Ser Val Glu Leu Ala Phe Tyr Gly Thr Gly

740 745 750

Ala Ala Trp Tyr Gly Gln His Asp Val Asn Phe Gly Ser Ala Arg Ile

755 760 765

Tyr Ile Asp Gly Thr Tyr Val Lys Thr Val Ser Cys Met Gly Glu Pro

770 775 780

Gly Ile Asn Ile Lys Leu Phe Glu Ile Ser Gly Leu Asp Leu Ala Ser

785 790 795 800

His Arg Ile Lys Ile Glu Cys Glu Thr Pro Val Ile Asp Ile Asp Arg

805 810 815

Leu Thr Tyr Ile Lys Gly Glu Glu Val Pro Ala Lys Val Met Thr Ala

820 825 830

Asp Leu Arg Ala Leu Thr Val Ile Ala Asn Gln Tyr Asp Met Asn Ser

835 840 845

Phe Ala Asp Gly Asn Tyr Lys Asp Gln Leu Gly Val Ser Leu Val Arg

850 855 860

Ala Asn Gln Leu Leu Ala Ala Asp Asp Val Thr Gln Gly Ala Val Asn

865 870 875 880

Glu Glu Gln Lys Tyr Leu Leu Asn Ala Met Leu Lys Ile Arg Lys Lys

885 890 895

Val Asp Lys Ser Trp Ile Gly Leu Pro Gly Pro Ile Pro Gln Asp Ile

900 905 910

Gln Thr Glu Asn Ile Ser Arg Asp Asn Leu Ala Lys Val Ile Ser Tyr

915 920 925

Thr Gly Gln Leu Asp Arg Asp Glu Ile Ile Pro Ala Ile Lys Glu Gln

930 935 940

Leu Asn Asp Ser Tyr Asp Lys Ala Val Ser Ile Ala Glu Arg Gln Asp

945 950 955 960

Ala Ser Gln Pro Glu Ile Asp Arg Ala Trp Ala Glu Leu Met Asn Ala

965 970 975

Val Gln Tyr Ser Ser Tyr Ile Arg Gly Ser Lys Glu Glu Leu Leu Ser

980 985 990

Leu Leu Asp Glu Tyr Gly Lys Val Asp Thr Thr Val Tyr Lys Asp Ala

995 1000 1005

Ala Leu Phe Ile Glu Ser Leu Glu Ala Ala Lys Lys Val Tyr Gln

1010 1015 1020

Asp Glu Asn Ala Met Asp Gly Glu Ile Ser Asp Cys Ile Lys Gln

1025 1030 1035

Leu Arg Asp Ala Lys Asp Gln Leu Gln Leu Lys Asp Pro Val Asp

1040 1045 1050

Pro Pro Lys Pro Asp Pro Asp Pro Asp Pro Lys Pro Asp Pro Thr

1055 1060 1065

Pro Asp Pro Gly Pro Asp Pro Lys Pro Asp Pro Thr Pro Asp Pro

1070 1075 1080

Thr Pro Asp Pro Lys Pro Asn Pro Thr Pro Thr Pro Asp Pro Thr

1085 1090 1095

Pro Glu Pro Ala Leu Lys Lys Pro Glu Gln Val Ser Gly Leu Lys

1100 1105 1110

Ser Lys Ala Glu Thr Asp Tyr Leu Thr Val Ser Trp Lys Lys Leu

1115 1120 1125

Asn Asn Ala Glu Ser Tyr Lys Val Tyr Ile Tyr Lys Ser Gly Lys

1130 1135 1140

Trp Arg Leu Ala Gly Lys Thr Thr Lys Thr Ser Ile Lys Ile Lys

1145 1150 1155

Lys Leu Val Ser Gly Thr Lys Tyr Thr Val Lys Val Ala Ala Val

1160 1165 1170

Asn Lys Ala Gly Gln Gly Lys Tyr Ser Ser Gln Val Tyr Thr Ala

1175 1180 1185

Ala Lys Pro Lys Lys Val Lys Leu Lys Ser Val Ser Arg Tyr Arg

1190 1195 1200

Thr Ser Lys Val Lys Leu Asn Tyr Gly Lys Val Lys Ala Gly Gly

1205 1210 1215

Tyr Glu Ile Trp Met Lys Asn Gly Lys Gly Ser Tyr Lys Lys Ala

1220 1225 1230

Ala Thr Ser Thr Lys Thr Thr Ala Ile Lys Ser Gly Leu Lys Lys

1235 1240 1245

Gly Lys Thr Tyr Tyr Phe Lys Val Arg Ala Tyr Val Lys Asn Lys

1250 1255 1260

Asn Gln Val Ile Tyr Gly Ser Phe Ser Asn Ile Lys Lys Tyr Lys

1265 1270 1275

Met Val Leu

1280

<210> 22

<211> 1347

<212> DNA

<213> Ruthenibacterium lactatiformans

<400> 22

gaagaaaccg atttgcttgt aaacggaggt tttgagaccg gcgacagcac cggatggaat 60

tggttcaata acgccgttgt tgacagcgct gctccgcata gcggaaacta ttgtgctaaa 120

gtagccaaaa acagcagtta tgagcaagtt gttacggtat ctccggatac gaaatatgtt 180

ttaacagggt gggcaaaatc tgagggcagt tccgttatga cgctgggcgt aaaaaattac 240

ggtgggcagg aaactttttc ggctacgctt tcagccgact atcagcagct ggcggttact 300

ttcacaaccg ggcccaatgc gcaaacagcg actatatatg gatatcgaca gaatagtggt 360

tccggtgcag gctatttcga cgatgtagaa cttacagcgg tgcaagattt tgctccatat 420

cagccgttgg caaatgccat agcgcctcaa gcaattccta cctatgacgg cgccaaccag 480

cctacacatc cctcggtggt gaaatttgaa cagccttgga atggttatct gtattggatg 540

gcaatgacac cttatccctt caatgatggg agctacgaaa acccatcgat tgttgcgtca 600

aacgatggag aaaattggat tgtgccagaa ggggtctcga atcctttggc cggcacgcca 660

agtccgggcc acaattgtga cgtggatctt gtatatgttc cagcctcgga tgaattgcgg 720

atgtactacg tagaggcaga tgatatcatc agctcaaggg taaaaatgat aagttcccgt 780

gacggtgtac actggagcga gccgcaggtc gtaatgcagg atctggtaag gaaatacagt 840

attctatcgc cgtctattga gattctgcca gatggcacct atatgatgtg gtatgtggat 900

acggggaatg caggatggaa tagccagaat aaccaagtaa aatatcgtac atctgcggat 960

ggaatcaaat ggtcaggcgc agtcacctgt acggattttg tacaacctgg atatcaaata 1020

tggcacatcg atgtacatta tgacacatca agcggagctt actatgcagt ttatccggct 1080

tatccgaatg gcaccgattg cgaccactgc aatttgtttt tcgcagtgaa tcggacagga 1140

aaacagtggg aaacttttag ccggccaatt ttgaagccgt caacggaagg cggctgggat 1200

gatttctgca tttaccggtc ctctatgctg attgacgacg gaatgttgaa agtgtggtac 1260

ggagcaaaaa agcaagagga ttcttcctgg catactgggc taaccatgcg tgatttttct 1320

gaatttatga aaatattgga acgctaa 1347

<210> 23

<211> 468

<212> PRT

<213> Ruthenibacterium lactatiformans

<400> 23

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Glu Glu Thr Asp Leu Leu Val Asn Gly Gly Phe Glu

20 25 30

Thr Gly Asp Ser Thr Gly Trp Asn Trp Phe Asn Asn Ala Val Val Asp

35 40 45

Ser Ala Ala Pro His Ser Gly Asn Tyr Cys Ala Lys Val Ala Lys Asn

50 55 60

Ser Ser Tyr Glu Gln Val Val Thr Val Ser Pro Asp Thr Lys Tyr Val

65 70 75 80

Leu Thr Gly Trp Ala Lys Ser Glu Gly Ser Ser Val Met Thr Leu Gly

85 90 95

Val Lys Asn Tyr Gly Gly Gln Glu Thr Phe Ser Ala Thr Leu Ser Ala

100 105 110

Asp Tyr Gln Gln Leu Ala Val Thr Phe Thr Thr Gly Pro Asn Ala Gln

115 120 125

Thr Ala Thr Ile Tyr Gly Tyr Arg Gln Asn Ser Gly Ser Gly Ala Gly

130 135 140

Tyr Phe Asp Asp Val Glu Leu Thr Ala Val Gln Asp Phe Ala Pro Tyr

145 150 155 160

Gln Pro Leu Ala Asn Ala Ile Ala Pro Gln Ala Ile Pro Thr Tyr Asp

165 170 175

Gly Ala Asn Gln Pro Thr His Pro Ser Val Val Lys Phe Glu Gln Pro

180 185 190

Trp Asn Gly Tyr Leu Tyr Trp Met Ala Met Thr Pro Tyr Pro Phe Asn

195 200 205

Asp Gly Ser Tyr Glu Asn Pro Ser Ile Val Ala Ser Asn Asp Gly Glu

210 215 220

Asn Trp Ile Val Pro Glu Gly Val Ser Asn Pro Leu Ala Gly Thr Pro

225 230 235 240

Ser Pro Gly His Asn Cys Asp Val Asp Leu Val Tyr Val Pro Ala Ser

245 250 255

Asp Glu Leu Arg Met Tyr Tyr Val Glu Ala Asp Asp Ile Ile Ser Ser

260 265 270

Arg Val Lys Met Ile Ser Ser Arg Asp Gly Val His Trp Ser Glu Pro

275 280 285

Gln Val Val Met Gln Asp Leu Val Arg Lys Tyr Ser Ile Leu Ser Pro

290 295 300

Ser Ile Glu Ile Leu Pro Asp Gly Thr Tyr Met Met Trp Tyr Val Asp

305 310 315 320

Thr Gly Asn Ala Gly Trp Asn Ser Gln Asn Asn Gln Val Lys Tyr Arg

325 330 335

Thr Ser Ala Asp Gly Ile Lys Trp Ser Gly Ala Val Thr Cys Thr Asp

340 345 350

Phe Val Gln Pro Gly Tyr Gln Ile Trp His Ile Asp Val His Tyr Asp

355 360 365

Thr Ser Ser Gly Ala Tyr Tyr Ala Val Tyr Pro Ala Tyr Pro Asn Gly

370 375 380

Thr Asp Cys Asp His Cys Asn Leu Phe Phe Ala Val Asn Arg Thr Gly

385 390 395 400

Lys Gln Trp Glu Thr Phe Ser Arg Pro Ile Leu Lys Pro Ser Thr Glu

405 410 415

Gly Gly Trp Asp Asp Phe Cys Ile Tyr Arg Ser Ser Met Leu Ile Asp

420 425 430

Asp Gly Met Leu Lys Val Trp Tyr Gly Ala Lys Lys Gln Glu Asp Ser

435 440 445

Ser Trp His Thr Gly Leu Thr Met Arg Asp Phe Ser Glu Phe Met Lys

450 455 460

Ile Leu Glu Arg

465

<210> 24

<211> 5277

<212> DNA

<213> Robinsoniella peoriensis

<400> 24

tcaccattga gcgctgcggc agaaagtggc acaggaacca gattagtgaa agggcaaacg 60

gggtatttga cagaggaaca ggctatccgg aaccaggagc agacaaccga agaaagggag 120

cagaagttaa ccggggaaga gacagcagag gttttgatgg aaggtacaaa agacagcggg 180

attgtacaga cagaagaagt acagacaaaa gaaatgcaga cagaagatgc gcagacagaa 240

gaagtacaga cagaagaaat gcagacagaa gatgcgcaga caaaagaagt acagacagaa 300

gaaatgcaga cagaagatgc gcagacagaa gaagtacaga caaaagaaga accggcagaa 360

gaaacacaca tgaaagaaat acagacgcaa gggacaaaga aagcgtcaga taggaacgga 420

aaggcaaggg taactgaaat tctggaagat gcccaggatc cagcaaaccg gattgtgtat 480

ctgtcagacc tgcaatggaa gtcagaaaat catacagtag atagcgagct gcctaccaga 540

aaggataagt cctttggcgg cggaaaaatt acgctaaaag tggatggaac ggtaacagaa 600

tttgataagg ggattggaac acagacagat tccaccattg tgtacgatct ggagggaaag 660

ggatatacaa agtttgaaac ttacgtgggt gtagactaca gccagaaaga aaacattccg 720

ggggaagtct gcgacgtaaa attcagggtg aaaattgatg acaagattgt atcagaaacc 780

ggtgtactgg atccgctttc gaatgcggtt aagatttctg ttaacatacc cgatacagcc 840

aaaactttaa cattatacgc ggataaagta acggaaactt ggtctgatca cgccaattgg 900

gcagatgcaa aattttatca ggcactgccg gaacccgaaa atgttgcatt caaaaaaacg 960

gtagtgacac gaaagacatc agataattcg gaggctcctg ttaatccgga ttcagcagtt 1020

aacagttcta aggctgttga cggtgttatt gacagctcca gttattttga ttttggagat 1080

caggcaaata gcggagccgt aagggagtca ctctatatgg aggtagattt aaaagggagc 1140

tatttactgt ccgatataca actgtggaga tactggaaag atggcagaac ttatgcagct 1200

actgcaattg tagtagctga ggatgagaac tttgaaaatg cagcagttat ctataactcg 1260

gatacgacgg gagaaataca tcacctggga gcaggaagtg atatgctcta tgcagaaaca 1320

gaaagtggca agacatttcc ggtaccggaa aatacaaaag caaggtatat cagagtttat 1380

acatatggtg ttaatgggac atcaggcgta acaaatcaca ttgtagaatt aaaggtgaat 1440

gcttacgtat ttggagatga aatcttaccg gaaaagccgg atgacagcaa gattttccca 1500

aatgcagtta atccgctgaa gctacaggga ccgggcacga atgatcaggt aacccacccg 1560

gatgttacgg tgtttgatga gccgtggaat gggtataaat actggatggc atatacaccg 1620

aataaaccgg gaagttccta ttttgaaaat ccctgtatag ctgcatccaa cgatggcgta 1680

aactgggagt ttcctgccca gaaccctgta cagccgcgct atgacagtga aatagaaaat 1740

caaaatgaac ataactgtga taccgatatt gtatatgacc cggtaaatga ccggttgatt 1800

atgtactggg aatgggcaca ggatgaggcg gttaatggta aaacacatcg ttctgaaatc 1860

agataccgtg tttcttatga tgggattaac tggggagtgg aagacaaaac tggtgttttg 1920

atgactggac caacggatca tggctgcgcc attgccacag aaggcgaaag atattcagac 1980

ctttctccaa ccgtagtata tgataaaaca gaaaaaatct acaaaatgtg ggcaaatgat 2040

gccggagatg taggatatga aaacaaacag aataacaaag tatggtatcg gacatcccaa 2100

gacgggatca gcaattggtc ggataagact tacgtggaga attttcttgg agtaaatgaa 2160

gacgggctgc agatgtatcc atggcaccag gatatccagt gggtagagga atttcaggaa 2220

tattgggcac ttcagcaggc atttccggca ggaagcggac cggataattc ttccctgcgt 2280

ttctcgaaat ccaaagatgg tcttcattgg gagccggtat ctgaaaaagc tttaattaca 2340

gtaggggcac ccgggacctg ggatgcagga cagatatacc gttctacttt ctggtatgag 2400

ccaggtgggg caaaaggaaa cggaacattc catatctggt atgctgcatt ggcggaaggc 2460

cagtctcact gggatatagg atatacatct gcaaactatg cagatgccat gtacaaatta 2520

acgggaagca gaccggaagt ggaaaaaaga atagaggtaa ataatgaaaa tcctctgctg 2580

attatgccgc tttacggaaa gtcttacagt gaatcaggaa gtaccctgga ttggggagat 2640

gatctggttt cacgctggaa acaggttccg gaagatttaa aagaaaacgc agttattgaa 2700

attcatctgg gtggcaagat tggcttaaat gaaagtgatt cccacacggc aaaagcgttt 2760

tatgagcagc agctggcaat tgcccaggaa aataacatcc cggtaatgat ggtggtagct 2820

acggcaggcc agcagaacta ctggacggga acagcgaatc tggatgctga gtggattgac 2880

cggatgttca agcagcacag tgtgttaaaa ggaattatgt ccactgaaaa ttattggact 2940

gactacaata aggttgctac tatgggtgcc gattatctgc gggttgcagc tgaaaacggc 3000

ggatattttg tatggagcga gcaccaggag ggtgttattg aaaatgtaat agcaaatgag 3060

aaatttaatg aagcattgaa actttacggt aataatttta ttttcacctg gaagaacacg 3120

cctgccggta ctaactccaa tgcaggaaca gccagctata tgcagggcct ctggctaacc 3180

ggaatttgtg cacaatgggg cggtctggct gatacctgga aatggtatga aaaaggattt 3240

ggtaaattat ttgatggtca gtattcttat aatccgggtg gggaagaagc aagaccggtt 3300

gcaaccgaac cggaagcact gcttggtatc gaaatgatga gtatctatac aaatggcgga 3360

tgtgtctaca actttgagca cccggcgtat gtatatggtt cttataacca gaattcacct 3420

tgctttgaaa atgtaattgc agagttcatg cgctatgcga ttaagaatcc ggcaccaggt 3480

aaagaggaag tgcttgctga tacaaaagca gtgttctatg gaaaattaag ttctttaaag 3540

agtgcaggaa acttactgca aaaaggtttg aactgggaag atgccacact gccaacccag 3600

actacgggtc gatatggatt aatacctgca gtcccggagg cagtagatga aaaaactgta 3660

aaagcagtat tcggcgatat tgagatattg aatcaatcca gtgcacagct tgcgaataaa 3720

gatgcgaaaa aagcatattt tgaagaaaaa tatccggaac agtataccgg tacggcattt 3780

ggacagctat tgaatgatac ctggtattta tacaacagta atgtgaatgt ggatggggtg 3840

caaaatgcaa aacttccgtt agaaggtaat aaatccgtag atattacaat gacaccgcat 3900

acttatgtga tcctggatga tcaggatggt gagcttcaga ttaaactgaa caattatcgt 3960

gtggataaag acagtatctg ggaaggatac ggcaccacgg tgacggaccg ctgggatacg 4020

gaccacaata ccaaacttca ggactggata cgggatgagt atattccaaa tccggacgat 4080

gataccttca gagatacaac ctttgaactg gttggactgg aaagtgagcc ggaggtaaat 4140

gtaactaatg gcttaaagga tcagtatcag gaaccggttg tggaatatga tgccgctgca 4200

ggtacggcta tgattactgt atccggaaat ggctgggtag atctgacaat tgacacgaac 4260

acggcagaag taccccaggt tgataaagca aagttaaatt ccaaaatagc agaagctaaa 4320

gggatcagac aggggaacta tacggatgaa tcctacaaag ctcttcagga agagattgga 4380

aaatcccagg cggtatcaaa caaaacagat gccacacagg aggaagtaaa tgcacagtta 4440

agcaggttag aaagtgcaat agccagatta aaagaaaaac cggcggtggt atccaaaacg 4500

gcattaaatg caaaaatagc tgaggcaaaa gggatcagac aaggaaacta tacggatgaa 4560

tcctacaaag ccctgcaaaa tgcaatagta aaagctcagg agttatcaaa caaaacagat 4620

gccacacagc agcaggtaaa tgatctggta tcagcattaa caaatgcaat taaaaattta 4680

aaaatagatg cagataagct ggcagcagag tcagcaaaga aagtagcggc agttaaggtt 4740

gccgtaaaag cagtatccta taaatcaaaa gagattaaat tatcctggaa aacggtagca 4800

gatgcggacg gatatgtaat ccgtgtaaag acaggcaaaa agtggagtac ggagaagacc 4860

attaagaaca accgcataat cacttatact tataagaaag gtactcccgg taagaaatat 4920

gtatttgaag taaaagcttt taagaaagta aatggaaaga cgacctatag taaatacaaa 4980

acagccacta aaaaagttgt gccgcaaacg gtgaccgcaa aggcaaaagc ttctaaaaat 5040

aatgtagtgg taaaatggaa caaagtgtct ggcgcatccg gatatgttgt tatgaaaaag 5100

aaagggaaaa catgggtaaa ggctgcgcag gtaaatgcaa agaaactata ctttacggat 5160

aagaaggtca aaaaaggaaa agtatattca tacaaagtaa aggcttacaa agtatataaa 5220

ggtaaaaaag tatatggaag ctatagcaag tctgtaaatg ttaaaacaaa gtcataa 5277

<210> 25

<211> 1778

<212> PRT

<213> Robinsoniella peoriensis

<400> 25

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Ser Pro Leu Ser Ala Ala Ala Glu Ser Gly Thr Gly

20 25 30

Thr Arg Leu Val Lys Gly Gln Thr Gly Tyr Leu Thr Glu Glu Gln Ala

35 40 45

Ile Arg Asn Gln Glu Gln Thr Thr Glu Glu Arg Glu Gln Lys Leu Thr

50 55 60

Gly Glu Glu Thr Ala Glu Val Leu Met Glu Gly Thr Lys Asp Ser Gly

65 70 75 80

Ile Val Gln Thr Glu Glu Val Gln Thr Lys Glu Met Gln Thr Glu Asp

85 90 95

Ala Gln Thr Glu Glu Val Gln Thr Glu Glu Met Gln Thr Glu Asp Ala

100 105 110

Gln Thr Lys Glu Val Gln Thr Glu Glu Met Gln Thr Glu Asp Ala Gln

115 120 125

Thr Glu Glu Val Gln Thr Lys Glu Glu Pro Ala Glu Glu Thr His Met

130 135 140

Lys Glu Ile Gln Thr Gln Gly Thr Lys Lys Ala Ser Asp Arg Asn Gly

145 150 155 160

Lys Ala Arg Val Thr Glu Ile Leu Glu Asp Ala Gln Asp Pro Ala Asn

165 170 175

Arg Ile Val Tyr Leu Ser Asp Leu Gln Trp Lys Ser Glu Asn His Thr

180 185 190

Val Asp Ser Glu Leu Pro Thr Arg Lys Asp Lys Ser Phe Gly Gly Gly

195 200 205

Lys Ile Thr Leu Lys Val Asp Gly Thr Val Thr Glu Phe Asp Lys Gly

210 215 220

Ile Gly Thr Gln Thr Asp Ser Thr Ile Val Tyr Asp Leu Glu Gly Lys

225 230 235 240

Gly Tyr Thr Lys Phe Glu Thr Tyr Val Gly Val Asp Tyr Ser Gln Lys

245 250 255

Glu Asn Ile Pro Gly Glu Val Cys Asp Val Lys Phe Arg Val Lys Ile

260 265 270

Asp Asp Lys Ile Val Ser Glu Thr Gly Val Leu Asp Pro Leu Ser Asn

275 280 285

Ala Val Lys Ile Ser Val Asn Ile Pro Asp Thr Ala Lys Thr Leu Thr

290 295 300

Leu Tyr Ala Asp Lys Val Thr Glu Thr Trp Ser Asp His Ala Asn Trp

305 310 315 320

Ala Asp Ala Lys Phe Tyr Gln Ala Leu Pro Glu Pro Glu Asn Val Ala

325 330 335

Phe Lys Lys Thr Val Val Thr Arg Lys Thr Ser Asp Asn Ser Glu Ala

340 345 350

Pro Val Asn Pro Asp Ser Ala Val Asn Ser Ser Lys Ala Val Asp Gly

355 360 365

Val Ile Asp Ser Ser Ser Tyr Phe Asp Phe Gly Asp Gln Ala Asn Ser

370 375 380

Gly Ala Val Arg Glu Ser Leu Tyr Met Glu Val Asp Leu Lys Gly Ser

385 390 395 400

Tyr Leu Leu Ser Asp Ile Gln Leu Trp Arg Tyr Trp Lys Asp Gly Arg

405 410 415

Thr Tyr Ala Ala Thr Ala Ile Val Val Ala Glu Asp Glu Asn Phe Glu

420 425 430

Asn Ala Ala Val Ile Tyr Asn Ser Asp Thr Thr Gly Glu Ile His His

435 440 445

Leu Gly Ala Gly Ser Asp Met Leu Tyr Ala Glu Thr Glu Ser Gly Lys

450 455 460

Thr Phe Pro Val Pro Glu Asn Thr Lys Ala Arg Tyr Ile Arg Val Tyr

465 470 475 480

Thr Tyr Gly Val Asn Gly Thr Ser Gly Val Thr Asn His Ile Val Glu

485 490 495

Leu Lys Val Asn Ala Tyr Val Phe Gly Asp Glu Ile Leu Pro Glu Lys

500 505 510

Pro Asp Asp Ser Lys Ile Phe Pro Asn Ala Val Asn Pro Leu Lys Leu

515 520 525

Gln Gly Pro Gly Thr Asn Asp Gln Val Thr His Pro Asp Val Thr Val

530 535 540

Phe Asp Glu Pro Trp Asn Gly Tyr Lys Tyr Trp Met Ala Tyr Thr Pro

545 550 555 560

Asn Lys Pro Gly Ser Ser Tyr Phe Glu Asn Pro Cys Ile Ala Ala Ser

565 570 575

Asn Asp Gly Val Asn Trp Glu Phe Pro Ala Gln Asn Pro Val Gln Pro

580 585 590

Arg Tyr Asp Ser Glu Ile Glu Asn Gln Asn Glu His Asn Cys Asp Thr

595 600 605

Asp Ile Val Tyr Asp Pro Val Asn Asp Arg Leu Ile Met Tyr Trp Glu

610 615 620

Trp Ala Gln Asp Glu Ala Val Asn Gly Lys Thr His Arg Ser Glu Ile

625 630 635 640

Arg Tyr Arg Val Ser Tyr Asp Gly Ile Asn Trp Gly Val Glu Asp Lys

645 650 655

Thr Gly Val Leu Met Thr Gly Pro Thr Asp His Gly Cys Ala Ile Ala

660 665 670

Thr Glu Gly Glu Arg Tyr Ser Asp Leu Ser Pro Thr Val Val Tyr Asp

675 680 685

Lys Thr Glu Lys Ile Tyr Lys Met Trp Ala Asn Asp Ala Gly Asp Val

690 695 700

Gly Tyr Glu Asn Lys Gln Asn Asn Lys Val Trp Tyr Arg Thr Ser Gln

705 710 715 720

Asp Gly Ile Ser Asn Trp Ser Asp Lys Thr Tyr Val Glu Asn Phe Leu

725 730 735

Gly Val Asn Glu Asp Gly Leu Gln Met Tyr Pro Trp His Gln Asp Ile

740 745 750

Gln Trp Val Glu Glu Phe Gln Glu Tyr Trp Ala Leu Gln Gln Ala Phe

755 760 765

Pro Ala Gly Ser Gly Pro Asp Asn Ser Ser Leu Arg Phe Ser Lys Ser

770 775 780

Lys Asp Gly Leu His Trp Glu Pro Val Ser Glu Lys Ala Leu Ile Thr

785 790 795 800

Val Gly Ala Pro Gly Thr Trp Asp Ala Gly Gln Ile Tyr Arg Ser Thr

805 810 815

Phe Trp Tyr Glu Pro Gly Gly Ala Lys Gly Asn Gly Thr Phe His Ile

820 825 830

Trp Tyr Ala Ala Leu Ala Glu Gly Gln Ser His Trp Asp Ile Gly Tyr

835 840 845

Thr Ser Ala Asn Tyr Ala Asp Ala Met Tyr Lys Leu Thr Gly Ser Arg

850 855 860

Pro Glu Val Glu Lys Arg Ile Glu Val Asn Asn Glu Asn Pro Leu Leu

865 870 875 880

Ile Met Pro Leu Tyr Gly Lys Ser Tyr Ser Glu Ser Gly Ser Thr Leu

885 890 895

Asp Trp Gly Asp Asp Leu Val Ser Arg Trp Lys Gln Val Pro Glu Asp

900 905 910

Leu Lys Glu Asn Ala Val Ile Glu Ile His Leu Gly Gly Lys Ile Gly

915 920 925

Leu Asn Glu Ser Asp Ser His Thr Ala Lys Ala Phe Tyr Glu Gln Gln

930 935 940

Leu Ala Ile Ala Gln Glu Asn Asn Ile Pro Val Met Met Val Val Ala

945 950 955 960

Thr Ala Gly Gln Gln Asn Tyr Trp Thr Gly Thr Ala Asn Leu Asp Ala

965 970 975

Glu Trp Ile Asp Arg Met Phe Lys Gln His Ser Val Leu Lys Gly Ile

980 985 990

Met Ser Thr Glu Asn Tyr Trp Thr Asp Tyr Asn Lys Val Ala Thr Met

995 1000 1005

Gly Ala Asp Tyr Leu Arg Val Ala Ala Glu Asn Gly Gly Tyr Phe

1010 1015 1020

Val Trp Ser Glu His Gln Glu Gly Val Ile Glu Asn Val Ile Ala

1025 1030 1035

Asn Glu Lys Phe Asn Glu Ala Leu Lys Leu Tyr Gly Asn Asn Phe

1040 1045 1050

Ile Phe Thr Trp Lys Asn Thr Pro Ala Gly Thr Asn Ser Asn Ala

1055 1060 1065

Gly Thr Ala Ser Tyr Met Gln Gly Leu Trp Leu Thr Gly Ile Cys

1070 1075 1080

Ala Gln Trp Gly Gly Leu Ala Asp Thr Trp Lys Trp Tyr Glu Lys

1085 1090 1095

Gly Phe Gly Lys Leu Phe Asp Gly Gln Tyr Ser Tyr Asn Pro Gly

1100 1105 1110

Gly Glu Glu Ala Arg Pro Val Ala Thr Glu Pro Glu Ala Leu Leu

1115 1120 1125

Gly Ile Glu Met Met Ser Ile Tyr Thr Asn Gly Gly Cys Val Tyr

1130 1135 1140

Asn Phe Glu His Pro Ala Tyr Val Tyr Gly Ser Tyr Asn Gln Asn

1145 1150 1155

Ser Pro Cys Phe Glu Asn Val Ile Ala Glu Phe Met Arg Tyr Ala

1160 1165 1170

Ile Lys Asn Pro Ala Pro Gly Lys Glu Glu Val Leu Ala Asp Thr

1175 1180 1185

Lys Ala Val Phe Tyr Gly Lys Leu Ser Ser Leu Lys Ser Ala Gly

1190 1195 1200

Asn Leu Leu Gln Lys Gly Leu Asn Trp Glu Asp Ala Thr Leu Pro

1205 1210 1215

Thr Gln Thr Thr Gly Arg Tyr Gly Leu Ile Pro Ala Val Pro Glu

1220 1225 1230

Ala Val Asp Glu Lys Thr Val Lys Ala Val Phe Gly Asp Ile Glu

1235 1240 1245

Ile Leu Asn Gln Ser Ser Ala Gln Leu Ala Asn Lys Asp Ala Lys

1250 1255 1260

Lys Ala Tyr Phe Glu Glu Lys Tyr Pro Glu Gln Tyr Thr Gly Thr

1265 1270 1275

Ala Phe Gly Gln Leu Leu Asn Asp Thr Trp Tyr Leu Tyr Asn Ser

1280 1285 1290

Asn Val Asn Val Asp Gly Val Gln Asn Ala Lys Leu Pro Leu Glu

1295 1300 1305

Gly Asn Lys Ser Val Asp Ile Thr Met Thr Pro His Thr Tyr Val

1310 1315 1320

Ile Leu Asp Asp Gln Asp Gly Glu Leu Gln Ile Lys Leu Asn Asn

1325 1330 1335

Tyr Arg Val Asp Lys Asp Ser Ile Trp Glu Gly Tyr Gly Thr Thr

1340 1345 1350

Val Thr Asp Arg Trp Asp Thr Asp His Asn Thr Lys Leu Gln Asp

1355 1360 1365

Trp Ile Arg Asp Glu Tyr Ile Pro Asn Pro Asp Asp Asp Thr Phe

1370 1375 1380

Arg Asp Thr Thr Phe Glu Leu Val Gly Leu Glu Ser Glu Pro Glu

1385 1390 1395

Val Asn Val Thr Asn Gly Leu Lys Asp Gln Tyr Gln Glu Pro Val

1400 1405 1410

Val Glu Tyr Asp Ala Ala Ala Gly Thr Ala Met Ile Thr Val Ser

1415 1420 1425

Gly Asn Gly Trp Val Asp Leu Thr Ile Asp Thr Asn Thr Ala Glu

1430 1435 1440

Val Pro Gln Val Asp Lys Ala Lys Leu Asn Ser Lys Ile Ala Glu

1445 1450 1455

Ala Lys Gly Ile Arg Gln Gly Asn Tyr Thr Asp Glu Ser Tyr Lys

1460 1465 1470

Ala Leu Gln Glu Glu Ile Gly Lys Ser Gln Ala Val Ser Asn Lys

1475 1480 1485

Thr Asp Ala Thr Gln Glu Glu Val Asn Ala Gln Leu Ser Arg Leu

1490 1495 1500

Glu Ser Ala Ile Ala Arg Leu Lys Glu Lys Pro Ala Val Val Ser

1505 1510 1515

Lys Thr Ala Leu Asn Ala Lys Ile Ala Glu Ala Lys Gly Ile Arg

1520 1525 1530

Gln Gly Asn Tyr Thr Asp Glu Ser Tyr Lys Ala Leu Gln Asn Ala

1535 1540 1545

Ile Val Lys Ala Gln Glu Leu Ser Asn Lys Thr Asp Ala Thr Gln

1550 1555 1560

Gln Gln Val Asn Asp Leu Val Ser Ala Leu Thr Asn Ala Ile Lys

1565 1570 1575

Asn Leu Lys Ile Asp Ala Asp Lys Leu Ala Ala Glu Ser Ala Lys

1580 1585 1590

Lys Val Ala Ala Val Lys Val Ala Val Lys Ala Val Ser Tyr Lys

1595 1600 1605

Ser Lys Glu Ile Lys Leu Ser Trp Lys Thr Val Ala Asp Ala Asp

1610 1615 1620

Gly Tyr Val Ile Arg Val Lys Thr Gly Lys Lys Trp Ser Thr Glu

1625 1630 1635

Lys Thr Ile Lys Asn Asn Arg Ile Ile Thr Tyr Thr Tyr Lys Lys

1640 1645 1650

Gly Thr Pro Gly Lys Lys Tyr Val Phe Glu Val Lys Ala Phe Lys

1655 1660 1665

Lys Val Asn Gly Lys Thr Thr Tyr Ser Lys Tyr Lys Thr Ala Thr

1670 1675 1680

Lys Lys Val Val Pro Gln Thr Val Thr Ala Lys Ala Lys Ala Ser

1685 1690 1695

Lys Asn Asn Val Val Val Lys Trp Asn Lys Val Ser Gly Ala Ser

1700 1705 1710

Gly Tyr Val Val Met Lys Lys Lys Gly Lys Thr Trp Val Lys Ala

1715 1720 1725

Ala Gln Val Asn Ala Lys Lys Leu Tyr Phe Thr Asp Lys Lys Val

1730 1735 1740

Lys Lys Gly Lys Val Tyr Ser Tyr Lys Val Lys Ala Tyr Lys Val

1745 1750 1755

Tyr Lys Gly Lys Lys Val Tyr Gly Ser Tyr Ser Lys Ser Val Asn

1760 1765 1770

Val Lys Thr Lys Ser

1775

<210> 26

<211> 7899

<212> DNA

<213> Robinsoniella peoriensis

<400> 26

gctgagactg caacagaaga aaatgcggcg ctggaaaaaa cagttacatt gcataagagc 60

gatggaacag aactgccgga ggattatcga aatccccaaa gaccagctac catggcggta 120

gatggtatta ttgacgatac aggagagtac aactattgcg atttcggtaa agacggtgat 180

aaagcagccc tgtatatgca ggtggacctt ggaggtctgt atgatttaag cagagtcaat 240

atgtggagat actggaaaga cagcagaact tacgatgcaa cagtaattac cacatctgag 300

agcggcgatt tcacagatga agcagtcata tataattcag acaggtcgaa tgtacatgga 360

tttggggcag gaggagatga acgctacgca gagactgcct ccggacatga attcccagta 420

ccggacggta caaaggcaca ggcagtacgc gtatatgtat ttggcagcca aaacggtact 480

acaaaccaca tcaatgaatt gcaggtctgg ggaactcccc atacagagaa tccggatgta 540

aattcttatc aggtgacaat tccacaggga aatggatatc aggtaatacc ttatgaaaat 600

gacccgacga cagtggaaga aggcggttct ttccgttttc aggtactgat tgactccgat 660

aatggttaca gcgcaaccag tgcggtaaaa gcaaatggag taagtctgga ggcagttgac 720

agtgtttata ccattgagaa cattactgaa gatcaggtaa tcaccattga aggcgtacat 780

aaagcacagt atgaagtgaa attcccggaa aatccacagg gatacagtgt tgagattcag 840

aatgaaggaa gtacaacggt agactataat ggttctgtca gttttaagct tattatagac 900

gaagcttata atgaatccgt accggttgta aaagcaaacg gcggtgcagc tttgggaaaa 960

gatgagctcg gtgtatatac aattgcaaat atccaggacg atattacggt tacagttgag 1020

ggtatccagg aaaataccgt agtaaagaca aaaacaatgt acttgtctga tatggattgg 1080

aagagtgctg caaatgcagt aggtgcaaca ggagaaaaag acactccaac aaaggacctg 1140

aatcatttac agcagcagat gaaattattg gtaaacggag cagagaagtc ttttgataaa 1200

ggaattggag ttcagacgga ttcttctatc gtttatgatc tggaagacaa aggctacact 1260

tctttccaca ccctggcagg cgttgattat tcagcaatgg aatatgtaga cggagaaggc 1320

tgtgatatcc agtttaaagt atatctggat gatgtcgtag tatttgacag cggagtagtt 1380

gatgcatctg atgaggctca ggaagttaat gttgctataa catcagagaa taaagaacta 1440

aaactggaag ctaaaatggt taaagagcct tataatgact ggggaaactg ggcagatgcc 1500

agctttgaaa tggcttatcc cgaaccgtct aatgtggctt taaataaaac agttaccgtt 1560

aagaaaacag cggataactc agactctgaa gtaaattcca gcagaccggg atcaatggct 1620

gtagatggaa tcattggacc tacatcagat tctaactatt gtgattttgg acaggatggg 1680

gataatactt cccgttatct gcaggtagat ttaggggatg tttatgaact tacccagatt 1740

aatatgttta gatactgggc agatggcaga gtatataatg gtactgtaat tgcagtttcc 1800

gaaaacgcag actttagtaa tccaactttt atttataatt cagataaagc agacaaacac 1860

ggacttggcg caggcagtga tgacacttat ggagaaaccc agagtggaaa attattcgaa 1920

gttccggcgg gaaccatggg acagtatgtc cgtgtgtata tggctggttc caacaaaggt 1980

acaacgaacc atatcgctga attacaggta atgggttata atttcaatac agaaccaaaa 2040

ccatatgaag caaatgcatt tgaaaatgca gaagtttatt tagatatgcc aactcatttc 2100

caggatctgg attccaataa aaacgacgat ggaagcttaa agcacattgg cggacaggtg 2160

acacatcctg atatccaggt atttgaccaa ccgtggaacg gttataaata ctggatgatt 2220

tacacaccaa atacaatgat cacttcccag tatgaaaatc catatatcgt agcatctgaa 2280

gatggacaga catgggtaga accggaaggg atttccaatc caattgaacc agaaccgcca 2340

tcaaccagat ttcataactg tgatgcagat ctgttatacg actctgtcaa tgaccgttta 2400

cttgcttact ggaactgggc agatgacggc ggcggaattg atgacgaatt aaaagatcag 2460

aactgtcaga ttcgtctgag aatttcttat gatggaatta actggggagt tccttacgac 2520

aaagacggca atattgccac aacagctgat actgtagtaa gaatggaaac aggagataag 2580

gatttcattc ctgcaatcag cgaaaaagac cgttatggta tgctttcccc aacatttacc 2640

tatgacgatt tccgcggcat atatacaatg tgggcacaaa actcgggtga tgcgggatac 2700

aaccagtccg gaaagttcat cgaaatgaga tggtctgagg atggaataaa ctggtctgaa 2760

ccacaaaaag tgaataattt ccttggaaaa gatgagaatg gcagacagct ttggccatgg 2820

catcaggata ttcagtatat ccctgagcta caggaatatt ggggactgtc ccagtgtttc 2880

tctacatcta atcccgatgg atccgtatta tacctgacca agtccagaga tggtgtcaac 2940

tgggagcagg caggaacaca gccggtatta agggcaggaa aatcaggtac ctgggatgat 3000

ttccagattt accgttctac cttctattat gataatcagt cagacagccc tactggtggg 3060

aaatttagaa tctggtacag tgcactgcag gcaaatactt caggcaagac cgttttggct 3120

cctgatggaa cagtgtctct tcaggttgga agccaggata ccaggatctg gcgtatcggg 3180

tatacagaaa atgactacat ggaagtcatg aaagctctga cccagaataa aaactatgaa 3240

gaaccggaat tagtagacgc agtttcctta aatctgtcaa tggataaaac aagcatttca 3300

gtaggtgaag aagcaacggt aagcactgct ttcgtaccgg aaaatgctac cgaccgcatt 3360

gtaaaatata catctcagga tccggaaatt gcagttattg atccaacagg cattgttaca 3420

ggggttaagg atggaaccac aactattgtt gcagaaacaa aatcgggcgc aaaaggtgaa 3480

ttatccgtaa cggttggtga gcttcaaaga ggtgaaattc gatttgaggt cagcaatgac 3540

catccgatgt atctggagaa ttactattgg agtgatgatg caccaaaaaa agacggctta 3600

gacgcaaaca agaactacta tggggatgaa cgtgtcgaca gtccggtaat gctgtataat 3660

accgttcctg aagaattgaa ggataataca gtcatcctgt taattgcaga gagaagctta 3720

aacagcacag atgcagtaag ggattggatt aaaaagaatg ttgaattatg taatgaaaat 3780

aagattccat gtgcagttca gattgcaaat ggagaaacaa atgtaaatac aaccattcca 3840

ttatcgttct ggaatgagct ggcaacgaac aatgaatacc tggttggatt taatgcagcc 3900

gagatgtata accgttttgc aggtgacaac cgcagctatg ttatggatat gatccgttta 3960

ggggtatccc acggcgtatg catgatgtgg accgatacca atatttttgg tacaaacggt 4020

gtgttgtatg actggctgac tcaggatgaa aaactgtccg gtcttatgcg ggaatacaaa 4080

gagtatatct ctctgatgac aaaagagtct tacggcagtg aggcagcaaa tacagatgct 4140

ctgtttaagg gcctgtggat gacagactac tgcgagaact ggggaatcgc ctccgactgg 4200

tggcattggc agttagacag caatggagca ctctttgatg caggcagcgg cggagatgca 4260

tggaaacagt gtctgacatg gccggaaaat atgtatacgc aggacgttgt gcgtgcagta 4320

agccagggtg caacctgctt taaatcagaa gcacagtggt attcaaatgc tacaaaaggc 4380

atgcgtacac cgacatatca gtattccatg attccgttcc tggagaaact ggtaagcaaa 4440

gaggtaaaaa ttcctacaaa agaagagatg ctggaaagaa caaaagcaat tgttgtaggg 4500

gcagaaaact ggaataactt taattataat actacttatt caaatctgta tccaagcaca 4560

ggacaatatg gaatcgtacc ttatgtacct tcaaattgtc cggaagaaga actggcaggc 4620

tatgatctgg tagtaaggga aaaccttggc aaagcaggac tgaagtctgc acttgatacg 4680

gtatacccgg ttcagaaatc agaaggaacc gcatactgtg aaacctttgg agatacctgg 4740

tactggatga attcctcgga agacaaaaac gtaagccagt acactgaatt tacaaccgca 4800

atcaatggag ctgaaagtgt aaagatagcc ggcgaacccc atgtatttgg tattataaaa 4860

gaaaatccgg gatctttaaa tgtatactta agcaactacc gcctggataa aacagaactc 4920

tgggatggta caatccccgg aggattaagc gatcagggct gctataatta tgtatggcag 4980

atgtgtgagc gcatgaagaa tggaacaggg ctggatacac agcttcgtga caccgttatt 5040

accgttaaaa atgcagtaga accgaaagta aactttgtaa cagaatctcc ggcagacaga 5100

agttttgcag aagataatta tgtaagacca tacaaatata cggttgcaca aaaagaaggc 5160

acaaccgatg aatgggtgat tacggtcagc cacaatggta ttgtggaatt caatattgta 5220

acaggcgatg aaaaagtgcc ggcaacaagt gtggaattat caactgataa agttgatgta 5280

atccgtaacc ggacagcagt tgtaaaggca acggtattgc cgcagaatgc aggaaataaa 5340

cagttaacat ggacaatcgc cgatcctgag attgcttctg tagacaacaa aggaaccgta 5400

accggactaa aagaaggaaa aaccgtatta cgtgcagcta tttctggcag tgtttataaa 5460

gaatgcgaag taaatgtaat tgaccgaaaa gtaacggaag taaacttaaa caaaacagag 5520

ttgtctctta gtgcagggga ttctgcgaaa ctggaagcat ccatagcacc ggaagacccg 5580

tctgacagca gcattacctg gacttccaca aatgaaaatg ttgcaacggt tgcatcaaac 5640

ggtaccgtta cagctcataa agcaggtgta gctcagatta tcgcccagtc tgcttaccag 5700

gcaaagggta tcgcaactgt taccgttaat tatgcggctt ccgtaaaatt agaccgtaca 5760

ggaatgacgg ctacagccaa cagcgaacag tctaaatcag gtggagaagg acctgcttcc 5820

aacgtactgg acggtaagca ggacacaatg tggcatacaa gctggacaga taaacctgaa 5880

ttacatcctc actggattaa aattgattta aacggaacaa aaacaattaa caaatttgct 5940

tatacaccaa gaaccggagc atctaacgga acaatttata attatgttct gattatcacc 6000

gatctggaag gaaatgaaaa acaggttgca aagggcgtat gggcagcaaa tgcagatgta 6060

aaatatgctg aatttgacgc agttgaagct acggcgatca agctgcaggt agacggcaac 6120

gatgacaagg catcaaaagg aggatatggt tccgcggcag aaatcaatat ttttgaagtg 6180

gcacagaaac cttccgcaaa tgagcttgcc gaaaatatta aagtaattgc acctgtaaaa 6240

gcagaagata caaaagtatc tatcccagtc attactggat ttgatatcgt aatcagtaat 6300

tccagcaatc cggacgtaat tggtattgat ggcagcatca ccagaccgga aaatgataca 6360

gttgtaactt taacattaaa agtaaaagaa acagacgcaa agagtgtaaa ggcagcagga 6420

actgaagcaa ccacaaatgt ggatgttctg gttaccggta caaagacatc tgatgtagag 6480

gcagaaagcg ttacgttaga tcagacatca gctgatttaa cagttggagg cgaactttta 6540

ttaaatgccg ttgtgaagcc ggacattgca actaataagg ctgttacctg gagctcagat 6600

aagccgggaa ccgctactgt tgaaaatggc agggtaaaag cgttagcggc aggagaggca 6660

cgtattacag cagcaactgc aaatggaaag acagcagact gcgtcattaa cgtaaaggaa 6720

aaagaggagc cggaagtaat tctcccggca gaagtgcgct taaacattcc atcagctgaa 6780

tttacagtag gagatcagat tcagttaact gcttctgtac tgccggcaaa tgcagcagat 6840

aagacaatta cctggaaatc agacaaacct gaagtggcaa ctgtcgcaaa tggatgggta 6900

aaaggtattg cagccggaac tgctaagatt acagcaacat cagtcaatgg aaaaacggct 6960

gtatgtgtga tcacagtcaa agcacagcca cagaatctac caaccggtgt ttcactgaac 7020

aagaaaacag caagtgtaaa actgaataaa acccttacac tttccgctgt agtacagcct 7080

tccaatgcgg ataataagac cgttaaatgg acgtctgaca atacgtatgt tgcaacagtt 7140

gagaatggag tcgtaaaagc agttaatgca ggaacagcca gaatcactgc agctaccgta 7200

aacggacata aggcaacttg tactataaca gtaccgggca caaagatttc caaggcaaaa 7260

gtaagccttg catcatcaaa aacacataca ggcaaagcat taaaaccatc tgtaaaagta 7320

acttacggta agaatacatt aaagaaaaat actgattata ccgtatctta caaaaataat 7380

ataaatcctg gaactgcatc tgttacgatt acgggcaagg gtaaatatta tggtaccatc 7440

aacaaaactt ttgcaatcaa ggcagcagaa ggaaagacct acacggttgg taaaggaaaa 7500

tataaagtta ctgatgcttc agcaaagaac aaaacagtaa cctttatggc tcctgtaaag 7560

aagacctaca gctcattcag cgtaccttct aaggttaaga tcgggaatga tacttacaaa 7620

gtaactgcag ttgcaaaaaa tgcattcaaa aagaatacaa agcttacaaa gttaaccatt 7680

ggttcgaatg taaaaacaat tggttcttat gcattttatg gcgcttccca attaaaaacg 7740

cttaccttaa aaactaccgg acttaacagt gtaggcaaga atgcatttaa gaaaacaaat 7800

gcaaagctga ctgtaaaggt tccaaagtca aaattagcag attataagaa gctgttaaaa 7860

ggaaaaggat tatctggcaa ggcaaaaatt cagaaataa 7899

<210> 27

<211> 2652

<212> PRT

<213> Robinsoniella peoriensis

<400> 27

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Ala Glu Thr Ala Thr Glu Glu Asn Ala Ala Leu Glu

20 25 30

Lys Thr Val Thr Leu His Lys Ser Asp Gly Thr Glu Leu Pro Glu Asp

35 40 45

Tyr Arg Asn Pro Gln Arg Pro Ala Thr Met Ala Val Asp Gly Ile Ile

50 55 60

Asp Asp Thr Gly Glu Tyr Asn Tyr Cys Asp Phe Gly Lys Asp Gly Asp

65 70 75 80

Lys Ala Ala Leu Tyr Met Gln Val Asp Leu Gly Gly Leu Tyr Asp Leu

85 90 95

Ser Arg Val Asn Met Trp Arg Tyr Trp Lys Asp Ser Arg Thr Tyr Asp

100 105 110

Ala Thr Val Ile Thr Thr Ser Glu Ser Gly Asp Phe Thr Asp Glu Ala

115 120 125

Val Ile Tyr Asn Ser Asp Arg Ser Asn Val His Gly Phe Gly Ala Gly

130 135 140

Gly Asp Glu Arg Tyr Ala Glu Thr Ala Ser Gly His Glu Phe Pro Val

145 150 155 160

Pro Asp Gly Thr Lys Ala Gln Ala Val Arg Val Tyr Val Phe Gly Ser

165 170 175

Gln Asn Gly Thr Thr Asn His Ile Asn Glu Leu Gln Val Trp Gly Thr

180 185 190

Pro His Thr Glu Asn Pro Asp Val Asn Ser Tyr Gln Val Thr Ile Pro

195 200 205

Gln Gly Asn Gly Tyr Gln Val Ile Pro Tyr Glu Asn Asp Pro Thr Thr

210 215 220

Val Glu Glu Gly Gly Ser Phe Arg Phe Gln Val Leu Ile Asp Ser Asp

225 230 235 240

Asn Gly Tyr Ser Ala Thr Ser Ala Val Lys Ala Asn Gly Val Ser Leu

245 250 255

Glu Ala Val Asp Ser Val Tyr Thr Ile Glu Asn Ile Thr Glu Asp Gln

260 265 270

Val Ile Thr Ile Glu Gly Val His Lys Ala Gln Tyr Glu Val Lys Phe

275 280 285

Pro Glu Asn Pro Gln Gly Tyr Ser Val Glu Ile Gln Asn Glu Gly Ser

290 295 300

Thr Thr Val Asp Tyr Asn Gly Ser Val Ser Phe Lys Leu Ile Ile Asp

305 310 315 320

Glu Ala Tyr Asn Glu Ser Val Pro Val Val Lys Ala Asn Gly Gly Ala

325 330 335

Ala Leu Gly Lys Asp Glu Leu Gly Val Tyr Thr Ile Ala Asn Ile Gln

340 345 350

Asp Asp Ile Thr Val Thr Val Glu Gly Ile Gln Glu Asn Thr Val Val

355 360 365

Lys Thr Lys Thr Met Tyr Leu Ser Asp Met Asp Trp Lys Ser Ala Ala

370 375 380

Asn Ala Val Gly Ala Thr Gly Glu Lys Asp Thr Pro Thr Lys Asp Leu

385 390 395 400

Asn His Leu Gln Gln Gln Met Lys Leu Leu Val Asn Gly Ala Glu Lys

405 410 415

Ser Phe Asp Lys Gly Ile Gly Val Gln Thr Asp Ser Ser Ile Val Tyr

420 425 430

Asp Leu Glu Asp Lys Gly Tyr Thr Ser Phe His Thr Leu Ala Gly Val

435 440 445

Asp Tyr Ser Ala Met Glu Tyr Val Asp Gly Glu Gly Cys Asp Ile Gln

450 455 460

Phe Lys Val Tyr Leu Asp Asp Val Val Val Phe Asp Ser Gly Val Val

465 470 475 480

Asp Ala Ser Asp Glu Ala Gln Glu Val Asn Val Ala Ile Thr Ser Glu

485 490 495

Asn Lys Glu Leu Lys Leu Glu Ala Lys Met Val Lys Glu Pro Tyr Asn

500 505 510

Asp Trp Gly Asn Trp Ala Asp Ala Ser Phe Glu Met Ala Tyr Pro Glu

515 520 525

Pro Ser Asn Val Ala Leu Asn Lys Thr Val Thr Val Lys Lys Thr Ala

530 535 540

Asp Asn Ser Asp Ser Glu Val Asn Ser Ser Arg Pro Gly Ser Met Ala

545 550 555 560

Val Asp Gly Ile Ile Gly Pro Thr Ser Asp Ser Asn Tyr Cys Asp Phe

565 570 575

Gly Gln Asp Gly Asp Asn Thr Ser Arg Tyr Leu Gln Val Asp Leu Gly

580 585 590

Asp Val Tyr Glu Leu Thr Gln Ile Asn Met Phe Arg Tyr Trp Ala Asp

595 600 605

Gly Arg Val Tyr Asn Gly Thr Val Ile Ala Val Ser Glu Asn Ala Asp

610 615 620

Phe Ser Asn Pro Thr Phe Ile Tyr Asn Ser Asp Lys Ala Asp Lys His

625 630 635 640

Gly Leu Gly Ala Gly Ser Asp Asp Thr Tyr Gly Glu Thr Gln Ser Gly

645 650 655

Lys Leu Phe Glu Val Pro Ala Gly Thr Met Gly Gln Tyr Val Arg Val

660 665 670

Tyr Met Ala Gly Ser Asn Lys Gly Thr Thr Asn His Ile Ala Glu Leu

675 680 685

Gln Val Met Gly Tyr Asn Phe Asn Thr Glu Pro Lys Pro Tyr Glu Ala

690 695 700

Asn Ala Phe Glu Asn Ala Glu Val Tyr Leu Asp Met Pro Thr His Phe

705 710 715 720

Gln Asp Leu Asp Ser Asn Lys Asn Asp Asp Gly Ser Leu Lys His Ile

725 730 735

Gly Gly Gln Val Thr His Pro Asp Ile Gln Val Phe Asp Gln Pro Trp

740 745 750

Asn Gly Tyr Lys Tyr Trp Met Ile Tyr Thr Pro Asn Thr Met Ile Thr

755 760 765

Ser Gln Tyr Glu Asn Pro Tyr Ile Val Ala Ser Glu Asp Gly Gln Thr

770 775 780

Trp Val Glu Pro Glu Gly Ile Ser Asn Pro Ile Glu Pro Glu Pro Pro

785 790 795 800

Ser Thr Arg Phe His Asn Cys Asp Ala Asp Leu Leu Tyr Asp Ser Val

805 810 815

Asn Asp Arg Leu Leu Ala Tyr Trp Asn Trp Ala Asp Asp Gly Gly Gly

820 825 830

Ile Asp Asp Glu Leu Lys Asp Gln Asn Cys Gln Ile Arg Leu Arg Ile

835 840 845

Ser Tyr Asp Gly Ile Asn Trp Gly Val Pro Tyr Asp Lys Asp Gly Asn

850 855 860

Ile Ala Thr Thr Ala Asp Thr Val Val Arg Met Glu Thr Gly Asp Lys

865 870 875 880

Asp Phe Ile Pro Ala Ile Ser Glu Lys Asp Arg Tyr Gly Met Leu Ser

885 890 895

Pro Thr Phe Thr Tyr Asp Asp Phe Arg Gly Ile Tyr Thr Met Trp Ala

900 905 910

Gln Asn Ser Gly Asp Ala Gly Tyr Asn Gln Ser Gly Lys Phe Ile Glu

915 920 925

Met Arg Trp Ser Glu Asp Gly Ile Asn Trp Ser Glu Pro Gln Lys Val

930 935 940

Asn Asn Phe Leu Gly Lys Asp Glu Asn Gly Arg Gln Leu Trp Pro Trp

945 950 955 960

His Gln Asp Ile Gln Tyr Ile Pro Glu Leu Gln Glu Tyr Trp Gly Leu

965 970 975

Ser Gln Cys Phe Ser Thr Ser Asn Pro Asp Gly Ser Val Leu Tyr Leu

980 985 990

Thr Lys Ser Arg Asp Gly Val Asn Trp Glu Gln Ala Gly Thr Gln Pro

995 1000 1005

Val Leu Arg Ala Gly Lys Ser Gly Thr Trp Asp Asp Phe Gln Ile

1010 1015 1020

Tyr Arg Ser Thr Phe Tyr Tyr Asp Asn Gln Ser Asp Ser Pro Thr

1025 1030 1035

Gly Gly Lys Phe Arg Ile Trp Tyr Ser Ala Leu Gln Ala Asn Thr

1040 1045 1050

Ser Gly Lys Thr Val Leu Ala Pro Asp Gly Thr Val Ser Leu Gln

1055 1060 1065

Val Gly Ser Gln Asp Thr Arg Ile Trp Arg Ile Gly Tyr Thr Glu

1070 1075 1080

Asn Asp Tyr Met Glu Val Met Lys Ala Leu Thr Gln Asn Lys Asn

1085 1090 1095

Tyr Glu Glu Pro Glu Leu Val Asp Ala Val Ser Leu Asn Leu Ser

1100 1105 1110

Met Asp Lys Thr Ser Ile Ser Val Gly Glu Glu Ala Thr Val Ser

1115 1120 1125

Thr Ala Phe Val Pro Glu Asn Ala Thr Asp Arg Ile Val Lys Tyr

1130 1135 1140

Thr Ser Gln Asp Pro Glu Ile Ala Val Ile Asp Pro Thr Gly Ile

1145 1150 1155

Val Thr Gly Val Lys Asp Gly Thr Thr Thr Ile Val Ala Glu Thr

1160 1165 1170

Lys Ser Gly Ala Lys Gly Glu Leu Ser Val Thr Val Gly Glu Leu

1175 1180 1185

Gln Arg Gly Glu Ile Arg Phe Glu Val Ser Asn Asp His Pro Met

1190 1195 1200

Tyr Leu Glu Asn Tyr Tyr Trp Ser Asp Asp Ala Pro Lys Lys Asp

1205 1210 1215

Gly Leu Asp Ala Asn Lys Asn Tyr Tyr Gly Asp Glu Arg Val Asp

1220 1225 1230

Ser Pro Val Met Leu Tyr Asn Thr Val Pro Glu Glu Leu Lys Asp

1235 1240 1245

Asn Thr Val Ile Leu Leu Ile Ala Glu Arg Ser Leu Asn Ser Thr

1250 1255 1260

Asp Ala Val Arg Asp Trp Ile Lys Lys Asn Val Glu Leu Cys Asn

1265 1270 1275

Glu Asn Lys Ile Pro Cys Ala Val Gln Ile Ala Asn Gly Glu Thr

1280 1285 1290

Asn Val Asn Thr Thr Ile Pro Leu Ser Phe Trp Asn Glu Leu Ala

1295 1300 1305

Thr Asn Asn Glu Tyr Leu Val Gly Phe Asn Ala Ala Glu Met Tyr

1310 1315 1320

Asn Arg Phe Ala Gly Asp Asn Arg Ser Tyr Val Met Asp Met Ile

1325 1330 1335

Arg Leu Gly Val Ser His Gly Val Cys Met Met Trp Thr Asp Thr

1340 1345 1350

Asn Ile Phe Gly Thr Asn Gly Val Leu Tyr Asp Trp Leu Thr Gln

1355 1360 1365

Asp Glu Lys Leu Ser Gly Leu Met Arg Glu Tyr Lys Glu Tyr Ile

1370 1375 1380

Ser Leu Met Thr Lys Glu Ser Tyr Gly Ser Glu Ala Ala Asn Thr

1385 1390 1395

Asp Ala Leu Phe Lys Gly Leu Trp Met Thr Asp Tyr Cys Glu Asn

1400 1405 1410

Trp Gly Ile Ala Ser Asp Trp Trp His Trp Gln Leu Asp Ser Asn

1415 1420 1425

Gly Ala Leu Phe Asp Ala Gly Ser Gly Gly Asp Ala Trp Lys Gln

1430 1435 1440

Cys Leu Thr Trp Pro Glu Asn Met Tyr Thr Gln Asp Val Val Arg

1445 1450 1455

Ala Val Ser Gln Gly Ala Thr Cys Phe Lys Ser Glu Ala Gln Trp

1460 1465 1470

Tyr Ser Asn Ala Thr Lys Gly Met Arg Thr Pro Thr Tyr Gln Tyr

1475 1480 1485

Ser Met Ile Pro Phe Leu Glu Lys Leu Val Ser Lys Glu Val Lys

1490 1495 1500

Ile Pro Thr Lys Glu Glu Met Leu Glu Arg Thr Lys Ala Ile Val

1505 1510 1515

Val Gly Ala Glu Asn Trp Asn Asn Phe Asn Tyr Asn Thr Thr Tyr

1520 1525 1530

Ser Asn Leu Tyr Pro Ser Thr Gly Gln Tyr Gly Ile Val Pro Tyr

1535 1540 1545

Val Pro Ser Asn Cys Pro Glu Glu Glu Leu Ala Gly Tyr Asp Leu

1550 1555 1560

Val Val Arg Glu Asn Leu Gly Lys Ala Gly Leu Lys Ser Ala Leu

1565 1570 1575

Asp Thr Val Tyr Pro Val Gln Lys Ser Glu Gly Thr Ala Tyr Cys

1580 1585 1590

Glu Thr Phe Gly Asp Thr Trp Tyr Trp Met Asn Ser Ser Glu Asp

1595 1600 1605

Lys Asn Val Ser Gln Tyr Thr Glu Phe Thr Thr Ala Ile Asn Gly

1610 1615 1620

Ala Glu Ser Val Lys Ile Ala Gly Glu Pro His Val Phe Gly Ile

1625 1630 1635

Ile Lys Glu Asn Pro Gly Ser Leu Asn Val Tyr Leu Ser Asn Tyr

1640 1645 1650

Arg Leu Asp Lys Thr Glu Leu Trp Asp Gly Thr Ile Pro Gly Gly

1655 1660 1665

Leu Ser Asp Gln Gly Cys Tyr Asn Tyr Val Trp Gln Met Cys Glu

1670 1675 1680

Arg Met Lys Asn Gly Thr Gly Leu Asp Thr Gln Leu Arg Asp Thr

1685 1690 1695

Val Ile Thr Val Lys Asn Ala Val Glu Pro Lys Val Asn Phe Val

1700 1705 1710

Thr Glu Ser Pro Ala Asp Arg Ser Phe Ala Glu Asp Asn Tyr Val

1715 1720 1725

Arg Pro Tyr Lys Tyr Thr Val Ala Gln Lys Glu Gly Thr Thr Asp

1730 1735 1740

Glu Trp Val Ile Thr Val Ser His Asn Gly Ile Val Glu Phe Asn

1745 1750 1755

Ile Val Thr Gly Asp Glu Lys Val Pro Ala Thr Ser Val Glu Leu

1760 1765 1770

Ser Thr Asp Lys Val Asp Val Ile Arg Asn Arg Thr Ala Val Val

1775 1780 1785

Lys Ala Thr Val Leu Pro Gln Asn Ala Gly Asn Lys Gln Leu Thr

1790 1795 1800

Trp Thr Ile Ala Asp Pro Glu Ile Ala Ser Val Asp Asn Lys Gly

1805 1810 1815

Thr Val Thr Gly Leu Lys Glu Gly Lys Thr Val Leu Arg Ala Ala

1820 1825 1830

Ile Ser Gly Ser Val Tyr Lys Glu Cys Glu Val Asn Val Ile Asp

1835 1840 1845

Arg Lys Val Thr Glu Val Asn Leu Asn Lys Thr Glu Leu Ser Leu

1850 1855 1860

Ser Ala Gly Asp Ser Ala Lys Leu Glu Ala Ser Ile Ala Pro Glu

1865 1870 1875

Asp Pro Ser Asp Ser Ser Ile Thr Trp Thr Ser Thr Asn Glu Asn

1880 1885 1890

Val Ala Thr Val Ala Ser Asn Gly Thr Val Thr Ala His Lys Ala

1895 1900 1905

Gly Val Ala Gln Ile Ile Ala Gln Ser Ala Tyr Gln Ala Lys Gly

1910 1915 1920

Ile Ala Thr Val Thr Val Asn Tyr Ala Ala Ser Val Lys Leu Asp

1925 1930 1935

Arg Thr Gly Met Thr Ala Thr Ala Asn Ser Glu Gln Ser Lys Ser

1940 1945 1950

Gly Gly Glu Gly Pro Ala Ser Asn Val Leu Asp Gly Lys Gln Asp

1955 1960 1965

Thr Met Trp His Thr Ser Trp Thr Asp Lys Pro Glu Leu His Pro

1970 1975 1980

His Trp Ile Lys Ile Asp Leu Asn Gly Thr Lys Thr Ile Asn Lys

1985 1990 1995

Phe Ala Tyr Thr Pro Arg Thr Gly Ala Ser Asn Gly Thr Ile Tyr

2000 2005 2010

Asn Tyr Val Leu Ile Ile Thr Asp Leu Glu Gly Asn Glu Lys Gln

2015 2020 2025

Val Ala Lys Gly Val Trp Ala Ala Asn Ala Asp Val Lys Tyr Ala

2030 2035 2040

Glu Phe Asp Ala Val Glu Ala Thr Ala Ile Lys Leu Gln Val Asp

2045 2050 2055

Gly Asn Asp Asp Lys Ala Ser Lys Gly Gly Tyr Gly Ser Ala Ala

2060 2065 2070

Glu Ile Asn Ile Phe Glu Val Ala Gln Lys Pro Ser Ala Asn Glu

2075 2080 2085

Leu Ala Glu Asn Ile Lys Val Ile Ala Pro Val Lys Ala Glu Asp

2090 2095 2100

Thr Lys Val Ser Ile Pro Val Ile Thr Gly Phe Asp Ile Val Ile

2105 2110 2115

Ser Asn Ser Ser Asn Pro Asp Val Ile Gly Ile Asp Gly Ser Ile

2120 2125 2130

Thr Arg Pro Glu Asn Asp Thr Val Val Thr Leu Thr Leu Lys Val

2135 2140 2145

Lys Glu Thr Asp Ala Lys Ser Val Lys Ala Ala Gly Thr Glu Ala

2150 2155 2160

Thr Thr Asn Val Asp Val Leu Val Thr Gly Thr Lys Thr Ser Asp

2165 2170 2175

Val Glu Ala Glu Ser Val Thr Leu Asp Gln Thr Ser Ala Asp Leu

2180 2185 2190

Thr Val Gly Gly Glu Leu Leu Leu Asn Ala Val Val Lys Pro Asp

2195 2200 2205

Ile Ala Thr Asn Lys Ala Val Thr Trp Ser Ser Asp Lys Pro Gly

2210 2215 2220

Thr Ala Thr Val Glu Asn Gly Arg Val Lys Ala Leu Ala Ala Gly

2225 2230 2235

Glu Ala Arg Ile Thr Ala Ala Thr Ala Asn Gly Lys Thr Ala Asp

2240 2245 2250

Cys Val Ile Asn Val Lys Glu Lys Glu Glu Pro Glu Val Ile Leu

2255 2260 2265

Pro Ala Glu Val Arg Leu Asn Ile Pro Ser Ala Glu Phe Thr Val

2270 2275 2280

Gly Asp Gln Ile Gln Leu Thr Ala Ser Val Leu Pro Ala Asn Ala

2285 2290 2295

Ala Asp Lys Thr Ile Thr Trp Lys Ser Asp Lys Pro Glu Val Ala

2300 2305 2310

Thr Val Ala Asn Gly Trp Val Lys Gly Ile Ala Ala Gly Thr Ala

2315 2320 2325

Lys Ile Thr Ala Thr Ser Val Asn Gly Lys Thr Ala Val Cys Val

2330 2335 2340

Ile Thr Val Lys Ala Gln Pro Gln Asn Leu Pro Thr Gly Val Ser

2345 2350 2355

Leu Asn Lys Lys Thr Ala Ser Val Lys Leu Asn Lys Thr Leu Thr

2360 2365 2370

Leu Ser Ala Val Val Gln Pro Ser Asn Ala Asp Asn Lys Thr Val

2375 2380 2385

Lys Trp Thr Ser Asp Asn Thr Tyr Val Ala Thr Val Glu Asn Gly

2390 2395 2400

Val Val Lys Ala Val Asn Ala Gly Thr Ala Arg Ile Thr Ala Ala

2405 2410 2415

Thr Val Asn Gly His Lys Ala Thr Cys Thr Ile Thr Val Pro Gly

2420 2425 2430

Thr Lys Ile Ser Lys Ala Lys Val Ser Leu Ala Ser Ser Lys Thr

2435 2440 2445

His Thr Gly Lys Ala Leu Lys Pro Ser Val Lys Val Thr Tyr Gly

2450 2455 2460

Lys Asn Thr Leu Lys Lys Asn Thr Asp Tyr Thr Val Ser Tyr Lys

2465 2470 2475

Asn Asn Ile Asn Pro Gly Thr Ala Ser Val Thr Ile Thr Gly Lys

2480 2485 2490

Gly Lys Tyr Tyr Gly Thr Ile Asn Lys Thr Phe Ala Ile Lys Ala

2495 2500 2505

Ala Glu Gly Lys Thr Tyr Thr Val Gly Lys Gly Lys Tyr Lys Val

2510 2515 2520

Thr Asp Ala Ser Ala Lys Asn Lys Thr Val Thr Phe Met Ala Pro

2525 2530 2535

Val Lys Lys Thr Tyr Ser Ser Phe Ser Val Pro Ser Lys Val Lys

2540 2545 2550

Ile Gly Asn Asp Thr Tyr Lys Val Thr Ala Val Ala Lys Asn Ala

2555 2560 2565

Phe Lys Lys Asn Thr Lys Leu Thr Lys Leu Thr Ile Gly Ser Asn

2570 2575 2580

Val Lys Thr Ile Gly Ser Tyr Ala Phe Tyr Gly Ala Ser Gln Leu

2585 2590 2595

Lys Thr Leu Thr Leu Lys Thr Thr Gly Leu Asn Ser Val Gly Lys

2600 2605 2610

Asn Ala Phe Lys Lys Thr Asn Ala Lys Leu Thr Val Lys Val Pro

2615 2620 2625

Lys Ser Lys Leu Ala Asp Tyr Lys Lys Leu Leu Lys Gly Lys Gly

2630 2635 2640

Leu Ser Gly Lys Ala Lys Ile Gln Lys

2645 2650

<210> 28

<211> 2535

<212> DNA

<213> Robinsoniella peoriensis

<400> 28

tcaccattga gcgctgcggc agaaagtggc acaggaacca gattagtgaa agggcaaacg 60

gggtatttga cagaggaaca ggctatccgg aaccaggagc agacaaccga agaaagggag 120

cagaagttaa ccggggaaga gacagcagag gttttgatgg aaggtacaaa agacagcggg 180

attgtacaga cagaagaagt acagacaaaa gaaatgcaga cagaagatgc gcagacagaa 240

gaagtacaga cagaagaaat gcagacagaa gatgcgcaga caaaagaagt acagacagaa 300

gaaatgcaga cagaagatgc gcagacagaa gaagtacaga caaaagaaga accggcagaa 360

gaaacacaca tgaaagaaat acagacgcaa gggacaaaga aagcgtcaga taggaacgga 420

aaggcaaggg taactgaaat tctggaagat gcccaggatc cagcaaaccg gattgtgtat 480

ctgtcagacc tgcaatggaa gtcagaaaat catacagtag atagcgagct gcctaccaga 540

aaggataagt cctttggcgg cggaaaaatt acgctaaaag tggatggaac ggtaacagaa 600

tttgataagg ggattggaac acagacagat tccaccattg tgtacgatct ggagggaaag 660

ggatatacaa agtttgaaac ttacgtgggt gtagactaca gccagaaaga aaacattccg 720

ggggaagtct gcgacgtaaa attcagggtg aaaattgatg acaagattgt atcagaaacc 780

ggtgtactgg atccgctttc gaatgcggtt aagatttctg ttaacatacc cgatacagcc 840

aaaactttaa cattatacgc ggataaagta acggaaactt ggtctgatca cgccaattgg 900

gcagatgcaa aattttatca ggcactgccg gaacccgaaa atgttgcatt caaaaaaacg 960

gtagtgacac gaaagacatc agataattcg gaggctcctg ttaatccgga ttcagcagtt 1020

aacagttcta aggctgttga cggtgttatt gacagctcca gttattttga ttttggagat 1080

caggcaaata gcggagccgt aagggagtca ctctatatgg aggtagattt aaaagggagc 1140

tatttactgt ccgatataca actgtggaga tactggaaag atggcagaac ttatgcagct 1200

actgcaattg tagtagctga ggatgagaac tttgaaaatg cagcagttat ctataactcg 1260

gatacgacgg gagaaataca tcacctggga gcaggaagtg atatgctcta tgcagaaaca 1320

gaaagtggca agacatttcc ggtaccggaa aatacaaaag caaggtatat cagagtttat 1380

acatatggtg ttaatgggac atcaggcgta acaaatcaca ttgtagaatt aaaggtgaat 1440

gcttacgtat ttggagatga aatcttaccg gaaaagccgg atgacagcaa gattttccca 1500

aatgcagtta atccgctgaa gctacaggga ccgggcacga atgatcaggt aacccacccg 1560

gatgttacgg tgtttgatga gccgtggaat gggtataaat actggatggc atatacaccg 1620

aataaaccgg gaagttccta ttttgaaaat ccctgtatag ctgcatccaa cgatggcgta 1680

aactgggagt ttcctgccca gaaccctgta cagccgcgct atgacagtga aatagaaaat 1740

caaaatgaac ataactgtga taccgatatt gtatatgacc cggtaaatga ccggttgatt 1800

atgtactggg aatgggcaca ggatgaggcg gttaatggta aaacacatcg ttctgaaatc 1860

agataccgtg tttcttatga tgggattaac tggggagtgg aagacaaaac tggtgttttg 1920

atgactggac caacggatca tggctgcgcc attgccacag aaggcgaaag atattcagac 1980

ctttctccaa ccgtagtata tgataaaaca gaaaaaatct acaaaatgtg ggcaaatgat 2040

gccggagatg taggatatga aaacaaacag aataacaaag tatggtatcg gacatcccaa 2100

gacgggatca gcaattggtc ggataagact tacgtggaga attttcttgg agtaaatgaa 2160

gacgggctgc agatgtatcc atggcaccag gatatccagt gggtagagga atttcaggaa 2220

tattgggcac ttcagcaggc atttccggca ggaagcggac cggataattc ttccctgcgt 2280

ttctcgaaat ccaaagatgg tcttcattgg gagccggtat ctgaaaaagc tttaattaca 2340

gtaggggcac ccgggacctg ggatgcagga cagatatacc gttctacttt ctggtatgag 2400

ccaggtgggg caaaaggaaa cggaacattc catatctggt atgctgcatt ggcggaaggc 2460

cagtctcact gggatatagg atatacatct gcaaactatg cagatgccat gtacaaatta 2520

acgggaagca gatga 2535

<210> 29

<211> 864

<212> PRT

<213> Robinsoniella peoriensis

<400> 29

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Ser Pro Leu Ser Ala Ala Ala Glu Ser Gly Thr Gly

20 25 30

Thr Arg Leu Val Lys Gly Gln Thr Gly Tyr Leu Thr Glu Glu Gln Ala

35 40 45

Ile Arg Asn Gln Glu Gln Thr Thr Glu Glu Arg Glu Gln Lys Leu Thr

50 55 60

Gly Glu Glu Thr Ala Glu Val Leu Met Glu Gly Thr Lys Asp Ser Gly

65 70 75 80

Ile Val Gln Thr Glu Glu Val Gln Thr Lys Glu Met Gln Thr Glu Asp

85 90 95

Ala Gln Thr Glu Glu Val Gln Thr Glu Glu Met Gln Thr Glu Asp Ala

100 105 110

Gln Thr Lys Glu Val Gln Thr Glu Glu Met Gln Thr Glu Asp Ala Gln

115 120 125

Thr Glu Glu Val Gln Thr Lys Glu Glu Pro Ala Glu Glu Thr His Met

130 135 140

Lys Glu Ile Gln Thr Gln Gly Thr Lys Lys Ala Ser Asp Arg Asn Gly

145 150 155 160

Lys Ala Arg Val Thr Glu Ile Leu Glu Asp Ala Gln Asp Pro Ala Asn

165 170 175

Arg Ile Val Tyr Leu Ser Asp Leu Gln Trp Lys Ser Glu Asn His Thr

180 185 190

Val Asp Ser Glu Leu Pro Thr Arg Lys Asp Lys Ser Phe Gly Gly Gly

195 200 205

Lys Ile Thr Leu Lys Val Asp Gly Thr Val Thr Glu Phe Asp Lys Gly

210 215 220

Ile Gly Thr Gln Thr Asp Ser Thr Ile Val Tyr Asp Leu Glu Gly Lys

225 230 235 240

Gly Tyr Thr Lys Phe Glu Thr Tyr Val Gly Val Asp Tyr Ser Gln Lys

245 250 255

Glu Asn Ile Pro Gly Glu Val Cys Asp Val Lys Phe Arg Val Lys Ile

260 265 270

Asp Asp Lys Ile Val Ser Glu Thr Gly Val Leu Asp Pro Leu Ser Asn

275 280 285

Ala Val Lys Ile Ser Val Asn Ile Pro Asp Thr Ala Lys Thr Leu Thr

290 295 300

Leu Tyr Ala Asp Lys Val Thr Glu Thr Trp Ser Asp His Ala Asn Trp

305 310 315 320

Ala Asp Ala Lys Phe Tyr Gln Ala Leu Pro Glu Pro Glu Asn Val Ala

325 330 335

Phe Lys Lys Thr Val Val Thr Arg Lys Thr Ser Asp Asn Ser Glu Ala

340 345 350

Pro Val Asn Pro Asp Ser Ala Val Asn Ser Ser Lys Ala Val Asp Gly

355 360 365

Val Ile Asp Ser Ser Ser Tyr Phe Asp Phe Gly Asp Gln Ala Asn Ser

370 375 380

Gly Ala Val Arg Glu Ser Leu Tyr Met Glu Val Asp Leu Lys Gly Ser

385 390 395 400

Tyr Leu Leu Ser Asp Ile Gln Leu Trp Arg Tyr Trp Lys Asp Gly Arg

405 410 415

Thr Tyr Ala Ala Thr Ala Ile Val Val Ala Glu Asp Glu Asn Phe Glu

420 425 430

Asn Ala Ala Val Ile Tyr Asn Ser Asp Thr Thr Gly Glu Ile His His

435 440 445

Leu Gly Ala Gly Ser Asp Met Leu Tyr Ala Glu Thr Glu Ser Gly Lys

450 455 460

Thr Phe Pro Val Pro Glu Asn Thr Lys Ala Arg Tyr Ile Arg Val Tyr

465 470 475 480

Thr Tyr Gly Val Asn Gly Thr Ser Gly Val Thr Asn His Ile Val Glu

485 490 495

Leu Lys Val Asn Ala Tyr Val Phe Gly Asp Glu Ile Leu Pro Glu Lys

500 505 510

Pro Asp Asp Ser Lys Ile Phe Pro Asn Ala Val Asn Pro Leu Lys Leu

515 520 525

Gln Gly Pro Gly Thr Asn Asp Gln Val Thr His Pro Asp Val Thr Val

530 535 540

Phe Asp Glu Pro Trp Asn Gly Tyr Lys Tyr Trp Met Ala Tyr Thr Pro

545 550 555 560

Asn Lys Pro Gly Ser Ser Tyr Phe Glu Asn Pro Cys Ile Ala Ala Ser

565 570 575

Asn Asp Gly Val Asn Trp Glu Phe Pro Ala Gln Asn Pro Val Gln Pro

580 585 590

Arg Tyr Asp Ser Glu Ile Glu Asn Gln Asn Glu His Asn Cys Asp Thr

595 600 605

Asp Ile Val Tyr Asp Pro Val Asn Asp Arg Leu Ile Met Tyr Trp Glu

610 615 620

Trp Ala Gln Asp Glu Ala Val Asn Gly Lys Thr His Arg Ser Glu Ile

625 630 635 640

Arg Tyr Arg Val Ser Tyr Asp Gly Ile Asn Trp Gly Val Glu Asp Lys

645 650 655

Thr Gly Val Leu Met Thr Gly Pro Thr Asp His Gly Cys Ala Ile Ala

660 665 670

Thr Glu Gly Glu Arg Tyr Ser Asp Leu Ser Pro Thr Val Val Tyr Asp

675 680 685

Lys Thr Glu Lys Ile Tyr Lys Met Trp Ala Asn Asp Ala Gly Asp Val

690 695 700

Gly Tyr Glu Asn Lys Gln Asn Asn Lys Val Trp Tyr Arg Thr Ser Gln

705 710 715 720

Asp Gly Ile Ser Asn Trp Ser Asp Lys Thr Tyr Val Glu Asn Phe Leu

725 730 735

Gly Val Asn Glu Asp Gly Leu Gln Met Tyr Pro Trp His Gln Asp Ile

740 745 750

Gln Trp Val Glu Glu Phe Gln Glu Tyr Trp Ala Leu Gln Gln Ala Phe

755 760 765

Pro Ala Gly Ser Gly Pro Asp Asn Ser Ser Leu Arg Phe Ser Lys Ser

770 775 780

Lys Asp Gly Leu His Trp Glu Pro Val Ser Glu Lys Ala Leu Ile Thr

785 790 795 800

Val Gly Ala Pro Gly Thr Trp Asp Ala Gly Gln Ile Tyr Arg Ser Thr

805 810 815

Phe Trp Tyr Glu Pro Gly Gly Ala Lys Gly Asn Gly Thr Phe His Ile

820 825 830

Trp Tyr Ala Ala Leu Ala Glu Gly Gln Ser His Trp Asp Ile Gly Tyr

835 840 845

Thr Ser Ala Asn Tyr Ala Asp Ala Met Tyr Lys Leu Thr Gly Ser Arg

850 855 860

<210> 30

<211> 3246

<212> DNA

<213> Robinsoniella peoriensis

<400> 30

gctgagactg caacagaaga aaatgcggcg ctggaaaaaa cagttacatt gcataagagc 60

gatggaacag aactgccgga ggattatcga aatccccaaa gaccagctac catggcggta 120

gatggtatta ttgacgatac aggagagtac aactattgcg atttcggtaa agacggtgat 180

aaagcagccc tgtatatgca ggtggacctt ggaggtctgt atgatttaag cagagtcaat 240

atgtggagat actggaaaga cagcagaact tacgatgcaa cagtaattac cacatctgag 300

agcggcgatt tcacagatga agcagtcata tataattcag acaggtcgaa tgtacatgga 360

tttggggcag gaggagatga acgctacgca gagactgcct ccggacatga attcccagta 420

ccggacggta caaaggcaca ggcagtacgc gtatatgtat ttggcagcca aaacggtact 480

acaaaccaca tcaatgaatt gcaggtctgg ggaactcccc atacagagaa tccggatgta 540

aattcttatc aggtgacaat tccacaggga aatggatatc aggtaatacc ttatgaaaat 600

gacccgacga cagtggaaga aggcggttct ttccgttttc aggtactgat tgactccgat 660

aatggttaca gcgcaaccag tgcggtaaaa gcaaatggag taagtctgga ggcagttgac 720

agtgtttata ccattgagaa cattactgaa gatcaggtaa tcaccattga aggcgtacat 780

aaagcacagt atgaagtgaa attcccggaa aatccacagg gatacagtgt tgagattcag 840

aatgaaggaa gtacaacggt agactataat ggttctgtca gttttaagct tattatagac 900

gaagcttata atgaatccgt accggttgta aaagcaaacg gcggtgcagc tttgggaaaa 960

gatgagctcg gtgtatatac aattgcaaat atccaggacg atattacggt tacagttgag 1020

ggtatccagg aaaataccgt agtaaagaca aaaacaatgt acttgtctga tatggattgg 1080

aagagtgctg caaatgcagt aggtgcaaca ggagaaaaag acactccaac aaaggacctg 1140

aatcatttac agcagcagat gaaattattg gtaaacggag cagagaagtc ttttgataaa 1200

ggaattggag ttcagacgga ttcttctatc gtttatgatc tggaagacaa aggctacact 1260

tctttccaca ccctggcagg cgttgattat tcagcaatgg aatatgtaga cggagaaggc 1320

tgtgatatcc agtttaaagt atatctggat gatgtcgtag tatttgacag cggagtagtt 1380

gatgcatctg atgaggctca ggaagttaat gttgctataa catcagagaa taaagaacta 1440

aaactggaag ctaaaatggt taaagagcct tataatgact ggggaaactg ggcagatgcc 1500

agctttgaaa tggcttatcc cgaaccgtct aatgtggctt taaataaaac agttaccgtt 1560

aagaaaacag cggataactc agactctgaa gtaaattcca gcagaccggg atcaatggct 1620

gtagatggaa tcattggacc tacatcagat tctaactatt gtgattttgg acaggatggg 1680

gataatactt cccgttatct gcaggtagat ttaggggatg tttatgaact tacccagatt 1740

aatatgttta gatactgggc agatggcaga gtatataatg gtactgtaat tgcagtttcc 1800

gaaaacgcag actttagtaa tccaactttt atttataatt cagataaagc agacaaacac 1860

ggacttggcg caggcagtga tgacacttat ggagaaaccc agagtggaaa attattcgaa 1920

gttccggcgg gaaccatggg acagtatgtc cgtgtgtata tggctggttc caacaaaggt 1980

acaacgaacc atatcgctga attacaggta atgggttata atttcaatac agaaccaaaa 2040

ccatatgaag caaatgcatt tgaaaatgca gaagtttatt tagatatgcc aactcatttc 2100

caggatctgg attccaataa aaacgacgat ggaagcttaa agcacattgg cggacaggtg 2160

acacatcctg atatccaggt atttgaccaa ccgtggaacg gttataaata ctggatgatt 2220

tacacaccaa atacaatgat cacttcccag tatgaaaatc catatatcgt agcatctgaa 2280

gatggacaga catgggtaga accggaaggg atttccaatc caattgaacc agaaccgcca 2340

tcaaccagat ttcataactg tgatgcagat ctgttatacg actctgtcaa tgaccgttta 2400

cttgcttact ggaactgggc agatgacggc ggcggaattg atgacgaatt aaaagatcag 2460

aactgtcaga ttcgtctgag aatttcttat gatggaatta actggggagt tccttacgac 2520

aaagacggca atattgccac aacagctgat actgtagtaa gaatggaaac aggagataag 2580

gatttcattc ctgcaatcag cgaaaaagac cgttatggta tgctttcccc aacatttacc 2640

tatgacgatt tccgcggcat atatacaatg tgggcacaaa actcgggtga tgcgggatac 2700

aaccagtccg gaaagttcat cgaaatgaga tggtctgagg atggaataaa ctggtctgaa 2760

ccacaaaaag tgaataattt ccttggaaaa gatgagaatg gcagacagct ttggccatgg 2820

catcaggata ttcagtatat ccctgagcta caggaatatt ggggactgtc ccagtgtttc 2880

tctacatcta atcccgatgg atccgtatta tacctgacca agtccagaga tggtgtcaac 2940

tgggagcagg caggaacaca gccggtatta agggcaggaa aatcaggtac ctgggatgat 3000

ttccagattt accgttctac cttctattat gataatcagt cagacagccc tactggtggg 3060

aaatttagaa tctggtacag tgcactgcag gcaaatactt caggcaagac cgttttggct 3120

cctgatggaa cagtgtctct tcaggttgga agccaggata ccaggatctg gcgtatcggg 3180

tatacagaaa atgactacat ggaagtcatg aaagctctga cccagaataa aaactatgaa 3240

gaatga 3246

<210> 31

<211> 1101

<212> PRT

<213> Robinsoniella peoriensis

<400> 31

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro

1 5 10 15

Arg Gly Ser His Ala Glu Thr Ala Thr Glu Glu Asn Ala Ala Leu Glu

20 25 30

Lys Thr Val Thr Leu His Lys Ser Asp Gly Thr Glu Leu Pro Glu Asp

35 40 45

Tyr Arg Asn Pro Gln Arg Pro Ala Thr Met Ala Val Asp Gly Ile Ile

50 55 60

Asp Asp Thr Gly Glu Tyr Asn Tyr Cys Asp Phe Gly Lys Asp Gly Asp

65 70 75 80

Lys Ala Ala Leu Tyr Met Gln Val Asp Leu Gly Gly Leu Tyr Asp Leu

85 90 95

Ser Arg Val Asn Met Trp Arg Tyr Trp Lys Asp Ser Arg Thr Tyr Asp

100 105 110

Ala Thr Val Ile Thr Thr Ser Glu Ser Gly Asp Phe Thr Asp Glu Ala

115 120 125

Val Ile Tyr Asn Ser Asp Arg Ser Asn Val His Gly Phe Gly Ala Gly

130 135 140

Gly Asp Glu Arg Tyr Ala Glu Thr Ala Ser Gly His Glu Phe Pro Val

145 150 155 160

Pro Asp Gly Thr Lys Ala Gln Ala Val Arg Val Tyr Val Phe Gly Ser

165 170 175

Gln Asn Gly Thr Thr Asn His Ile Asn Glu Leu Gln Val Trp Gly Thr

180 185 190

Pro His Thr Glu Asn Pro Asp Val Asn Ser Tyr Gln Val Thr Ile Pro

195 200 205

Gln Gly Asn Gly Tyr Gln Val Ile Pro Tyr Glu Asn Asp Pro Thr Thr

210 215 220

Val Glu Glu Gly Gly Ser Phe Arg Phe Gln Val Leu Ile Asp Ser Asp

225 230 235 240

Asn Gly Tyr Ser Ala Thr Ser Ala Val Lys Ala Asn Gly Val Ser Leu

245 250 255

Glu Ala Val Asp Ser Val Tyr Thr Ile Glu Asn Ile Thr Glu Asp Gln

260 265 270

Val Ile Thr Ile Glu Gly Val His Lys Ala Gln Tyr Glu Val Lys Phe

275 280 285

Pro Glu Asn Pro Gln Gly Tyr Ser Val Glu Ile Gln Asn Glu Gly Ser

290 295 300

Thr Thr Val Asp Tyr Asn Gly Ser Val Ser Phe Lys Leu Ile Ile Asp

305 310 315 320

Glu Ala Tyr Asn Glu Ser Val Pro Val Val Lys Ala Asn Gly Gly Ala

325 330 335

Ala Leu Gly Lys Asp Glu Leu Gly Val Tyr Thr Ile Ala Asn Ile Gln

340 345 350

Asp Asp Ile Thr Val Thr Val Glu Gly Ile Gln Glu Asn Thr Val Val

355 360 365

Lys Thr Lys Thr Met Tyr Leu Ser Asp Met Asp Trp Lys Ser Ala Ala

370 375 380

Asn Ala Val Gly Ala Thr Gly Glu Lys Asp Thr Pro Thr Lys Asp Leu

385 390 395 400

Asn His Leu Gln Gln Gln Met Lys Leu Leu Val Asn Gly Ala Glu Lys

405 410 415

Ser Phe Asp Lys Gly Ile Gly Val Gln Thr Asp Ser Ser Ile Val Tyr

420 425 430

Asp Leu Glu Asp Lys Gly Tyr Thr Ser Phe His Thr Leu Ala Gly Val

435 440 445

Asp Tyr Ser Ala Met Glu Tyr Val Asp Gly Glu Gly Cys Asp Ile Gln

450 455 460

Phe Lys Val Tyr Leu Asp Asp Val Val Val Phe Asp Ser Gly Val Val

465 470 475 480

Asp Ala Ser Asp Glu Ala Gln Glu Val Asn Val Ala Ile Thr Ser Glu

485 490 495

Asn Lys Glu Leu Lys Leu Glu Ala Lys Met Val Lys Glu Pro Tyr Asn

500 505 510

Asp Trp Gly Asn Trp Ala Asp Ala Ser Phe Glu Met Ala Tyr Pro Glu

515 520 525

Pro Ser Asn Val Ala Leu Asn Lys Thr Val Thr Val Lys Lys Thr Ala

530 535 540

Asp Asn Ser Asp Ser Glu Val Asn Ser Ser Arg Pro Gly Ser Met Ala

545 550 555 560

Val Asp Gly Ile Ile Gly Pro Thr Ser Asp Ser Asn Tyr Cys Asp Phe

565 570 575

Gly Gln Asp Gly Asp Asn Thr Ser Arg Tyr Leu Gln Val Asp Leu Gly

580 585 590

Asp Val Tyr Glu Leu Thr Gln Ile Asn Met Phe Arg Tyr Trp Ala Asp

595 600 605

Gly Arg Val Tyr Asn Gly Thr Val Ile Ala Val Ser Glu Asn Ala Asp

610 615 620

Phe Ser Asn Pro Thr Phe Ile Tyr Asn Ser Asp Lys Ala Asp Lys His

625 630 635 640

Gly Leu Gly Ala Gly Ser Asp Asp Thr Tyr Gly Glu Thr Gln Ser Gly

645 650 655

Lys Leu Phe Glu Val Pro Ala Gly Thr Met Gly Gln Tyr Val Arg Val

660 665 670

Tyr Met Ala Gly Ser Asn Lys Gly Thr Thr Asn His Ile Ala Glu Leu

675 680 685

Gln Val Met Gly Tyr Asn Phe Asn Thr Glu Pro Lys Pro Tyr Glu Ala

690 695 700

Asn Ala Phe Glu Asn Ala Glu Val Tyr Leu Asp Met Pro Thr His Phe

705 710 715 720

Gln Asp Leu Asp Ser Asn Lys Asn Asp Asp Gly Ser Leu Lys His Ile

725 730 735

Gly Gly Gln Val Thr His Pro Asp Ile Gln Val Phe Asp Gln Pro Trp

740 745 750

Asn Gly Tyr Lys Tyr Trp Met Ile Tyr Thr Pro Asn Thr Met Ile Thr

755 760 765

Ser Gln Tyr Glu Asn Pro Tyr Ile Val Ala Ser Glu Asp Gly Gln Thr

770 775 780

Trp Val Glu Pro Glu Gly Ile Ser Asn Pro Ile Glu Pro Glu Pro Pro

785 790 795 800

Ser Thr Arg Phe His Asn Cys Asp Ala Asp Leu Leu Tyr Asp Ser Val

805 810 815

Asn Asp Arg Leu Leu Ala Tyr Trp Asn Trp Ala Asp Asp Gly Gly Gly

820 825 830

Ile Asp Asp Glu Leu Lys Asp Gln Asn Cys Gln Ile Arg Leu Arg Ile

835 840 845

Ser Tyr Asp Gly Ile Asn Trp Gly Val Pro Tyr Asp Lys Asp Gly Asn

850 855 860

Ile Ala Thr Thr Ala Asp Thr Val Val Arg Met Glu Thr Gly Asp Lys

865 870 875 880

Asp Phe Ile Pro Ala Ile Ser Glu Lys Asp Arg Tyr Gly Met Leu Ser

885 890 895

Pro Thr Phe Thr Tyr Asp Asp Phe Arg Gly Ile Tyr Thr Met Trp Ala

900 905 910

Gln Asn Ser Gly Asp Ala Gly Tyr Asn Gln Ser Gly Lys Phe Ile Glu

915 920 925

Met Arg Trp Ser Glu Asp Gly Ile Asn Trp Ser Glu Pro Gln Lys Val

930 935 940

Asn Asn Phe Leu Gly Lys Asp Glu Asn Gly Arg Gln Leu Trp Pro Trp

945 950 955 960

His Gln Asp Ile Gln Tyr Ile Pro Glu Leu Gln Glu Tyr Trp Gly Leu

965 970 975

Ser Gln Cys Phe Ser Thr Ser Asn Pro Asp Gly Ser Val Leu Tyr Leu

980 985 990

Thr Lys Ser Arg Asp Gly Val Asn Trp Glu Gln Ala Gly Thr Gln Pro

995 1000 1005

Val Leu Arg Ala Gly Lys Ser Gly Thr Trp Asp Asp Phe Gln Ile

1010 1015 1020

Tyr Arg Ser Thr Phe Tyr Tyr Asp Asn Gln Ser Asp Ser Pro Thr

1025 1030 1035

Gly Gly Lys Phe Arg Ile Trp Tyr Ser Ala Leu Gln Ala Asn Thr

1040 1045 1050

Ser Gly Lys Thr Val Leu Ala Pro Asp Gly Thr Val Ser Leu Gln

1055 1060 1065

Val Gly Ser Gln Asp Thr Arg Ile Trp Arg Ile Gly Tyr Thr Glu

1070 1075 1080

Asn Asp Tyr Met Glu Val Met Lys Ala Leu Thr Gln Asn Lys Asn

1085 1090 1095

Tyr Glu Glu

1100

<210> 32

<211> 528

<212> PRT

<213> Clostridium tertium

<400> 32

His Ser Gly Gln Tyr Trp Leu Val Phe Gln Pro Asp Asn Asp Val Leu

1 5 10 15

Gln Thr Lys Thr Asn Pro Ser Ser Met Lys Gln Ser Ala Asn Asn Asn

20 25 30

Pro Tyr Asn Tyr Asn Ile Leu Pro Asn Ser Phe Pro Ile Gly Thr Gly

35 40 45

Tyr Asn Ala Tyr Lys Gly Asp Val Ser Phe Tyr Ala Thr Phe Lys Glu

50 55 60

Ala Ser Ser Gln Ala Ile Pro Gln Asn Ser Trp Ala Leu Lys Tyr Val

65 70 75 80

Asp Ser Glu Glu Thr Thr Gly Glu Asn Gly Arg Ala Thr Asn Ala Phe

85 90 95

Asp Gly Asn Asn Asn Thr Ile Trp His Thr Lys Tyr Ser Gly Gly Asn

100 105 110

Ala Ala Pro Met Pro His Glu Ile Gln Ile Asp Leu Arg Gly Val Tyr

115 120 125

Asn Ile Asn Gln Ile Asn Tyr Leu Pro Arg Gln Asp Gly Gly Thr Asn

130 135 140

Gly Thr Ile Lys Asp Tyr Glu Val Tyr Leu Ser Leu Asp Gly Val Asn

145 150 155 160

Trp Gly Gln Pro Ile Ser Lys Gly Thr Phe Glu Ser Asn Ser Thr Glu

165 170 175

Lys Ile Val Lys Phe Asn Glu Thr Lys Ser Arg Tyr Val Lys Leu Lys

180 185 190

Ala Leu Ser Glu Ile Asn Asn Lys Gln Phe Thr Thr Val Ala Asp Leu

195 200 205

Lys Val Phe Gly Trp Glu Ile Ser Lys Ile Glu Lys Pro Leu Gln Asn

210 215 220

Ala Glu Thr Tyr Leu Asn Ile Pro Thr Tyr Asp Gly Leu Asn Gln Ser

225 230 235 240

Thr His Pro Asp Val Lys Tyr Phe Lys Asn Gly Trp Asn Gly Tyr Lys

245 250 255

Tyr Trp Met Ile Met Thr Pro Asn Arg Thr Gly Ser Ser Val Ala Glu

260 265 270

Asn Pro Ser Ile Leu Ala Ser Asp Asp Gly Ile Asn Trp Glu Val Pro

275 280 285

Ala Gly Val Thr Asn Pro Ile Ala Pro Met Pro Gln Val Gly His Asn

290 295 300

Cys Asp Val Asp Met Ile Tyr Asn Glu Ala Thr Asp Glu Leu Trp Val

305 310 315 320

Tyr Trp Val Glu Ser Asp Asp Ile Thr Lys Gly Trp Val Lys Leu Ile

325 330 335

Lys Ser Lys Asp Gly Val Asn Trp Ser Ser Gln Gln Val Val Val Asp

340 345 350

Asp Asn Arg Ala Lys Tyr Ser Thr Leu Ser Pro Ser Ile Ile Phe Lys

355 360 365

Asp Asn Lys Tyr Tyr Met Trp Ser Val Asn Thr Gly Asn Ser Gly Trp

370 375 380

Asn Asn Gln Ser Asn Lys Val Glu Leu Arg Glu Ser Ser Asp Gly Val

385 390 395 400

Asn Trp Ser Asn Pro Thr Val Val Asn Thr Leu Ala Gln Asp Gly Ser

405 410 415

Gln Ile Trp His Val Asn Val Glu Tyr Ile Pro Ser Lys Asn Glu Tyr

420 425 430

Trp Ala Ile Tyr Pro Ala Tyr Lys Asn Gly Thr Gly Ser Asp Lys Thr

435 440 445

Glu Leu Tyr Tyr Ala Lys Ser Ser Asp Gly Val Asn Trp Thr Thr Tyr

450 455 460

Lys Asn Pro Ile Leu Ser Lys Gly Thr Ser Gly Lys Trp Asp Asp Met

465 470 475 480

Glu Ile Tyr Arg Ser Cys Phe Val Tyr Asp Glu Asp Thr Asn Met Ile

485 490 495

Lys Val Trp Tyr Gly Ala Val Ser Gln Asn Pro Gln Ile Trp Lys Ile

500 505 510

Gly Phe Thr Glu Asn Asp Tyr Asp Lys Phe Ile Glu Gly Leu Thr Gln

515 520 525

<210> 33

<211> 449

<212> PRT

<213> Ruthenibacterium lactatiformans

<400> 33

His Glu Glu Thr Asp Leu Leu Val Asn Gly Gly Phe Glu Thr Gly Asp

1 5 10 15

Ser Thr Gly Trp Asn Trp Phe Asn Asn Ala Val Val Asp Ser Ala Ala

20 25 30

Pro His Ser Gly Asn Tyr Cys Ala Lys Val Ala Lys Asn Ser Ser Tyr

35 40 45

Glu Gln Val Val Thr Val Ser Pro Asp Thr Lys Tyr Val Leu Thr Gly

50 55 60

Trp Ala Lys Ser Glu Gly Ser Ser Val Met Thr Leu Gly Val Lys Asn

65 70 75 80

Tyr Gly Gly Gln Glu Thr Phe Ser Ala Thr Leu Ser Ala Asp Tyr Gln

85 90 95

Gln Leu Ala Val Thr Phe Thr Thr Gly Pro Asn Ala Gln Thr Ala Thr

100 105 110

Ile Tyr Gly Tyr Arg Gln Asn Ser Gly Ser Gly Ala Gly Tyr Phe Asp

115 120 125

Asp Val Glu Leu Thr Ala Val Gln Asp Phe Ala Pro Tyr Gln Pro Leu

130 135 140

Ala Asn Ala Ile Ala Pro Gln Ala Ile Pro Thr Tyr Asp Gly Ala Asn

145 150 155 160

Gln Pro Thr His Pro Ser Val Val Lys Phe Glu Gln Pro Trp Asn Gly

165 170 175

Tyr Leu Tyr Trp Met Ala Met Thr Pro Tyr Pro Phe Asn Asp Gly Ser

180 185 190

Tyr Glu Asn Pro Ser Ile Val Ala Ser Asn Asp Gly Glu Asn Trp Ile

195 200 205

Val Pro Glu Gly Val Ser Asn Pro Leu Ala Gly Thr Pro Ser Pro Gly

210 215 220

His Asn Cys Asp Val Asp Leu Val Tyr Val Pro Ala Ser Asp Glu Leu

225 230 235 240

Arg Met Tyr Tyr Val Glu Ala Asp Asp Ile Ile Ser Ser Arg Val Lys

245 250 255

Met Ile Ser Ser Arg Asp Gly Val His Trp Ser Glu Pro Gln Val Val

260 265 270

Met Gln Asp Leu Val Arg Lys Tyr Ser Ile Leu Ser Pro Ser Ile Glu

275 280 285

Ile Leu Pro Asp Gly Thr Tyr Met Met Trp Tyr Val Asp Thr Gly Asn

290 295 300

Ala Gly Trp Asn Ser Gln Asn Asn Gln Val Lys Tyr Arg Thr Ser Ala

305 310 315 320

Asp Gly Ile Lys Trp Ser Gly Ala Val Thr Cys Thr Asp Phe Val Gln

325 330 335

Pro Gly Tyr Gln Ile Trp His Ile Asp Val His Tyr Asp Thr Ser Ser

340 345 350

Gly Ala Tyr Tyr Ala Val Tyr Pro Ala Tyr Pro Asn Gly Thr Asp Cys

355 360 365

Asp His Cys Asn Leu Phe Phe Ala Val Asn Arg Thr Gly Lys Gln Trp

370 375 380

Glu Thr Phe Ser Arg Pro Ile Leu Lys Pro Ser Thr Glu Gly Gly Trp

385 390 395 400

Asp Asp Phe Cys Ile Tyr Arg Ser Ser Met Leu Ile Asp Asp Gly Met

405 410 415

Leu Lys Val Trp Tyr Gly Ala Lys Lys Gln Glu Asp Ser Ser Trp His

420 425 430

Thr Gly Leu Thr Met Arg Asp Phe Ser Glu Phe Met Lys Ile Leu Glu

435 440 445

Arg

<210> 34

<211> 845

<212> PRT

<213> Robinsoniella peoriensis

<400> 34

His Ser Pro Leu Ser Ala Ala Ala Glu Ser Gly Thr Gly Thr Arg Leu

1 5 10 15

Val Lys Gly Gln Thr Gly Tyr Leu Thr Glu Glu Gln Ala Ile Arg Asn

20 25 30

Gln Glu Gln Thr Thr Glu Glu Arg Glu Gln Lys Leu Thr Gly Glu Glu

35 40 45

Thr Ala Glu Val Leu Met Glu Gly Thr Lys Asp Ser Gly Ile Val Gln

50 55 60

Thr Glu Glu Val Gln Thr Lys Glu Met Gln Thr Glu Asp Ala Gln Thr

65 70 75 80

Glu Glu Val Gln Thr Glu Glu Met Gln Thr Glu Asp Ala Gln Thr Lys

85 90 95

Glu Val Gln Thr Glu Glu Met Gln Thr Glu Asp Ala Gln Thr Glu Glu

100 105 110

Val Gln Thr Lys Glu Glu Pro Ala Glu Glu Thr His Met Lys Glu Ile

115 120 125

Gln Thr Gln Gly Thr Lys Lys Ala Ser Asp Arg Asn Gly Lys Ala Arg

130 135 140

Val Thr Glu Ile Leu Glu Asp Ala Gln Asp Pro Ala Asn Arg Ile Val

145 150 155 160

Tyr Leu Ser Asp Leu Gln Trp Lys Ser Glu Asn His Thr Val Asp Ser

165 170 175

Glu Leu Pro Thr Arg Lys Asp Lys Ser Phe Gly Gly Gly Lys Ile Thr

180 185 190

Leu Lys Val Asp Gly Thr Val Thr Glu Phe Asp Lys Gly Ile Gly Thr

195 200 205

Gln Thr Asp Ser Thr Ile Val Tyr Asp Leu Glu Gly Lys Gly Tyr Thr

210 215 220

Lys Phe Glu Thr Tyr Val Gly Val Asp Tyr Ser Gln Lys Glu Asn Ile

225 230 235 240

Pro Gly Glu Val Cys Asp Val Lys Phe Arg Val Lys Ile Asp Asp Lys

245 250 255

Ile Val Ser Glu Thr Gly Val Leu Asp Pro Leu Ser Asn Ala Val Lys

260 265 270

Ile Ser Val Asn Ile Pro Asp Thr Ala Lys Thr Leu Thr Leu Tyr Ala

275 280 285

Asp Lys Val Thr Glu Thr Trp Ser Asp His Ala Asn Trp Ala Asp Ala

290 295 300

Lys Phe Tyr Gln Ala Leu Pro Glu Pro Glu Asn Val Ala Phe Lys Lys

305 310 315 320

Thr Val Val Thr Arg Lys Thr Ser Asp Asn Ser Glu Ala Pro Val Asn

325 330 335

Pro Asp Ser Ala Val Asn Ser Ser Lys Ala Val Asp Gly Val Ile Asp

340 345 350

Ser Ser Ser Tyr Phe Asp Phe Gly Asp Gln Ala Asn Ser Gly Ala Val

355 360 365

Arg Glu Ser Leu Tyr Met Glu Val Asp Leu Lys Gly Ser Tyr Leu Leu

370 375 380

Ser Asp Ile Gln Leu Trp Arg Tyr Trp Lys Asp Gly Arg Thr Tyr Ala

385 390 395 400

Ala Thr Ala Ile Val Val Ala Glu Asp Glu Asn Phe Glu Asn Ala Ala

405 410 415

Val Ile Tyr Asn Ser Asp Thr Thr Gly Glu Ile His His Leu Gly Ala

420 425 430

Gly Ser Asp Met Leu Tyr Ala Glu Thr Glu Ser Gly Lys Thr Phe Pro

435 440 445

Val Pro Glu Asn Thr Lys Ala Arg Tyr Ile Arg Val Tyr Thr Tyr Gly

450 455 460

Val Asn Gly Thr Ser Gly Val Thr Asn His Ile Val Glu Leu Lys Val

465 470 475 480

Asn Ala Tyr Val Phe Gly Asp Glu Ile Leu Pro Glu Lys Pro Asp Asp

485 490 495

Ser Lys Ile Phe Pro Asn Ala Val Asn Pro Leu Lys Leu Gln Gly Pro

500 505 510

Gly Thr Asn Asp Gln Val Thr His Pro Asp Val Thr Val Phe Asp Glu

515 520 525

Pro Trp Asn Gly Tyr Lys Tyr Trp Met Ala Tyr Thr Pro Asn Lys Pro

530 535 540

Gly Ser Ser Tyr Phe Glu Asn Pro Cys Ile Ala Ala Ser Asn Asp Gly

545 550 555 560

Val Asn Trp Glu Phe Pro Ala Gln Asn Pro Val Gln Pro Arg Tyr Asp

565 570 575

Ser Glu Ile Glu Asn Gln Asn Glu His Asn Cys Asp Thr Asp Ile Val

580 585 590

Tyr Asp Pro Val Asn Asp Arg Leu Ile Met Tyr Trp Glu Trp Ala Gln

595 600 605

Asp Glu Ala Val Asn Gly Lys Thr His Arg Ser Glu Ile Arg Tyr Arg

610 615 620

Val Ser Tyr Asp Gly Ile Asn Trp Gly Val Glu Asp Lys Thr Gly Val

625 630 635 640

Leu Met Thr Gly Pro Thr Asp His Gly Cys Ala Ile Ala Thr Glu Gly

645 650 655

Glu Arg Tyr Ser Asp Leu Ser Pro Thr Val Val Tyr Asp Lys Thr Glu

660 665 670

Lys Ile Tyr Lys Met Trp Ala Asn Asp Ala Gly Asp Val Gly Tyr Glu

675 680 685

Asn Lys Gln Asn Asn Lys Val Trp Tyr Arg Thr Ser Gln Asp Gly Ile

690 695 700

Ser Asn Trp Ser Asp Lys Thr Tyr Val Glu Asn Phe Leu Gly Val Asn

705 710 715 720

Glu Asp Gly Leu Gln Met Tyr Pro Trp His Gln Asp Ile Gln Trp Val

725 730 735

Glu Glu Phe Gln Glu Tyr Trp Ala Leu Gln Gln Ala Phe Pro Ala Gly

740 745 750

Ser Gly Pro Asp Asn Ser Ser Leu Arg Phe Ser Lys Ser Lys Asp Gly

755 760 765

Leu His Trp Glu Pro Val Ser Glu Lys Ala Leu Ile Thr Val Gly Ala

770 775 780

Pro Gly Thr Trp Asp Ala Gly Gln Ile Tyr Arg Ser Thr Phe Trp Tyr

785 790 795 800

Glu Pro Gly Gly Ala Lys Gly Asn Gly Thr Phe His Ile Trp Tyr Ala

805 810 815

Ala Leu Ala Glu Gly Gln Ser His Trp Asp Ile Gly Tyr Thr Ser Ala

820 825 830

Asn Tyr Ala Asp Ala Met Tyr Lys Leu Thr Gly Ser Arg

835 840 845

<210> 35

<211> 1082

<212> PRT

<213> Robinsoniella peoriensis

<400> 35

His Ala Glu Thr Ala Thr Glu Glu Asn Ala Ala Leu Glu Lys Thr Val

1 5 10 15

Thr Leu His Lys Ser Asp Gly Thr Glu Leu Pro Glu Asp Tyr Arg Asn

20 25 30

Pro Gln Arg Pro Ala Thr Met Ala Val Asp Gly Ile Ile Asp Asp Thr

35 40 45

Gly Glu Tyr Asn Tyr Cys Asp Phe Gly Lys Asp Gly Asp Lys Ala Ala

50 55 60

Leu Tyr Met Gln Val Asp Leu Gly Gly Leu Tyr Asp Leu Ser Arg Val

65 70 75 80

Asn Met Trp Arg Tyr Trp Lys Asp Ser Arg Thr Tyr Asp Ala Thr Val

85 90 95

Ile Thr Thr Ser Glu Ser Gly Asp Phe Thr Asp Glu Ala Val Ile Tyr

100 105 110

Asn Ser Asp Arg Ser Asn Val His Gly Phe Gly Ala Gly Gly Asp Glu

115 120 125

Arg Tyr Ala Glu Thr Ala Ser Gly His Glu Phe Pro Val Pro Asp Gly

130 135 140

Thr Lys Ala Gln Ala Val Arg Val Tyr Val Phe Gly Ser Gln Asn Gly

145 150 155 160

Thr Thr Asn His Ile Asn Glu Leu Gln Val Trp Gly Thr Pro His Thr

165 170 175

Glu Asn Pro Asp Val Asn Ser Tyr Gln Val Thr Ile Pro Gln Gly Asn

180 185 190

Gly Tyr Gln Val Ile Pro Tyr Glu Asn Asp Pro Thr Thr Val Glu Glu

195 200 205

Gly Gly Ser Phe Arg Phe Gln Val Leu Ile Asp Ser Asp Asn Gly Tyr

210 215 220

Ser Ala Thr Ser Ala Val Lys Ala Asn Gly Val Ser Leu Glu Ala Val

225 230 235 240

Asp Ser Val Tyr Thr Ile Glu Asn Ile Thr Glu Asp Gln Val Ile Thr

245 250 255

Ile Glu Gly Val His Lys Ala Gln Tyr Glu Val Lys Phe Pro Glu Asn

260 265 270

Pro Gln Gly Tyr Ser Val Glu Ile Gln Asn Glu Gly Ser Thr Thr Val

275 280 285

Asp Tyr Asn Gly Ser Val Ser Phe Lys Leu Ile Ile Asp Glu Ala Tyr

290 295 300

Asn Glu Ser Val Pro Val Val Lys Ala Asn Gly Gly Ala Ala Leu Gly

305 310 315 320

Lys Asp Glu Leu Gly Val Tyr Thr Ile Ala Asn Ile Gln Asp Asp Ile

325 330 335

Thr Val Thr Val Glu Gly Ile Gln Glu Asn Thr Val Val Lys Thr Lys

340 345 350

Thr Met Tyr Leu Ser Asp Met Asp Trp Lys Ser Ala Ala Asn Ala Val

355 360 365

Gly Ala Thr Gly Glu Lys Asp Thr Pro Thr Lys Asp Leu Asn His Leu

370 375 380

Gln Gln Gln Met Lys Leu Leu Val Asn Gly Ala Glu Lys Ser Phe Asp

385 390 395 400

Lys Gly Ile Gly Val Gln Thr Asp Ser Ser Ile Val Tyr Asp Leu Glu

405 410 415

Asp Lys Gly Tyr Thr Ser Phe His Thr Leu Ala Gly Val Asp Tyr Ser

420 425 430

Ala Met Glu Tyr Val Asp Gly Glu Gly Cys Asp Ile Gln Phe Lys Val

435 440 445

Tyr Leu Asp Asp Val Val Val Phe Asp Ser Gly Val Val Asp Ala Ser

450 455 460

Asp Glu Ala Gln Glu Val Asn Val Ala Ile Thr Ser Glu Asn Lys Glu

465 470 475 480

Leu Lys Leu Glu Ala Lys Met Val Lys Glu Pro Tyr Asn Asp Trp Gly

485 490 495

Asn Trp Ala Asp Ala Ser Phe Glu Met Ala Tyr Pro Glu Pro Ser Asn

500 505 510

Val Ala Leu Asn Lys Thr Val Thr Val Lys Lys Thr Ala Asp Asn Ser

515 520 525

Asp Ser Glu Val Asn Ser Ser Arg Pro Gly Ser Met Ala Val Asp Gly

530 535 540

Ile Ile Gly Pro Thr Ser Asp Ser Asn Tyr Cys Asp Phe Gly Gln Asp

545 550 555 560

Gly Asp Asn Thr Ser Arg Tyr Leu Gln Val Asp Leu Gly Asp Val Tyr

565 570 575

Glu Leu Thr Gln Ile Asn Met Phe Arg Tyr Trp Ala Asp Gly Arg Val

580 585 590

Tyr Asn Gly Thr Val Ile Ala Val Ser Glu Asn Ala Asp Phe Ser Asn

595 600 605

Pro Thr Phe Ile Tyr Asn Ser Asp Lys Ala Asp Lys His Gly Leu Gly

610 615 620

Ala Gly Ser Asp Asp Thr Tyr Gly Glu Thr Gln Ser Gly Lys Leu Phe

625 630 635 640

Glu Val Pro Ala Gly Thr Met Gly Gln Tyr Val Arg Val Tyr Met Ala

645 650 655

Gly Ser Asn Lys Gly Thr Thr Asn His Ile Ala Glu Leu Gln Val Met

660 665 670

Gly Tyr Asn Phe Asn Thr Glu Pro Lys Pro Tyr Glu Ala Asn Ala Phe

675 680 685

Glu Asn Ala Glu Val Tyr Leu Asp Met Pro Thr His Phe Gln Asp Leu

690 695 700

Asp Ser Asn Lys Asn Asp Asp Gly Ser Leu Lys His Ile Gly Gly Gln

705 710 715 720

Val Thr His Pro Asp Ile Gln Val Phe Asp Gln Pro Trp Asn Gly Tyr

725 730 735

Lys Tyr Trp Met Ile Tyr Thr Pro Asn Thr Met Ile Thr Ser Gln Tyr

740 745 750

Glu Asn Pro Tyr Ile Val Ala Ser Glu Asp Gly Gln Thr Trp Val Glu

755 760 765

Pro Glu Gly Ile Ser Asn Pro Ile Glu Pro Glu Pro Pro Ser Thr Arg

770 775 780

Phe His Asn Cys Asp Ala Asp Leu Leu Tyr Asp Ser Val Asn Asp Arg

785 790 795 800

Leu Leu Ala Tyr Trp Asn Trp Ala Asp Asp Gly Gly Gly Ile Asp Asp

805 810 815

Glu Leu Lys Asp Gln Asn Cys Gln Ile Arg Leu Arg Ile Ser Tyr Asp

820 825 830

Gly Ile Asn Trp Gly Val Pro Tyr Asp Lys Asp Gly Asn Ile Ala Thr

835 840 845

Thr Ala Asp Thr Val Val Arg Met Glu Thr Gly Asp Lys Asp Phe Ile

850 855 860

Pro Ala Ile Ser Glu Lys Asp Arg Tyr Gly Met Leu Ser Pro Thr Phe

865 870 875 880

Thr Tyr Asp Asp Phe Arg Gly Ile Tyr Thr Met Trp Ala Gln Asn Ser

885 890 895

Gly Asp Ala Gly Tyr Asn Gln Ser Gly Lys Phe Ile Glu Met Arg Trp

900 905 910

Ser Glu Asp Gly Ile Asn Trp Ser Glu Pro Gln Lys Val Asn Asn Phe

915 920 925

Leu Gly Lys Asp Glu Asn Gly Arg Gln Leu Trp Pro Trp His Gln Asp

930 935 940

Ile Gln Tyr Ile Pro Glu Leu Gln Glu Tyr Trp Gly Leu Ser Gln Cys

945 950 955 960

Phe Ser Thr Ser Asn Pro Asp Gly Ser Val Leu Tyr Leu Thr Lys Ser

965 970 975

Arg Asp Gly Val Asn Trp Glu Gln Ala Gly Thr Gln Pro Val Leu Arg

980 985 990

Ala Gly Lys Ser Gly Thr Trp Asp Asp Phe Gln Ile Tyr Arg Ser Thr

995 1000 1005

Phe Tyr Tyr Asp Asn Gln Ser Asp Ser Pro Thr Gly Gly Lys Phe

1010 1015 1020

Arg Ile Trp Tyr Ser Ala Leu Gln Ala Asn Thr Ser Gly Lys Thr

1025 1030 1035

Val Leu Ala Pro Asp Gly Thr Val Ser Leu Gln Val Gly Ser Gln

1040 1045 1050

Asp Thr Arg Ile Trp Arg Ile Gly Tyr Thr Glu Asn Asp Tyr Met

1055 1060 1065

Glu Val Met Lys Ala Leu Thr Gln Asn Lys Asn Tyr Glu Glu

1070 1075 1080

<210> 36

<211> 986

<212> PRT

<213> Clostridium tertium

<400> 36

His Tyr Asn Leu Ile Asp Asn Ile Ser Val Glu Lys Leu Asp Thr Asp

1 5 10 15

Ile Ser Gln Ala Asn Glu Asn Val Phe Leu Asn Gly Asn Gly Ile Ala

20 25 30

Leu Glu Val Asp Asn Arg Gly Ala Thr Cys Ile Tyr Leu Val Asp Glu

35 40 45

Asn Gly Val Lys Thr Lys Ala Thr Thr Ser Leu Asp Thr Ala Asp Phe

50 55 60

Ser Gly Tyr Pro Ile Ile Gly Gly Gln Lys Ile Arg Asp Phe Val Ile

65 70 75 80

Ile Ser Lys Asn Leu Glu Glu Asn Ile Asn Ser Ile Leu Gly Val Gly

85 90 95

Asn Arg Leu Thr Ile Ile Ser Lys Ser Ser Ser Thr Asn Leu Ile Arg

100 105 110

Lys Ile Val Phe Glu Thr Ser Asn Ser Asn Pro Gly Ala Ile Tyr Ser

115 120 125

Thr Val Ser Tyr Lys Ala Glu Ser Asn Asp Leu Leu Val Asp Ser Phe

130 135 140

His Glu Asn Glu Tyr Thr Met Ser Leu Gly Gln Gly Pro Phe Leu Ala

145 150 155 160

Tyr Gln Gly Cys Ala Asp Gln Gln Gly Ala Asn Thr Ile Val Asn Val

165 170 175

Thr Asn Gly Tyr Asn His Asn Ser Gly Gln Asn Asn Tyr Ser Val Gly

180 185 190

Val Pro Phe Ser Tyr Val Tyr Asn Ser Val Gly Gly Ile Gly Ile Gly

195 200 205

Asp Ala Ser Thr Ser Arg Arg Glu Phe Lys Leu Pro Ile Ile Gly Lys

210 215 220

Asp Asn Thr Val Ser Leu Gly Met Glu Trp Asn Gly Gln Thr Leu Lys

225 230 235 240

Lys Gly Ala Glu Thr Ala Ile Gly Thr Ser Val Ile Thr Thr Thr Asn

245 250 255

Gly Asp Tyr Tyr Ser Gly Leu Lys Ser Tyr Ala Glu Val Met Lys Asp

260 265 270

Lys Gly Ile Ser Ala Pro Ala Ser Ile Pro Asp Ile Ala Tyr Asp Ser

275 280 285

Arg Trp Glu Ser Trp Gly Phe Glu Phe Asp Phe Thr Ile Glu Lys Ile

290 295 300

Val Asn Lys Leu Asp Glu Leu Lys Ala Met Gly Ile Lys Gln Ile Thr

305 310 315 320

Leu Asp Asp Gly Trp Tyr Thr Tyr Ala Gly Asp Trp Lys Leu Ser Pro

325 330 335

Gln Lys Phe Pro Asn Gly Asn Ala Asp Met Lys Tyr Leu Thr Asp Glu

340 345 350

Ile His Lys Arg Gly Met Thr Ala Ile Leu Trp Trp Arg Pro Val Asp

355 360 365

Gly Gly Ile Asn Ser Lys Leu Val Ser Glu His Pro Glu Trp Phe Ile

370 375 380

Lys Asn Ser Gln Gly Asn Met Val Arg Leu Pro Gly Pro Gly Gly Gly

385 390 395 400

Asn Gly Gly Thr Ala Gly Tyr Ala Leu Cys Pro Asn Ser Glu Gly Ser

405 410 415

Ile Gln His His Lys Asp Phe Val Thr Val Ala Leu Glu Glu Trp Gly

420 425 430

Phe Asp Gly Phe Lys Glu Asp Tyr Val Trp Gly Ile Pro Lys Cys Tyr

435 440 445

Asp Ser Ser His Lys His Ser Ser Leu Ser Asp Thr Leu Glu Asn Gln

450 455 460

Tyr Lys Phe Tyr Glu Ala Ile Tyr Glu Gln Ser Ile Ala Ile Asn Pro

465 470 475 480

Asp Thr Phe Ile Glu Leu Cys Asn Cys Gly Thr Pro Gln Asp Phe Tyr

485 490 495

Ser Thr Pro Tyr Val Asn His Ala Pro Thr Ala Asp Pro Ile Ser Arg

500 505 510

Val Gln Thr Arg Thr Arg Val Lys Ala Phe Lys Ala Ile Phe Gly Asp

515 520 525

Asp Phe Pro Val Thr Thr Asp His Asn Ser Val Trp Leu Pro Ser Ala

530 535 540

Leu Gly Thr Gly Ser Val Met Ile Thr Lys His Thr Thr Leu Ser Ser

545 550 555 560

Ser Asp Arg Glu Gln Tyr Asn Lys Tyr Phe Gly Leu Ala Arg Asp Leu

565 570 575

Glu Leu Ala Lys Gly Glu Phe Ile Gly Asn Leu Tyr Lys Tyr Gly Ile

580 585 590

Asp Pro Leu Glu Ser Tyr Val Ile Arg Lys Gly Glu Asp Ile Tyr Tyr

595 600 605

Ser Phe Tyr Lys Asp Asn Ser Ser Tyr Ser Gly Asn Ile Glu Ile Lys

610 615 620

Gly Leu Asp Ser Asn Ala Thr Tyr Arg Ile Glu Asp Tyr Val Asn Asn

625 630 635 640

Arg Val Ile Ala Arg Gly Val Lys Gly Pro Thr Ala Thr Ile Asn Thr

645 650 655

Ser Phe Thr Asp Asn Leu Leu Val Arg Ala Ile Pro Asp Asp Thr Pro

660 665 670

Ala Glu Val Thr Thr Phe Asp Val Gly Asn Asn Thr Ile Leu Ser Ser

675 680 685

Thr Asp Ser Gly Asn Ser Lys Tyr Leu Asn Ala Val Ser Thr Thr Leu

690 695 700

Glu Lys Thr Ala Thr Ile Asp Ser Leu Ser Ile Tyr Ile Gly Asn Asn

705 710 715 720

Ser Glu Asn Gly Lys Leu Gln Ile Ala Ile Tyr Asp Asp Asn Asn Gly

725 730 735

Lys Pro Gly Thr Lys Lys Ala Tyr Val Glu Glu Phe Val Pro Thr Lys

740 745 750

Asn Ser Trp Asn Thr Lys Lys Val Val Asn Ser Val Thr Leu Pro Ser

755 760 765

Gly Gln Tyr Trp Leu Val Phe Gln Pro Asp Asn Asp Val Leu Gln Thr

770 775 780

Lys Thr Asn Pro Ser Ser Met Lys Gln Ser Ala Asn Asn Asn Pro Tyr

785 790 795 800

Asn Tyr Asn Ile Leu Pro Asn Ser Phe Pro Ile Gly Thr Gly Tyr Asn

805 810 815

Ala Tyr Lys Gly Asp Val Ser Phe Tyr Ala Thr Phe Lys Glu Ala Ser

820 825 830

Ser Gln Ala Ile Pro Gln Asn Ser Trp Ala Leu Lys Tyr Val Asp Ser

835 840 845

Glu Glu Thr Thr Gly Glu Asn Gly Arg Ala Thr Asn Ala Phe Asp Gly

850 855 860

Asn Asn Asn Thr Ile Trp His Thr Lys Tyr Ser Gly Gly Asn Ala Ala

865 870 875 880

Pro Met Pro His Glu Ile Gln Ile Asp Leu Arg Gly Val Tyr Asn Ile

885 890 895

Asn Gln Ile Asn Tyr Leu Pro Arg Gln Asp Gly Gly Thr Asn Gly Thr

900 905 910

Ile Lys Asp Tyr Glu Val Tyr Leu Ser Leu Asp Gly Val Asn Trp Gly

915 920 925

Gln Pro Ile Ser Lys Gly Thr Phe Glu Ser Asn Ser Thr Glu Lys Ile

930 935 940

Val Lys Phe Asn Glu Thr Lys Ser Arg Tyr Val Lys Leu Lys Ala Leu

945 950 955 960

Ser Glu Ile Asn Asn Lys Gln Phe Thr Thr Val Ala Asp Leu Lys Val

965 970 975

Phe Gly Trp Glu Ile Ser Lys Ile Glu Lys

980 985

<210> 37

<211> 1262

<212> PRT

<213> Robinsoniella peoriensis

<400> 37

His Gly Asn Gly Leu Glu Val Lys Ala Ser Pro Arg Glu Val Ala Gln

1 5 10 15

Ile Thr Gly Asn Gly Val Ser Val Thr Phe Phe Gln Glu Asp Gly Thr

20 25 30

Val Gln Leu Ser Cys Ile Glu Asp Asp Gly Asn Thr Ala Phe Met Thr

35 40 45

Arg Asn Ser Glu Val Ser Tyr Pro Val Val Gly Gly Glu Glu Val Thr

50 55 60

Asp Phe Ser Asp Phe Gln Cys Glu Val Gln Glu Asn Val Thr Gly Ala

65 70 75 80

Ala Gly Ala Gly Ser Arg Met Thr Ile Thr Ser Ile Ser Ser Gly Arg

85 90 95

Gly Ile Gln Arg Ser Val Val Ile Glu Thr Val Asp Glu Val Lys Gly

100 105 110

Leu Leu His Ile Ser Ser Ser Tyr Arg Ala Glu Glu Glu Val Asp Ala

115 120 125

Asp Glu Phe Ile Asp Ser Arg Phe Ser Leu Asp Asn Pro Ser Asp Thr

130 135 140

Val Trp Ser Tyr Asn Gly Gly Gly Glu Gly Ala Gln Ser Arg Tyr Asp

145 150 155 160

Thr Leu Gln Lys Ile Asp Leu Ser Asp Gly Glu Ser Phe Tyr Arg Glu

165 170 175

Asn Leu Gln Asn Gln Thr Ala Ala Gly Ile Pro Val Ala Asp Ile Tyr

180 185 190

Gly Lys Asp Gly Gly Ile Thr Val Gly Asp Ala Ser Val Thr Arg Arg

195 200 205

Gln Leu Ser Thr Pro Val Asn Glu Arg Asn Gly Thr Ala Tyr Val Ser

210 215 220

Val Lys His Pro Gly Ala Val Ile Thr Gln Arg Glu Thr Glu Ile Ser

225 230 235 240

Gln Ser Phe Val Asn Val His Arg Gly Asp Tyr Tyr Ser Gly Leu Arg

245 250 255

Gly Tyr Ala Asp Gly Met Lys Gln Ile Gly Phe Thr Thr Leu Ser Arg

260 265 270

Glu Gln Ile Pro Glu Ser Ser Tyr Asp Leu Arg Trp Glu Ser Trp Gly

275 280 285

Trp Glu Phe Asp Trp Thr Val Glu Leu Ile Ile Asn Lys Leu Asp Glu

290 295 300

Leu Lys Glu Met Gly Ile Lys Gln Ile Thr Leu Asp Asp Gly Trp Tyr

305 310 315 320

Asn Ala Ala Gly Glu Trp Gly Leu Asn Asn Trp Lys Leu Pro Asn Gly

325 330 335

Ala Leu Asp Met Arg His Leu Thr Asp Ala Ile His Glu Arg Gly Met

340 345 350

Thr Ala Val Leu Trp Trp Arg Pro Cys Asp Gly Gly Arg Glu Asp Ser

355 360 365

Ala Leu Phe Lys Glu His Pro Glu Tyr Phe Ile Lys Asn Gln Asp Gly

370 375 380

Ser Phe Gly Lys Leu Ala Gly Pro Gly Gln Trp Asn Ser Phe Leu Gly

385 390 395 400

Ser Cys Gly Tyr Ala Leu Cys Pro Leu Ser Glu Gly Ala Val Gln Ser

405 410 415

Gln Val Asp Phe Ile Asn Arg Ala Met Asn Glu Trp Gly Phe Asp Gly

420 425 430

Phe Lys Ser Asp Tyr Val Trp Ser Leu Pro Lys Cys Tyr Ser Gln Asp

435 440 445

His His His Glu Tyr Pro Glu Glu Ser Thr Glu Gln Gln Ala Val Phe

450 455 460

Tyr Arg Ala Val Tyr Glu Ala Met Thr Asp Asn Asp Pro Asn Ala Phe

465 470 475 480

His Leu Leu Cys Asn Cys Gly Thr Pro Gln Asp Tyr Tyr Ser Leu Pro

485 490 495

Tyr Val Thr Gln Val Pro Thr Ala Asp Pro Thr Ser Val Asp Gln Thr

500 505 510

Arg Arg Arg Val Lys Ala Tyr Lys Ala Leu Cys Gly Asp Tyr Phe Pro

515 520 525

Val Thr Thr Asp His Asn Glu Val Trp Tyr Pro Ser Thr Ile Gly Thr

530 535 540

Gly Ala Ile Leu Ile Glu Lys Arg Asp Leu Ser Gly Trp Glu Glu Glu

545 550 555 560

Glu Tyr Ala Lys Trp Leu Lys Ile Ala Gln Glu Asn Gln Leu His Lys

565 570 575

Gly Thr Phe Ile Gly Asp Leu Tyr Ser Tyr Gly Tyr Asp Pro Tyr Glu

580 585 590

Thr Tyr Thr Val Tyr Lys Asp Gly Ile Met Tyr Tyr Ala Phe Tyr Lys

595 600 605

Asp Gly Asn Arg Tyr Arg Pro Ser Gly Asn Pro Asp Ile Glu Leu Lys

610 615 620

Gly Leu Glu Asp Gly Lys Leu Tyr Arg Ile Val Asp Tyr Val Asn Asn

625 630 635 640

Gln Val Val Ala Thr Asn Val Thr Ser Ser Asn Ala Val Phe Ser Tyr

645 650 655

Pro Phe Ser Asp Tyr Leu Leu Val Lys Ala Val Glu Ile Ser Glu Pro

660 665 670

Asp Thr Asp Gly Pro Gly Pro Val Pro Asp Pro Glu Gly Ala Val Thr

675 680 685

Val Glu Glu Asn Asp Pro Glu Leu Val Tyr Thr Gly Asp Trp Val Arg

690 695 700

Glu Glu Asn Asp Gly Tyr His Gly Gly Gly Ala Arg Tyr Thr Lys Glu

705 710 715 720

Ala Glu Ala Ser Val Glu Leu Ala Phe Tyr Gly Thr Gly Ala Ala Trp

725 730 735

Tyr Gly Gln His Asp Val Asn Phe Gly Ser Ala Arg Ile Tyr Ile Asp

740 745 750

Gly Thr Tyr Val Lys Thr Val Ser Cys Met Gly Glu Pro Gly Ile Asn

755 760 765

Ile Lys Leu Phe Glu Ile Ser Gly Leu Asp Leu Ala Ser His Arg Ile

770 775 780

Lys Ile Glu Cys Glu Thr Pro Val Ile Asp Ile Asp Arg Leu Thr Tyr

785 790 795 800

Ile Lys Gly Glu Glu Val Pro Ala Lys Val Met Thr Ala Asp Leu Arg

805 810 815

Ala Leu Thr Val Ile Ala Asn Gln Tyr Asp Met Asn Ser Phe Ala Asp

820 825 830

Gly Asn Tyr Lys Asp Gln Leu Gly Val Ser Leu Val Arg Ala Asn Gln

835 840 845

Leu Leu Ala Ala Asp Asp Val Thr Gln Gly Ala Val Asn Glu Glu Gln

850 855 860

Lys Tyr Leu Leu Asn Ala Met Leu Lys Ile Arg Lys Lys Val Asp Lys

865 870 875 880

Ser Trp Ile Gly Leu Pro Gly Pro Ile Pro Gln Asp Ile Gln Thr Glu

885 890 895

Asn Ile Ser Arg Asp Asn Leu Ala Lys Val Ile Ser Tyr Thr Gly Gln

900 905 910

Leu Asp Arg Asp Glu Ile Ile Pro Ala Ile Lys Glu Gln Leu Asn Asp

915 920 925

Ser Tyr Asp Lys Ala Val Ser Ile Ala Glu Arg Gln Asp Ala Ser Gln

930 935 940

Pro Glu Ile Asp Arg Ala Trp Ala Glu Leu Met Asn Ala Val Gln Tyr

945 950 955 960

Ser Ser Tyr Ile Arg Gly Ser Lys Glu Glu Leu Leu Ser Leu Leu Asp

965 970 975

Glu Tyr Gly Lys Val Asp Thr Thr Val Tyr Lys Asp Ala Ala Leu Phe

980 985 990

Ile Glu Ser Leu Glu Ala Ala Lys Lys Val Tyr Gln Asp Glu Asn Ala

995 1000 1005

Met Asp Gly Glu Ile Ser Asp Cys Ile Lys Gln Leu Arg Asp Ala

1010 1015 1020

Lys Asp Gln Leu Gln Leu Lys Asp Pro Val Asp Pro Pro Lys Pro

1025 1030 1035

Asp Pro Asp Pro Asp Pro Lys Pro Asp Pro Thr Pro Asp Pro Gly

1040 1045 1050

Pro Asp Pro Lys Pro Asp Pro Thr Pro Asp Pro Thr Pro Asp Pro

1055 1060 1065

Lys Pro Asn Pro Thr Pro Thr Pro Asp Pro Thr Pro Glu Pro Ala

1070 1075 1080

Leu Lys Lys Pro Glu Gln Val Ser Gly Leu Lys Ser Lys Ala Glu

1085 1090 1095

Thr Asp Tyr Leu Thr Val Ser Trp Lys Lys Leu Asn Asn Ala Glu

1100 1105 1110

Ser Tyr Lys Val Tyr Ile Tyr Lys Ser Gly Lys Trp Arg Leu Ala

1115 1120 1125

Gly Lys Thr Thr Lys Thr Ser Ile Lys Ile Lys Lys Leu Val Ser

1130 1135 1140

Gly Thr Lys Tyr Thr Val Lys Val Ala Ala Val Asn Lys Ala Gly

1145 1150 1155

Gln Gly Lys Tyr Ser Ser Gln Val Tyr Thr Ala Ala Lys Pro Lys

1160 1165 1170

Lys Val Lys Leu Lys Ser Val Ser Arg Tyr Arg Thr Ser Lys Val

1175 1180 1185

Lys Leu Asn Tyr Gly Lys Val Lys Ala Gly Gly Tyr Glu Ile Trp

1190 1195 1200

Met Lys Asn Gly Lys Gly Ser Tyr Lys Lys Ala Ala Thr Ser Thr

1205 1210 1215

Lys Thr Thr Ala Ile Lys Ser Gly Leu Lys Lys Gly Lys Thr Tyr

1220 1225 1230

Tyr Phe Lys Val Arg Ala Tyr Val Lys Asn Lys Asn Gln Val Ile

1235 1240 1245

Tyr Gly Ser Phe Ser Asn Ile Lys Lys Tyr Lys Met Val Leu

1250 1255 1260

<210> 38

<211> 32

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNAcDeAc_No-Signal P_fw

<400> 38

atggtctcgc catgcagact ccagcgagtc cg 32

<210> 39

<211> 34

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNAcDeAc_D1min_rv

<400> 39

atggtctcga ttcttacgtc gtgtagccgg ggtc 34

<210> 40

<211> 41

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNAcDeAc_D1ext_rv

<400> 40

atggtctcga ttcttaatca ctggaggtat atttcacgac c 41

<210> 41

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNAcDeAc_D1+2_rv

<400> 41

atggtctcga ttcttacgca ggctcgattg gaccatac 38

<210> 42

<211> 34

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNAcDeAc_D2ext_fw

<400> 42

atggtctcgc catgatgtgg cgacggtgga tgag 34

<210> 43

<211> 41

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNAcDeAc_rv

<400> 43

atggtctcga ttcttattct cccacatacg aaaaatagtc g 41

<210> 44

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNase_No Signal P_fw

<400> 44

atggtctcgc catcgtggta aaaagttcat atcactcac 39

<210> 45

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNase_truncated_rv

<400> 45

atggtctcga ttcttatgcg ttagtggtat aagtcaaata gtc 43

<210> 46

<211> 40

<212> DNA

<213> Artificial Sequence

<220>

<223> primer FpGalNase_rv

<400> 46

atggtctcga ttcttattcc gaaatttcca ccgctttaac 40

<210> 47

<211> 50

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Ct5757_fw

<400> 47

atggtctcgc cattataatt taattgataa tattagtgtt gaaaaattag 50

<210> 48

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Ct5757_rv

<400> 48

atggtctcga ttcttattgt gttaaaccct caataaac 38

<210> 49

<211> 45

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Ct5757_GalNase_rv

<400> 49

atggtctcga ttcttaatga gtactttgat ttaatccatc ataag 45

<210> 50

<211> 37

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Ct5757_DeAcase_fw

<400> 50

atggtctcgc cattcagggc aatattggtt agttttc 37

<210> 51

<211> 35

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rp1021_fw

<400> 51

atggtctcgc catgggaacg gattagaggt gaaag 35

<210> 52

<211> 45

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rp1021_rv

<400> 52

atggtctcga ttctcataat accattttgt atttctttat attgg 45

<210> 53

<211> 37

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rl8755_fw

<400> 53

atggtctcgc catgaagaaa ccgatttgct tgtaaac 37

<210> 54

<211> 42

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rl8755_rv

<400> 54

atggtctcga ttcttagcgt tccaatattt tcataaattc ag 42

<210> 55

<211> 32

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rp3671_fw

<400> 55

atggtctcgc cattcaccat tgagcgctgc gg 32

<210> 56

<211> 44

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rp3671_rv

<400> 56

atggtctcga ttcttatgac tttgttttaa catttacaga cttg 44

<210> 57

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rp3672_fw

<400> 57

atggtctcgc catgctgaga ctgcaacaga agaaaatg 38

<210> 58

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<223> primer Rp3672_rv

<400> 58

atggtctcga ttcttatttc tgaatttttg ccttgccag 39

<210> 59

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence AU1 epitope

<400> 59

Asp Thr Tyr Arg Tyr Ile

1 5

<210> 60

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence AU5 epitope

<400> 60

Thr Asp Phe Tyr Leu Lys

1 5

<210> 61

<211> 15

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Avi tag

<400> 61

Gly Leu Asn Asp Ile Phe Glu Ala Gln Lys Ile Glu Trp His Glu

1 5 10 15

<210> 62

<211> 11

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence T7 tag

<400> 62

Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly

1 5 10

<210> 63

<211> 14

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence V5 tag

<400> 63

Gly Lys Pro Ile Pro Asn Pro Leu Leu Gly Leu Asp Ser Thr

1 5 10

<210> 64

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence B tag

<400> 64

Gln Tyr Pro Ala Leu Thr

1 5

<210> 65

<211> 26

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence calmodulin tag

<400> 65

Lys Arg Arg Trp Lys Lys Asn Phe Ile Ala Val Ser Ala Ala Asn Arg

1 5 10 15

Phe Lys Lys Ile Ser Ser Ser Gly Ala Leu

20 25

<210> 66

<211> 4

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence C tag

<400> 66

Glu Pro Glu Ala

1

<210> 67

<211> 23

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence DogTag

<400> 67

Asp Ile Pro Ala Thr Tyr Glu Phe Thr Asp Gly Lys His Tyr Ile Thr

1 5 10 15

Asn Glu Pro Ile Pro Pro Lys

20

<210> 68

<211> 10

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence E2 epitope

<400> 68

Ser Ser Thr Ser Ser Asp Phe Arg Asp Arg

1 5 10

<210> 69

<211> 13

<212> PRT

<213> Artificial Sequence

<220>

<223> affinity tag sequence E tag

<400> 69

Gly Ala Pro Val Pro Tyr Pro Asp Pro Leu Glu Pro Arg

1 5 10

<210> 70

<211> 7

<212> PRT

<213> Artificial Sequence

<220>

<223> affinity tag sequence FLAG tag

<400> 70

Asp Tyr Lys Asp Asp Asp Lys

1 5

<210> 71

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence EE tag (1)

<400> 71

Glu Tyr Met Pro Met Glu

1 5

<210> 72

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence EE tag (2)

<400> 72

Glu Phe Met Pro Met Glu

1 5

<210> 73

<211> 9

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence HA tag

<400> 73

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala

1 5

<210> 74

<211> 19

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence HAT

<400> 74

Lys Asp His Leu Ile His Asn Val His Lys Glu Phe His Ala His Ala

1 5 10 15

His Asn Lys

<210> 75

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence HQ tag

<400> 75

His Gln His Gln His Gln

1 5

<210> 76

<211> 12

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence HN tag

<400> 76

His Asn His Asn His Asn His Asn His Asn His Asn

1 5 10

<210> 77

<211> 8

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence HSV epitope

<400> 77

Gln Pro Glu Leu Ala Pro Glu Asp

1 5

<210> 78

<211> 16

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Isopep tag

<400> 78

Thr Asp Lys Asp Met Thr Ile Thr Phe Thr Asn Lys Lys Asp Ala Glu

1 5 10 15

<210> 79

<211> 11

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence KT3 epitope

<400> 79

Lys Pro Pro Thr Pro Pro Pro Glu Pro Glu Thr

1 5 10

<210> 80

<211> 11

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Myc epitope

<400> 80

Cys Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu

1 5 10

<210> 81

<211> 10

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Myc tag

<400> 81

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu

1 5 10

<210> 82

<211> 18

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence NE tag

<400> 82

Thr Lys Glu Asn Pro Arg Ser Asn Gln Glu Glu Ser Tyr Asp Asp Asn

1 5 10 15

Glu Ser

<210> 83

<211> 5

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Arg tag

<400> 83

Arg Arg Arg Arg Arg

1 5

<210> 84

<211> 5

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Asp tag

<400> 84

Asp Asp Asp Asp Asp

1 5

<210> 85

<211> 4

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Cys tag

<400> 85

Cys Cys Cys Cys

1

<210> 86

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Glu tag

<400> 86

Glu Glu Glu Glu Glu Glu

1 5

<210> 87

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence His tag

<400> 87

His His His His His His

1 5

<210> 88

<211> 11

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Phe tag

<400> 88

Phe Phe Phe Phe Phe Phe Phe Phe Phe Phe Phe

1 5 10

<210> 89

<211> 9

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Rho1D4 tag

<400> 89

Thr Glu Thr Ser Gln Val Ala Pro Ala

1 5

<210> 90

<211> 9

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence S1 tag

<400> 90

Asn Ala Asn Asn Pro Asp Trp Asp Phe

1 5

<210> 91

<211> 15

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence S tag

<400> 91

Lys Glu Thr Ala Ala Ala Lys Phe Glu Arg Gln His Met Asp Ser

1 5 10 15

<210> 92

<211> 13

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Softtag 1

<400> 92

Ser Leu Ala Glu Leu Leu Asn Ala Gly Leu Gly Gly Ser

1 5 10

<210> 93

<211> 8

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Softtag 3

<400> 93

Thr Gln Asp Pro Ser Arg Val Gly

1 5

<210> 94

<211> 13

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Spy tag

<400> 94

Ala His Ile Val Met Val Asp Ala Tyr Lys Pro Thr Lys

1 5 10

<210> 95

<211> 38

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence SBP tag

<400> 95

Met Asp Glu Lys Thr Thr Gly Trp Arg Gly Gly His Val Val Glu Gly

1 5 10 15

Leu Ala Gly Glu Leu Glu Gln Leu Arg Ala Arg Leu Glu His His Pro

20 25 30

Gln Gly Gln Arg Glu Pro

35

<210> 96

<211> 8

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Strep tag (1)

<400> 96

Trp Ser His Pro Gln Phe Glu Lys

1 5

<210> 97

<211> 9

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Strep tag (2)

<400> 97

Ala Trp Ala His Pro Gln Pro Gly Gly

1 5

<210> 98

<211> 8

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Strep tag II

<400> 98

Trp Ser His Pro Gln Phe Glu Lys

1 5

<210> 99

<211> 13

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Sdy tag

<400> 99

Asp Pro Ile Val Met Ile Asp Asn Asp Lys Pro Ile Thr

1 5 10

<210> 100

<211> 12

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Snooptag

<400> 100

Lys Leu Gly Asp Ile Glu Phe Ile Lys Val Asn Lys

1 5 10

<210> 101

<211> 12

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Snooptag Jr

<400> 101

Lys Leu Gly Ser Ile Glu Phe Ile Lys Val Asn Lys

1 5 10

<210> 102

<211> 12

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Spot tag

<400> 102

Pro Asp Arg Val Arg Ala Val Ser His Trp Ser Ser

1 5 10

<210> 103

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence TC tag

<400> 103

Cys Cys Pro Gly Cys Cys

1 5

<210> 104

<211> 10

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Ty tag

<400> 104

Glu Val His Thr Asn Gln Asp Pro Leu Asp

1 5 10

<210> 105

<211> 6

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Universal tag

<400> 105

His Thr Thr Pro His His

1 5

<210> 106

<211> 11

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence VSV tag

<400> 106

Tyr Thr Asp Ile Glu Met Asn Arg Leu Gly Lys

1 5 10

<210> 107

<211> 14

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence V5 tag

<400> 107

Gly Lys Pro Ile Pro Asn Pro Leu Leu Gly Leu Asp Ser Thr

1 5 10

<210> 108

<211> 8

<212> PRT

<213> Artificial Sequence

<220>

<223> protein tag sequence Xpress tag

<400> 108

Asp Leu Tyr Asp Asp Asp Asp Lys

1 5
