Genome engineering of NADPH-increasing biosynthetic pathways

Document No. 1580711. Published: 2020-01-31.

Summary: Genome engineering of NADPH-increasing biosynthetic pathways, by S. Manchester, B. Mason, and A. Goranov, dated 2018-05-18. The present disclosure relates to host cells having altered NADPH availability, which increases the production of compounds produced using NADPH, and to methods of using such host cells. NADPH availability is altered by one or more of: expressing an altered GAPDH in the host cell; expressing variant glutamate dehydrogenase (gdh), aspartate semialdehyde dehydrogenase (asd), dihydropicolinate reductase (dapB), and meso-diaminopimelate dehydrogenase (ddh); expressing a novel nicotinamide nucleotide transhydrogenase; expressing a novel threonine aldolase; and expressing pyruvate carboxylase or modulating the expression of pyruvate carboxylase.


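Cross-reference to related applications

This application claims the benefit of priority to U.S. Provisional Application No. 62/508,589, filed May 19, 2017, which is incorporated herein by reference in its entirety.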

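Statement regarding the sequence listing

The sequence listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the text file containing the sequence listing is ZYMR_011_01WO_SeqList_ST25.txt. The text file is 950 KB, was created on May 18, 2018, and is being submitted electronically via EFS-Web.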

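Technical field

The present disclosure is generally directed to microbial engineering methods for increasing NADPH availability in microbial cells.

Specifically, the present disclosure relates to engineering a host cell to increase NADPH availability by expressing in the host cell one or more of the following: an altered GAPDH, a variant glutamate dehydrogenase (gdh), aspartate semialdehyde dehydrogenase (asd), dihydropicolinate reductase (dapB), meso-diaminopimelate dehydrogenase (ddh), threonine aldolase (ltaE), pyruvate carboxylase (pyc), and a novel nicotinamide nucleotide transhydrogenase.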

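Background

NADPH, as a reducing equivalent, is involved in many important biological processes used to synthesize industrially important compounds such as sugars and amino acids, for example L-lysine and L-threonine. However, it is well known that the normal cellular supply of NADPH can be a limiting factor in the production of compounds made using NADPH. For example, NADPH can be a limiting factor when L-lysine is produced on an industrial scale in Corynebacterium glutamicum (C. glutamicum) (Becker et al. (2005), Appl. Environ. Microbiol., 71(12):8587-8596).

Accordingly, there is a great need in the art for new methods of engineering industrial microorganisms to overcome limitations on NADPH availability in cells used to produce compounds made using NADPH, for example cells used to produce L-lysine or L-threonine.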

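Summary

The present disclosure relates to at least six strategies for overcoming limitations on NADPH availability in a host cell and increasing the production of L-lysine, L-threonine, L-isoleucine, L-methionine, or L-glycine: (1) engineering the glycolytic pathway to produce NADPH by broadening the coenzyme specificity of the endogenous glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (gapA) so that the enzyme has dual specificity for NADP and NAD; (2) expressing in the host cell a transhydrogenase that generates NADPH from NADH; (3) reprogramming the DAP pathway for lysine synthesis by expressing homologues of the endogenous gdh, asd, dapB, and ddh enzymes that use NADH more effectively than NADPH as a cofactor; (4) reprogramming the thrABC pathway for threonine synthesis by expressing homologues of the endogenous gdh and asd enzymes that use NADH more effectively than NADPH as a cofactor; (5) reprogramming threonine synthesis by expressing homologues of the endogenous L-threonine aldolase (ltaE) that reduce or reverse the degradation of threonine to glycine; and (6) expressing a heterologous pyruvate carboxylase (pyc) or a homologue thereof to increase the synthesis of oxaloacetate, or increasing the expression of the endogenous pyc.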

在某些实施例中,提供了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,所述方法包含改变细胞的可利用的NADPH。

在某些实施例中,提供了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中相对于缺乏经修饰的GAPDH的对应宿主细胞,所述宿主细胞提高使用NADPH产生的化合物的产生。

在某些实施例中,提供了一种产生L-赖氨酸的方法,其包含培养棒状杆菌属菌株并从所培养的棒状杆菌属菌株或培养液回收L-赖氨酸,其中所述棒状杆菌属菌株表达使用NADP作为辅酶的经修饰的GAPDH,并且其中所述棒状杆菌属菌株的L-赖氨酸生产率得到提高。

在某些实施例中,提供了一种加宽GAPDH的辅酶特异性的方法,其包含:对所述GAPDH进行修饰,使得经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性。

在某些实施例中,提供了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性。

在某些实施例中,提供了一种宿主细胞,其包含:一或多种酶gdh、asd、dapB和ddh的变体,其中所述变体展现针对辅酶NADH和NADPH的双特异性。

在某些实施例中,提供了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含在所述宿主细胞中表达新颖烟酰胺核苷酸转氢酶。

在某些实施例中,提供了一种提高宿主细胞产生L-赖氨酸的效率的方法,其包含以下中的两个或更多个:

对内源性GAPDH进行修饰,使得经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性;

在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性;以及

在所述宿主细胞中表达新颖烟酰胺核苷酸转氢酶。

在某些实施例中,提供了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)和天冬氨酸半醛脱氢酶(asd)中的一或两种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性。

在某些实施例中,提供了一种提高宿主细胞产生L-苏氨酸的效率的方法,其包含:在所述宿主细胞中表达苏氨酸醛缩酶的变异酶,其中所述变异酶展现与大肠杆菌苏氨酸醛缩酶(ltaE)不同的底物偏好或酶动力学。

在某些实施例中,提供了一种增加宿主细胞的L-苏氨酸产生的方法,其包含:在所述宿主细胞中表达酶甘油醛3-磷酸脱氢酶(gapA)、谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、苏氨酸醛缩酶(ltaE)和丙酮酸羧化酶(pyc)中的一或多种酶的变异酶。

在某些实施例中,提供了一种宿主细胞,其包含多拷贝复制质粒,所述多拷贝复制质粒包含各自可操作地连接到一或多个合成启动子的thrA基因、thrB基因和thrC基因。

在某些实施例中,提供了一种提高宿主细胞产生化合物的效率的方法,其包含以下中的两个或更多个:(1)通过加宽内源性糖酵解酶甘油醛-3-磷酸脱氢酶(gapA)的辅酶特异性,使所述酶具有针对NADP和NAD的双特异性,将产生NADPH的糖酵解途径工程化;(2)在宿主细胞中表达由NADH产生NADPH的转氢酶;(3)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh、asd、dapB和ddh酶的同源物,将用于赖氨酸合成的DAP-途径重编程;(4)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh和asd酶的同源物,将用于苏氨酸合成的thrABC-途径重编程;(5)通过表达减少或逆转苏氨酸降解成甘氨酸的内源性L-苏氨酸醛缩酶(ltA)的同源物,将苏氨酸合成重编程;以及(6)表达异源丙酮酸羧化酶(pyc)或其同源物以增加草酰乙酸的合成,或增加内源性pyc的表达。

在某些实施例中,提供了一种人工多核苷酸,其编码截短的甘油醛-3-磷酸脱氢酶(gapA)基因,其中所述多核苷酸包含与选自由SEQ ID NO:290、291、292和293组成的群组的多核苷酸序列至少85%、90%、95%或99%相同的序列。

在某些实施例中,提供了一种甘油醛-3-磷酸脱氢酶(gapA)的重组蛋白片段,其中所述重组蛋白片段包含与选自由SEQ ID NO:233、234、235、236和298组成的群组的氨基酸序列至少70%、80%、90%或95%相同的序列。

在某些实施例中,提供了一种提高宿主细胞的L-赖氨酸或L-苏氨酸产生效率的方法,其包含增加宿主细胞产生NADPH的能力。在一些方面,所述方法包含对甘油醛-3-磷酸脱氢酶(GAPDH)进行修饰,使得其辅酶特异性加宽。在某些情况下,经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性。在某些方面,宿主细胞是原核细胞。在某些方面,宿主细胞是棒状杆菌属。在一些方面,宿主细胞是谷氨酸棒状杆菌。在一些实施例中,宿主细胞是大肠杆菌。在一些实施例中,天然存在的GAPDH具有SEQ ID NO:58的氨基酸序列。在一些方面,经修饰的GAPDH包含与SEQ ID NO:58的氨基酸序列至少95%相同的氨基酸序列。在某些实施例中,经修饰的GAPDH在与SEQ ID NO:58的氨基酸37相对应的位置处包含氨基酸置换。在其它实施例中,经修饰的GAPDH在与SEQ ID NO:58的氨基酸36和37相对应的位置处包含氨基酸置换。在某些方面,在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸已经被赖氨酸置换。在其它方面,在与SEQ ID NO:58的氨基酸36相对应的位置处的亮氨酸已经被苏氨酸置换,并且在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸已经被赖氨酸置换。

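In certain embodiments, a method of improving the efficiency of L-lysine production by a host cell is provided, comprising decreasing the ability of the host cell to utilize NADPH, the method comprising expressing in the host cell a variant enzyme of one or more of the enzymes glutamate dehydrogenase (gdh), aspartate semialdehyde dehydrogenase (asd), dihydropicolinate reductase (dapB), and meso-diaminopimelate dehydrogenase (ddh), wherein the variant enzyme exhibits dual specificity for the coenzymes NADH and NADPH. In certain aspects, all four enzymes are expressed simultaneously in the host cell. In certain embodiments, a method of improving the efficiency of L-threonine production by a host cell is provided, comprising decreasing the ability of the host cell to utilize NADPH, the method comprising expressing in the host cell a variant enzyme of one or both of the enzymes glutamate dehydrogenase (gdh) and aspartate semialdehyde dehydrogenase (asd), wherein the variant enzyme exhibits dual specificity for the coenzymes NADH and NADPH. In some embodiments, the variant enzyme uses NADH more effectively than NADPH. In certain embodiments, the method comprises expressing a variant enzyme of gdh, wherein the variant enzyme comprises an amino acid sequence at least about 70%, 75%, 80%, 85%, 90%, 95%, or 100% identical to the amino acid sequence of SEQ ID NO:42. In certain aspects, the method comprises expressing a variant enzyme of gdh, wherein the variant enzyme comprises an amino acid sequence at least about 95% identical to the amino acid sequence of SEQ ID NO:42. In other embodiments, the method comprises expressing a variant enzyme of asd, wherein the variant enzyme comprises an amino acid sequence at least about 70%, 75%, 80%, 85%, 90%, 95%, or 100% identical to the amino acid sequence of SEQ ID NO:40. In other aspects, the method comprises expressing a variant enzyme of asd, wherein the variant enzyme comprises an amino acid sequence at least 95% identical to the amino acid sequence of SEQ ID NO:40. In other aspects, the method comprises expressing a variant enzyme of dapB, wherein the variant enzyme comprises an amino acid sequence at least 95% identical to the amino acid sequence of SEQ ID NO:46. In other aspects, the method comprises expressing a variant enzyme of ddh, wherein the ddh enzyme comprises the amino acid sequence of SEQ ID NO:4. In certain embodiments, the variant enzyme of gdh comprises the amino acid sequence of SEQ ID NO:44. In other embodiments, the variant enzyme of asd comprises the amino acid sequence of SEQ ID NO:30. In other embodiments, the variant enzyme of dapB comprises the amino acid sequence of SEQ ID NO:48.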

在其它实施例中,提供了一种产生L-赖氨酸或L-苏氨酸的方法,其包含培养棒状杆菌属或大肠杆菌菌株并从所培养的棒状杆菌属或大肠杆菌菌株或培养液回收L-赖氨酸或L-苏氨酸,其中所述棒状杆菌属或大肠杆菌菌株表达使用NADP作为辅酶的经修饰的GAPDH,并且其中所述棒状杆菌属菌株的L-赖氨酸或L-苏氨酸生产率得到提高。

在其它实施例中,提供了一种通过对GAPDH进行修饰来加宽GAPDH的辅酶特异性的方法,其中经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性。在某些方面,相对于NAD,经修饰的GAPDH具有增加的针对辅酶NADP的特异性。在其它方面,相比于NAD,经修饰的GAPDH更有效地使用NADP。

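In some embodiments, a host cell comprising a modified GAPDH is provided, wherein the modified GAPDH comprises an amino acid sequence at least 95% identical to the amino acid sequence of SEQ ID NO:58, and wherein the threonine at the position corresponding to amino acid 37 of SEQ ID NO:58 has been replaced by lysine. In certain aspects, the host cell is C. glutamicum.

In other embodiments, a host cell comprising a modified GAPDH is provided, wherein the modified GAPDH comprises an amino acid sequence at least 95% identical to the amino acid sequence of SEQ ID NO:58, wherein the leucine at the position corresponding to amino acid 36 of SEQ ID NO:58 has been replaced by threonine, and wherein the threonine at the position corresponding to amino acid 37 of SEQ ID NO:58 has been replaced by lysine. In certain aspects, the host cell is C. glutamicum.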

在进一步实施例中,提供了一种宿主细胞,其包含一或多种酶gdh、asd、dapB和ddh的变体,其中所述变体展现针对辅酶NADH和NADPH的双特异性。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,所述方法包含改变细胞的可利用的NADPH。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中通过在所述细胞中表达经修饰的甘油醛-3-磷酸脱氢酶(GAPDH)来改变可利用的NADPH,其中经修饰的GAPDH经过修饰,使得其辅酶特异性加宽。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述宿主细胞是棒状杆菌属。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述宿主细胞是谷氨酸棒状杆菌。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述天然存在的GAPDH是gapA。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述gapA具有SEQ ID NO:58的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中经修饰的GAPDH包含与SEQ ID NO:58的氨基酸序列至少95%相同的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸37相对应的位置处包含氨基酸置换。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸36和37相对应的位置处包含氨基酸置换。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸已经被赖氨酸置换。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中在与SEQ ID NO:58的氨基酸36相对应的位置处的亮氨酸已经被苏氨酸置换,并且在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸已经被赖氨酸置换。
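The substitution notation used in these embodiments (for example L36T and T37K: wild-type residue, 1-based position, replacement residue) can be illustrated with a minimal Python sketch. The function name `apply_substitutions` and the toy sequence are assumptions made for illustration only; the toy sequence is not SEQ ID NO:58, and the sketch says nothing about enzyme activity or coenzyme specificity.

```python
import re
from typing import Iterable

# Pattern for labels such as "T37K": wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Return a copy of `sequence` with the listed point substitutions applied."""
    residues = list(sequence)
    for label in substitutions:
        match = _SUBSTITUTION.fullmatch(label)
        if match is None:
            raise ValueError(f"cannot parse substitution {label!r}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the sequence")
        # Check against the unmodified input so the order of labels does not matter.
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical 40-residue placeholder with D35, L36, and T37 (NOT SEQ ID NO:58).
toy_sequence = "A" * 34 + "DLT" + "AAA"
single_mutant = apply_substitutions(toy_sequence, ["T37K"])
double_mutant = apply_substitutions(toy_sequence, ["L36T", "T37K"])
print(single_mutant[34:37])  # DLK
print(double_mutant[34:37])  # DTK
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when positions are defined relative to a reference sequence.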

In some embodiments, the present disclosure teaches a method of improving the ability of a host cell to produce a compound produced using NADPH, wherein the compound is selected from the group consisting of: polyketides (e.g., pikromycin, erythromycin A, clarithromycin, azithromycin, avermectin, ivermectin, spinosad, geldanamycin, macbecin, rifamycin, amphotericin, nystatin, pimaricin, monensin, doxycycline, bullatacin, squamocin, molvizarin, uvaricin, annonacin, tacrolimus, sirolimus, radicicol, lovastatin, discodermolide, aflatoxin, usnic acid, and anthramycin); catechins (e.g., epicatechin, epigallocatechin, epicatechin gallate, epigallocatechin gallate, epiafzelechin, fisetinidol, guibourtinidol, mesquitol, and robinetinidol); terpenes (e.g., prenol, isovaleric acid, geraniol, terpineol, limonene, myrcene, linalool, pinene, humulene, farnesenes, farnesol, cafestol, kahweol, cembrene, taxadiene, retinol, retinal, phytol, geranylfarnesol, squalene, lanosterol, cycloartenol, cholesterol, ferrugicadiol, tetraprenylcurcumene, lycopene, gamma-carotene, alpha- and beta-carotenes, 3-oxo-α-ionol, 7,8-dihydroionone, megastigmane-3,9-diol, and 3-oxo-7,8-dihydro-α-ionol); fatty acids (e.g., myristoleic acid, palmitoleic acid, sapienic acid, oleic acid, elaidic acid, vaccenic acid, linoleic acid, linoelaidic acid, α-linolenic acid, arachidonic acid, eicosapentaenoic acid, erucic acid, docosahexaenoic acid, caprylic acid, capric acid, lauric acid, myristic acid, palmitic acid, stearic acid, arachidic acid, behenic acid, lignoceric acid, and cerotic acid); amino acids or derivatives thereof (e.g., S-adenosyl methionine, isoleucine, leucine, valine, methionine, threonine, lysine, glutamate, tryptophan, tyrosine, L-lysine, and phenylalanine); compounds from the chorismate pathway (e.g., indole, chorismate, shikimate, salicylic acid, 2,3-dihydroxybenzoic acid, para-aminobenzoate, vitamin K, and folate); and alkaloids (e.g., ephedrine, homoharringtonine, galantamine, vincamine, quinidine, morphine, chelerythrine, piperine, caffeine, nicotine, theobromine, and quinine).

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,其中所述化合物选自表2。

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中相对于缺乏经修饰的GAPDH的对应宿主细胞,所述宿主细胞提高使用NADPH产生的化合物的产生。

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中经修饰的GAPDH相对于天然存在的GAPDH具有增加的针对NADP的特异性。

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中经修饰的GAPDH包含与SEQ ID NO:58的氨基酸序列至少95%相同的氨基酸序列。

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中所述经修饰的GAPDH包含与SEQ ID NO:58至少70%相同的氨基酸序列并且其中所述经修饰的GAPDH包含在SEQ ID NO:58的位置36、37或两个位置处的氨基酸的取代。

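In some embodiments, the present disclosure teaches a host cell comprising a modified GAPDH having broadened coenzyme specificity relative to a naturally occurring GAPDH, wherein the compound is selected from the group consisting of: polyketides (e.g., pikromycin, erythromycin A, clarithromycin, azithromycin, avermectin, ivermectin, spinosad, geldanamycin, macbecin, rifamycin, amphotericin, nystatin, pimaricin, monensin, doxycycline, bullatacin, squamocin, molvizarin, uvaricin, annonacin, tacrolimus, sirolimus, radicicol, lovastatin, discodermolide, aflatoxin, usnic acid, and anthramycin); catechins (e.g., epicatechin, epigallocatechin, epicatechin gallate, epigallocatechin gallate, epiafzelechin, fisetinidol, guibourtinidol, mesquitol, and robinetinidol); terpenes (e.g., prenol, isovaleric acid, geraniol, terpineol, limonene, myrcene, linalool, pinene, humulene, farnesenes, farnesol, cafestol, kahweol, cembrene, taxadiene, retinol, retinal, phytol, geranylfarnesol, squalene, lanosterol, cycloartenol, cholesterol, ferrugicadiol, tetraprenylcurcumene, lycopene, gamma-carotene, alpha- and beta-carotenes, 3-oxo-α-ionol, 7,8-dihydroionone, megastigmane-3,9-diol, and 3-oxo-7,8-dihydro-α-ionol); fatty acids (e.g., myristoleic acid, palmitoleic acid, sapienic acid, oleic acid, elaidic acid, vaccenic acid, linoleic acid, linoelaidic acid, α-linolenic acid, arachidonic acid, eicosapentaenoic acid, erucic acid, docosahexaenoic acid, caprylic acid, capric acid, lauric acid, myristic acid, palmitic acid, stearic acid, arachidic acid, behenic acid, lignoceric acid, and cerotic acid); amino acids or derivatives thereof (e.g., S-adenosyl methionine, isoleucine, leucine, valine, methionine, threonine, lysine, glutamate, tryptophan, tyrosine, L-lysine, and phenylalanine); compounds from the chorismate pathway (e.g., indole, chorismate, shikimate, salicylic acid, 2,3-dihydroxybenzoic acid, para-aminobenzoate, vitamin K, and folate); and alkaloids (e.g., ephedrine, homoharringtonine, galantamine, vincamine, quinidine, morphine, chelerythrine, piperine, caffeine, nicotine, theobromine, and quinine).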

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中所述化合物选自表2。

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中所述修饰包含在与SEQ ID NO:58的氨基酸36相对应的位置处的亮氨酸被苏氨酸置换,以及在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸被赖氨酸置换。

在一些实施例中,本公开教示了一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中宿主细胞是谷氨酸棒状杆菌。

在一些实施例中,本公开教示了一种产生L-赖氨酸的方法,其包含培养棒状杆菌属菌株并从所培养的棒状杆菌属菌株或培养液回收L-赖氨酸,其中所述棒状杆菌属菌株表达使用NADP作为辅酶的经修饰的GAPDH,并且其中所述棒状杆菌属菌株的L-赖氨酸生产率得到提高。

在一些实施例中,本公开教示了一种加宽GAPDH的辅酶特异性的方法,其包含:对所述GAPDH进行修饰,使得经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性。

在一些实施例中,本公开教示了一种加宽GAPDH的辅酶特异性的方法,其包含:对所述GAPDH进行修饰,使得经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性,其中经修饰的GAPDH对辅酶NADP的特异性相对于NAD增加。

在一些实施例中,本公开教示了一种加宽GAPDH的辅酶特异性的方法,其包含:对所述GAPDH进行修饰,使得经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性,其中相比于NAD,所述经修饰的GAPDH更有效地使用NADP。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性。

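In some embodiments, the present disclosure teaches a method of improving the efficiency with which a host cell produces a compound produced using NADPH, comprising: expressing in the host cell a variant enzyme of one or more of the enzymes glutamate dehydrogenase (gdh), aspartate semialdehyde dehydrogenase (asd), dihydropicolinate reductase (dapB), and meso-diaminopimelate dehydrogenase (ddh), wherein the variant enzyme exhibits dual specificity for the coenzymes NADH and NADPH, and wherein the compound is selected from the following: polyketides (e.g., pikromycin, erythromycin A, clarithromycin, azithromycin, avermectin, ivermectin, spinosad, geldanamycin, macbecin, rifamycin, amphotericin, nystatin, pimaricin, monensin, doxycycline, bullatacin, squamocin, molvizarin, uvaricin, annonacin, tacrolimus, sirolimus, radicicol, lovastatin, discodermolide, aflatoxin, usnic acid, and anthramycin); catechins (e.g., epicatechin, epigallocatechin, epicatechin gallate, epigallocatechin gallate, epiafzelechin, fisetinidol, guibourtinidol, mesquitol, and robinetinidol); terpenes (e.g., prenol, isovaleric acid, geraniol, terpineol, limonene, myrcene, linalool, pinene, humulene, farnesenes, farnesol, cafestol, kahweol, cembrene, taxadiene, retinol, retinal, phytol, geranylfarnesol, squalene, lanosterol, cycloartenol, cholesterol, ferrugicadiol, tetraprenylcurcumene, lycopene, gamma-carotene, alpha- and beta-carotenes, 3-oxo-α-ionol, 7,8-dihydroionone, megastigmane-3,9-diol, and 3-oxo-7,8-dihydro-α-ionol); fatty acids (e.g., myristoleic acid, palmitoleic acid, sapienic acid, oleic acid, elaidic acid, vaccenic acid, linoleic acid, linoelaidic acid, α-linolenic acid, arachidonic acid, eicosapentaenoic acid, erucic acid, docosahexaenoic acid, caprylic acid, capric acid, lauric acid, myristic acid, palmitic acid, stearic acid, arachidic acid, behenic acid, lignoceric acid, and cerotic acid); amino acids or derivatives thereof (e.g., S-adenosyl methionine, isoleucine, leucine, valine, methionine, threonine, lysine, glutamate, tryptophan, tyrosine, L-lysine, and phenylalanine); compounds from the chorismate pathway (e.g., indole, chorismate, shikimate, salicylic acid, 2,3-dihydroxybenzoic acid, para-aminobenzoate, vitamin K, and folate); and alkaloids (e.g., ephedrine, homoharringtonine, galantamine, vincamine, quinidine, morphine, chelerythrine, piperine, caffeine, nicotine, theobromine, and quinine).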

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中所述化合物选自表2。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中相比于NADPH,所述变异酶更有效地使用NADH。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与SEQ ID NO:42的氨基酸序列至少95%相同的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与SEQ ID NO:40的氨基酸序列至少95%相同的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中所述方法包含表达dapB的变异酶,其中所述变异酶包含与SEQ ID NO:46的氨基酸序列至少95%相同的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中所述方法包含表达ddh的变异酶,其中所述变异酶包含SEQ ID NO:4的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中gdh的变异酶包含SEQ ID NO:44的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中asd的变异酶包含SEQ ID NO:30的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中dapB的变异酶包含SEQ ID NO:48的氨基酸序列。

在一些实施例中,本公开教示了一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性,其中所有四种酶的变体在宿主细胞中同时表达。

在一些实施例中,本公开教示了一种宿主细胞,其包含:一或多种酶gdh、asd、dapB和ddh的变体,其中所述变体展现针对辅酶NADH和NADPH的双特异性。

在一些实施例中,本公开教示了一种提高宿主细胞的L-赖氨酸产生效率的方法,其包含在所述宿主细胞中表达新颖烟酰胺核苷酸转氢酶。

在一些实施例中,本公开教示了一种提高宿主细胞的L-赖氨酸产生效率的方法,其包含以下中的两个或更多个:(1)对内源性GAPDH进行修饰,使得经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性;(2)在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性;以及(3)在所述宿主细胞中表达新颖烟酰胺核苷酸转氢酶。

在一些实施例中,本公开教示了一种通过以下来增加L-赖氨酸、L-苏氨酸、L-异亮氨酸、L-甲硫氨酸或L-甘氨酸产生的方法:(1)通过加宽内源性糖酵解酶甘油醛-3-磷酸脱氢酶(gapA)的辅酶特异性,使所述酶具有针对NADP和NAD的双特异性,将产生NADPH的糖酵解途径工程化;(2)在宿主细胞中表达由NADH产生NADPH的转氢酶;(3)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh、asd、dapB和ddh酶的同源物,将用于赖氨酸合成的DAP-途径重编程;(4)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh和asd酶的同源物,将用于苏氨酸合成的thrABC-途径重编程;(5)通过表达减少或逆转苏氨酸降解成甘氨酸的内源性L-苏氨酸醛缩酶(lTA)的同源物,将苏氨酸合成重编程;以及(6)表达异源丙酮酸羧化酶(pyc)或其同源物以增加草酰乙酸的合成,或增加内源性pyc的表达。

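Brief description of the drawings

Figure 1 illustrates the bacterial lysine biosynthetic pathway and outlines the strategies used in this application to improve the yield and productivity of L-lysine in bacteria. The efficiency of L-lysine production by a host cell can be improved by one or more of the following: (1) modifying the endogenous GAPDH so that the modified GAPDH has increased specificity for the coenzyme NADP relative to the corresponding naturally occurring GAPDH, thereby generating NADPH; (2) expressing in the host cell a variant enzyme of one or more of the enzymes glutamate dehydrogenase (gdh), aspartate semialdehyde dehydrogenase (asd), dihydropicolinate reductase (dapB), and meso-diaminopimelate dehydrogenase (ddh), wherein the variant enzyme exhibits dual specificity for the coenzymes NADH and NADPH, thereby reducing NADPH utilization; and (3) expressing a novel nicotinamide nucleotide transhydrogenase in the host cell, thereby generating NADPH from NADH.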

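Figure 2 shows L-lysine productivity in C. glutamicum strains expressing modified glyceraldehyde-3-phosphate dehydrogenase (GAPDH), as described in Example 1. Several C. glutamicum strains were generated, each expressing a gapA enzyme with one or more of the following mutations: D35G, L36T, T37K, and P192S. The strains were then tested for their ability to produce L-lysine in comparison with parent strains carrying native gapA. Introducing GAPDH with certain mutations that confer altered coenzyme specificity toward NADP significantly improved L-lysine productivity. T37K alone, and T37K together with L36T, significantly increased productivity in two backgrounds. Strains 7000182994 and 7000184348 each contain T37K and outperform their respective parents, Parent_1 and Parent_2. Strains 7000182999 and 7000184352 each contain T37K and L36T and outperform their respective parents, Parent_1 and Parent_2.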

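Figure 3 illustrates the strategy for reprogramming the DAP pathway for lysine synthesis in C. glutamicum by expressing variant gdh, asd, dapB, and ddh enzymes that use NADH more effectively than NADPH. The C. glutamicum enzymes gdh and dapB have known homologues in Clostridium symbiosum and Escherichia coli, respectively, that use NADH more effectively than NADPH. A genome-wide homology search in host cells was performed to identify variants of C. glutamicum asd and ddh. For each enzyme, the homology search yielded 9 variants. The known homologues of C. glutamicum gdh and dapB and the 9 variants each of C. glutamicum asd and ddh were codon-optimized and cloned into plasmids for expression in C. glutamicum.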

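Figure 4 illustrates the strategy used to express multiple combinations of variant gdh, asd, dapB, and ddh enzymes in C. glutamicum. One copy of each different version of the gdh, asd, dapB, and ddh enzymes was cloned, in multiple combinations, into plasmids containing a kanamycin resistance marker gene. Each plasmid was then introduced into C. glutamicum, and the enzyme genes were integrated into the C. glutamicum chromosome by standard homologous recombination techniques. Clones that had successfully integrated the enzyme genes into the genome were selected by culturing on medium containing kanamycin. All four enzymes were expressed simultaneously in C. glutamicum.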
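The combinatorial design described for Figures 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variant labels (`Csy_gdh`, `Eco_dapB`, `asd_v1` and so on) are hypothetical placeholders, and treating the native enzyme as one option per gene is an assumption, not something the text specifies. The counts (one known homologue each for gdh and dapB, 9 variants each for asd and ddh) follow the Figure 3 description.

```python
from itertools import product

# Candidate versions per gene. Labels are hypothetical placeholders.
variants = {
    "gdh": ["native", "Csy_gdh"],
    "dapB": ["native", "Eco_dapB"],
    "asd": ["native"] + [f"asd_v{i}" for i in range(1, 10)],
    "ddh": ["native"] + [f"ddh_v{i}" for i in range(1, 10)],
}


def enumerate_designs(options):
    """Return every combination that picks exactly one version of each gene."""
    genes = list(options)
    return [
        dict(zip(genes, combination))
        for combination in product(*(options[gene] for gene in genes))
    ]


designs = enumerate_designs(variants)
print(len(designs))  # 400 (2 * 2 * 10 * 10)
print(designs[1])    # second design: only the ddh version differs from the first
```

Enumerating the full design space up front makes it easy to see how quickly the number of strains grows, which is one reason a subset of combinations is usually built and screened.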

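Figures 5A-B show the effect of expressing multiple combinations of different versions of the gdh, asd, dapB, and ddh enzymes in C. glutamicum. Figure 5A shows data for two recombinant C. glutamicum strains, 7000186960 and 7000186992, each of which contains the native ddh enzyme and the same three heterologous gdh, asd, and dapB enzymes (the known NADH-using versions of gdh and dapB, and a variant of asd from Lactobacillus agilis). Both strains show significantly improved L-lysine productivity compared with their respective parents, Parent_3 and Parent_4. Figure 5B shows that the heterologous gdh and dapB enzymes also slightly increased productivity in 2 of the 3 backgrounds tested.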

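Figure 6 depicts the assembly of the transformation plasmids of the present disclosure and their integration into host organisms. The insert DNA is generated by combining one or more synthetic oligonucleotides in an assembly reaction. The DNA insert containing the desired sequence is flanked by regions of DNA homologous to the targeted region of the genome. These homologous regions facilitate genomic integration and, once integrated, form direct repeat regions designed for looping out the vector backbone DNA in subsequent steps. The assembled plasmids contain the insert DNA and, optionally, one or more selection markers.

Figure 7 depicts the procedure for looping out selected regions of DNA from host strains. The direct repeat regions of the inserted DNA and the host genome can "loop out" in a recombination event. Cells counter-selected for the selection marker contain deletions of the loop DNA flanked by the direct repeat regions.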

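Figures 8A-B show the plasmid designs for step one of constructing an E. coli W3110 threonine base strain in E. coli K-12 W3110 using the thrLABC regulon (Figure 8A) or the thrABC operon (Figure 8B).

Figure 9 illustrates the bacterial biosynthetic pathways for lysine and threonine and outlines the strategies used in this application to improve the yield and productivity of L-lysine or L-threonine in bacteria. (1) Engineering the glycolytic pathway to produce NADPH by broadening the coenzyme specificity of the endogenous glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (gapA) so that the enzyme has dual specificity for NADP and NAD, thereby generating NADPH. (2) Expressing in the host cell a transhydrogenase that generates NADPH from NADH, thereby generating NADPH. (3) Reprogramming the DAP pathway for lysine synthesis by expressing homologues of the endogenous gdh, asd, dapB, and ddh enzymes that use NADH more effectively than NADPH as a cofactor, thereby reducing NADPH utilization. (4) Reprogramming the thrABC pathway for threonine synthesis by expressing homologues of the endogenous gdh and asd enzymes that use NADH more effectively than NADPH as a cofactor, thereby reducing NADPH utilization. (5) Reprogramming threonine synthesis by expressing homologues of the endogenous L-threonine aldolase (ltaE) that reduce or reverse the degradation of threonine to glycine, thereby increasing threonine production per unit of NADPH consumed. (6) Expressing a heterologous pyruvate carboxylase (pyc) or a homologue thereof to increase the synthesis of oxaloacetate, or increasing the expression of the endogenous pyc.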

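Figures 10A-C depict metabolic pathway maps of threonine biosynthesis showing the possible scenarios achieved by expressing a heterologous threonine aldolase library (TAlib). Figure 10A depicts a metabolic pathway map of threonine biosynthesis showing the reaction favored by native E. coli ltaE (conversion of threonine to acetaldehyde and glycine). Figure 10B depicts a partial pathway showing an improved scenario in which, with a heterologous TA enzyme expressed, the interconversion between threonine and acetaldehyde plus glycine is more balanced. Figure 10C depicts a partial pathway showing the preferred scenario in which, with a heterologous TA enzyme expressed, conversion of acetaldehyde and glycine to threonine is the favored direction.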

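Figures 11A-C show the titers (mg/L) of L-threonine produced by the E. coli thrABC background strain (W3110 pMB085thrABCΔtdh; thrABC) when expressing individual genes or combinations of native gapA, gdh, asd, ltaE, or variants that were also tested in Corynebacterium. Titers for the wild-type E. coli K12 W3110, W3110 with tdh deleted (tdh_del), and W3110 pMB085thrLABCΔtdh (thrLABC) strains are also shown for comparison. Figure 11A shows the results for gapA. All three gapA variants tested (gapAv5, gapAv7, and gapAv8) produced significantly higher L-threonine titers than the controls, which included a strain expressing an additional copy of E. coli gapA (Ec_gapA). Figure 11B shows the results for asd. Lactobacillus agilis asd produced a significantly higher titer than the same base strain expressing a second copy of E. coli asd. Figure 11C shows the results for gdh. In this case, the Clostridium gdh (Csy_gdh) was not significantly different from the same base strain expressing a second copy of E. coli gdh, but both strains performed better than the parent strain (thrABC).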

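Figure 12 shows the design of the regulatory elements (pMB038 promoter (SEQ ID NO:237) and thrL terminator (SEQ ID NO:238)) and backbone (p15A) (SEQ ID NO:239) used to construct the plasmids for expressing the asd, gdh, and ltaE library variants.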

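Figure 13 shows improved L-threonine titers (mg/L) compared with wild-type E. coli K12 W3110 and with the parent control strain, the threonine base strain THR02 (W3110 pMB085thrABCΔtdh) transformed with the p15A empty vector (a circularized p15A plasmid (SEQ ID NO:239) with no library variant cloned between the pMB038 promoter (SEQ ID NO:237) and the terminator (SEQ ID NO:238); control plasmid). Strains expressing asd_13 (SEQ ID NO:108) and asd_18 (SEQ ID NO:118) had improved titers but were not significantly different from the control strain. Seven gdh variants, gdh_1 (SEQ ID NO:136), gdh_8 (SEQ ID NO:150), gdh_14 (SEQ ID NO:162), gdh_16 (SEQ ID NO:166), gdh_18 (SEQ ID NO:170), gdh_20 (SEQ ID NO:174), and gdh_22 (SEQ ID NO:178), all produced significantly higher L-threonine titers, as determined by comparison of means using Student's t-test. Gray circles and labels indicate samples that performed significantly better than the control strain.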

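Figure 14 shows the improved L-threonine titers (mg/L) produced by strains expressing threonine aldolase (ltaE) library variants, compared with wild-type E. coli K12 W3110 and with the parent control strain, the threonine base strain THR02 (W3110 pMB085thrABCΔtdh) transformed with the p15A empty vector (a circularized p15A plasmid (SEQ ID NO:239) with no library variant cloned between the pMB038 promoter (SEQ ID NO:237) and the terminator (SEQ ID NO:238); control plasmid). ltaE_6 (SEQ ID NO:196), ltaE_11 (SEQ ID NO:206), ltaE_18 (SEQ ID NO:220), ltaE_20 (SEQ ID NO:224), and ltaE_24 (SEQ ID NO:232) all produced significantly higher L-threonine titers, as determined by comparison of means using Student's t-test. Gray circles and labels indicate samples that performed significantly better than the control strain.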
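As a rough illustration of the kind of comparison named in the descriptions of Figures 13 and 14, the following Python sketch computes a pooled-variance (Student's) two-sample t statistic using only the standard library. The titer values are invented placeholders, not data from this disclosure, and the function name `pooled_t_statistic` is an assumption; the actual statistical software and settings used are not stated in the text.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(control, variant):
    """Return (t, degrees_of_freedom) for a two-sample Student's t-test.

    Assumes equal variances in both groups (pooled variance).
    """
    n_control, n_variant = len(control), len(variant)
    if n_control < 2 or n_variant < 2:
        raise ValueError("each group needs at least two replicates")
    degrees_of_freedom = n_control + n_variant - 2
    pooled_variance = (
        (n_control - 1) * variance(control) + (n_variant - 1) * variance(variant)
    ) / degrees_of_freedom
    standard_error = sqrt(pooled_variance * (1 / n_control + 1 / n_variant))
    t = (mean(variant) - mean(control)) / standard_error
    return t, degrees_of_freedom


# Hypothetical replicate titers in mg/L (placeholders only).
control_titers = [10.0, 12.0, 11.0, 13.0]
variant_titers = [15.0, 17.0, 16.0, 18.0]
t_value, dof = pooled_t_statistic(control_titers, variant_titers)
print(round(t_value, 3), dof)  # 5.477 6
```

The resulting statistic would then be compared against a t distribution with the reported degrees of freedom to decide significance at a chosen level.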

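Figure 15 shows the improved threonine titers produced by strains expressing Csy_gdh, gapAv5, or gapAv7, each of which improved titer when expressed individually, in combination with a single asd, gdh, or ltaE library variant. All strains shown, except W3110, are in the pMB085-thrABC tdh deletion background. For these experiments, the most relevant controls are the parent strains (Csy_gdh, gapAv5, and gapAv7) transformed with the empty p15A control plasmid (7000349886, 7000349887, and 7000349885; Csy_gdh+p15A(-), gapAv5+p15A(-), and gapAv7+p15A(-), respectively).

Figure 16 depicts library performance in two plate models for lysine production by C. glutamicum expressing exogenous gapA alleles from an NNK library. The average performance in each model is plotted. Most integrants (gray circles) performed the same as or worse than the parent (black diamond). Certain gapA alleles produced high titers of lysine in both plate models (black circles).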
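For readers unfamiliar with NNK libraries: NNK is a standard degenerate-codon scheme in which N is any of A, C, G, or T and K is G or T. The short Python sketch below enumerates the NNK codons against the standard genetic code. It is a general illustration of the scheme, not a description of how the gapA library in Figure 16 was actually built.

```python
# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

print(len(nnk_codons))        # 32
print(len(encoded - {"*"}))   # 20
print(stop_codons)            # ['TAG']
```

Restricting the third position to G or T keeps all 20 amino acids accessible while halving the number of codons and leaving a single stop codon, which is why the scheme is common for saturation mutagenesis.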

具体实施方式

定义

尽管相信所属领域的技术人员非常了解以下术语,但仍阐述以下定义以促进对所公开的主题的解释。

术语“一(a/an)”是指所述实体中的一个或多个,即可指多个指示物。因而,术语“一”、“一或多种”和“至少一种”在本文中可互换地使用。此外,通过不定冠词“一”提及“一个元件”并不排除存在超过一个元件的可能性,除非上下文明确要求存在一个并且仅存在一个元件。

除非上下文另有要求,否则本说明书和权利要求书通篇中,词语“包含(comprise)”和其变体,如“包含(comprises)”和“包含(comprising)”,应解释为开放性、包涵性含义,即“包括(但不限于)”。

本说明书通篇对“一个实施例”或“一实施例”的提及意味着结合所述实施例描述的具体特征、结构或特性可以包括在本公开的至少一个实施例中。因此,本说明书通篇中多个位置处出现短语“在一个实施例中”或“在一实施例中”未必都是指同一个实施例。应了解,出于清楚的目的在分开的实施例的上下文中描述的本公开的某些特征还可以按组合形式提供于单个实施例中。相反,为简洁起见而在单个实施例的上下文中描述的本公开的各种特征也可以分开或以任何合适的子组合形式提供。

如本文所用,术语“细胞生物体”、“微生物体”或“微生物”应该在宽泛的意义上理解。这些术语可互换地使用并且包括(但可以不限于)两种原核生物域:细菌和古细菌,以及某些真核生物真菌和原生生物。在一些实施例中,本公开提及本公开中所存在的清单/表格和图式的“微生物”或“细胞生物体”或“微生物”。这种表征不仅可以指所述表格和图式的已鉴别类属,而且指已鉴别的分类种,以及所述表格或图式中的各种新颖和最新鉴别或设计的任何生物体株系。对于这些术语在本说明书的其它部分(如实例)中的叙述来说,相同表征同样适用。

术语“原核生物”是所属领域内所公认的并且是指不含细胞核或其它细胞器的细胞。原核生物通常分类至两个域之一:细菌和古细菌。古细菌和细菌域生物体之间的决定性差异是基于16S核糖体RNA中的核苷酸碱基序列的基本差异。

术语“古细菌”是指疵壁菌门(Mendosicutes)的生物体类别,典型地在异常环境中发现其并且根据若干个准则而与原核生物的其余部分区分开来,所述若干个准则包括核糖体蛋白的数目和细胞壁中的胞壁酸的缺乏。基于ssrRNA分析,古细菌由系统发生学截然不同的两个群组组成:嗜泉古菌界(Crenarchaeota)和广古生菌界(Euryarchaeota)。古细菌基于其生理学可以组织成三种类型:产甲烷菌(产生甲烷的原核生物);极端嗜盐菌(extreme halophiles)(在极高浓度的盐(NaCl)存在下活着的原核生物);和极端(超)嗜热菌(extreme(hyper)thermophilus)(在极高温度下活着的原核生物)。除有别于细菌的统一古细菌特点(即,细胞壁中没有胞壁质、酯连型膜脂等)之外,这些原核生物还展现了使其适应其具体栖息地的独特结构或生物化学属性。嗜泉古菌界主要由极端嗜热性硫依赖性原核生物组成并且广古生菌界含有产甲烷菌和极端嗜盐菌。

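"Bacteria" or "eubacteria" refers to a domain of prokaryotic organisms. Bacteria include at least 11 distinct groups as follows: (1) Gram-positive (gram+) bacteria, of which there are two major subdivisions: (a) the high G+C group (Actinomycetes, Mycobacteria, Micrococcus, and others) and (b) the low G+C group (Bacillus, Clostridia, Lactobacillus, Staphylococci, Streptococci, Mycoplasmas); (2) Proteobacteria, e.g., purple photosynthetic and non-photosynthetic Gram-negative bacteria (including most "common" Gram-negative bacteria); (3) Cyanobacteria, e.g., oxygenic phototrophs; (4) Spirochetes and related species; (5) Planctomyces; (6) Bacteroides, Flavobacteria; (7) Chlamydia; (8) green sulfur bacteria; (9) green non-sulfur bacteria (also anaerobic phototrophs); (10) radioresistant micrococci and relatives; (11) Thermotoga and Thermosipho thermophiles.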

“真核生物”是细胞含有细胞核和封闭于膜内的其它细胞器的任何生物体。真核生物属于真核或真核生物分类群。将真核细胞与原核细胞(前述细菌和古细菌)区分开来的决定性特征是其具有膜结合的细胞器,尤其是含有遗传物质且被核被膜封闭的细胞核。

术语“经过基因修饰的宿主细胞”、“经过基因修饰的微生物”、“重组微生物”、“重组宿主细胞”和“重组菌株”在本文中可互换使用,并且可以指已经进行基因修饰的微生物。因此,所述术语包括如下微生物(例如细菌、酵母细胞、真菌细胞等),其与所源自的天然存在的微生物相比,已经进行基因改变、修饰或工程化,以便其展现改变、修饰或不同的基因型和/或表型(例如当基因修饰影响微生物的编码核酸序列时)。应了解,所述术语不仅指所讨论的具体重组微生物,而且还指此类微生物的后代或可能的后代。

术语“野生型微生物”可以描述天然存在的细胞,即尚未经过基因修饰的细胞。

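The term "genetically engineered" may refer to any manipulation of a microorganism's genome (e.g., by insertion or deletion of nucleic acids).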

术语“对照”或“对照宿主细胞”是指用于测定基因修饰或实验处理的影响的适当的比较宿主细胞。在一些实施例中,对照宿主细胞是野生型细胞。在其它实施例中,对照宿主细胞在基因上除了基因修饰之外与进行基因修饰的宿主细胞相同,从而有别于处理宿主细胞。在一些实施例中,本公开教示了亲代菌株作为对照宿主细胞(例如用作菌株改良程序基础的S1菌株)的用途。

如本文所用,术语“等位基因”意指基因的一或多种替代形式中的任一种,所有等位基因都涉及至少一种性状或特征。在二倍体细胞中,既定基因的两个等位基因占据一对同源染色体上的相应基因座。在实施例中,因为本公开涉及QTL,即可以包含一或多个基因或调控序列的基因组区域,所以在一些情况下,称为“单倍型”(即染色体区段的等位基因)比“等位基因”更准确,然而,在那些情况下,术语“等位基因”应理解为包含术语“单倍型”。

As used herein, the term "locus" (plural "loci") means a specific place or site on a chromosome where, for example, a gene or genetic marker is found.

如本文所用,术语“基因连锁”是指在育种期间,两种或更多种特性以高比率共同遗传,使得其难以通过杂交来分离。

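As used herein, "recombination" or "recombination event" refers to chromosomal crossing over or independent assortment. The term "recombinant" refers to an organism having a new genetic makeup arising as a result of a recombination event.

As used herein, the term "phenotype" refers to the observable characteristics of an individual cell, cell culture, organism, or group of organisms that result from the interaction between that individual's genetic makeup (i.e., genotype) and the environment.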

如本文所用,术语“嵌合”或“重组”当描述核酸序列或蛋白质序列时,是指使至少两个异源多核苷酸或两个异源多肽连接成单一大分子或使至少一种天然核酸或蛋白质序列的一或多个元件重排的核酸或蛋白质序列。举例来说,术语“重组”可以指序列中两个以其它方式分离的区段例如通过化学合成或通过基因工程技术操控核酸中的分离区段进行的人造组合。

如本文所用,“合成核苷酸序列”或“合成多核苷酸序列”可以是不知在自然界中存在或天然不存在的核苷酸序列。通常,当与任何其它天然存在的核苷酸序列相比时,这类合成核苷酸序列将包含至少一种核苷酸差异。

如本文所用,术语“核酸”是指具有任何长度的核苷酸(核糖核苷酸或脱氧核糖核苷酸)的聚合形式,或其类似物。这一术语是指分子的主要结构,并且因此包括双链和单链DNA,以及双链和单链RNA。其还包括经修饰的核酸,如甲基化和/或封端核酸、含有经修饰碱基、骨架修饰等的核酸。术语“核酸”与“核苷酸序列”可互换使用。

如本文所用,术语“基因”是指与生物功能相关的任何DNA区段。因此,基因包括(但不限于)编码序列和/或其表达所需的调控序列。基因还可以包括例如形成其它蛋白质的识别序列的未表达的DNA区段。基因可以从多种来源获得,包括从所关注的来源克隆或利用已知或预测的序列信息合成,并且可以包括被设计成具有所期望参数的序列。

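As used herein, the terms "homologous," "homologue," and "ortholog" are known in the art and refer to related sequences that share a common ancestor or family member and are determined based on the degree of sequence identity. The terms "homology," "homologous," "substantially similar," and "corresponding substantially" are used interchangeably herein. They can refer to nucleic acid fragments in which changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms can also refer to modifications of the nucleic acid fragments of the present disclosure, such as the deletion or insertion of one or more nucleotides, that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. It is therefore understood, as those skilled in the art will appreciate, that the present disclosure can encompass more than the specific exemplary sequences. These terms can describe the relationship between a gene found in one species, subspecies, variety, cultivar, or strain and the corresponding or equivalent gene in another species, subspecies, variety, cultivar, or strain. For purposes of this disclosure, homologous sequences can be compared. "Homologous sequences," "homologues," or "orthologs" are thought, believed, or known to be functionally related. A functional relationship may be indicated in any one of a number of ways, including, but not limited to: (a) degree of sequence identity and/or (b) the same or similar biological function. Preferably, both (a) and (b) are indicated. Homology can be determined using software programs readily available in the art, such as those discussed in Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987), Supplement 30, section 7.718, Table 6.71. Some alignment programs are MacVector (Oxford Molecular Ltd, Oxford, U.K.), ALIGN Plus (Scientific and Educational Software, Pennsylvania), and AlignX (Vector NTI, Invitrogen, Carlsbad, CA). Another alignment program is Sequencher (Gene Codes, Ann Arbor, Michigan), using default parameters.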
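Because several embodiments are stated in terms of sequences being "at least 85%, 90%, 95%, or 99% identical," a small sketch of how percent identity can be computed from an existing pairwise alignment may help. This is a minimal illustration under stated assumptions: the inputs are already aligned, `-` marks a gap, and identity is taken as matching columns divided by all aligned columns that are not gaps in both sequences. The alignment programs named above may use different conventions, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" and residue_b == "-":
            continue  # a column of two gaps carries no information
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy alignment (placeholder residues): one mismatch and one gap in 8 columns.
identity = percent_identity("MKVLA-GT", "MKILAAGT")
print(identity)  # 75.0
print(identity >= 70.0)  # True
```

Note that the choice of denominator (alignment length, shorter sequence length, or columns without gaps) changes the result, so a stated threshold is only meaningful together with the convention used.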

如本文所用,术语“变异酶”或“变体”是指一种酶,其与表达变体的生物体中的原生酶相比,具有不同的氨基酸序列,但催化反应的能力与原生酶的催化能力相同或相似。

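As used herein, the term "endogenous" or "endogenous gene" refers to a naturally occurring gene in the location in which it is naturally found within the host cell genome. In the context of the present disclosure, operably linking a heterologous promoter to an endogenous gene means genetically inserting a heterologous promoter sequence in front of an existing gene, in the location where that gene is naturally present. An endogenous gene as described herein can include alleles of naturally occurring genes that have been mutated according to any of the methods of the present disclosure.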

如本文所用,术语“外源”与术语“异源”可互换地使用并且是指来自不同于天然来源的一些来源的物质。举例来说,术语“外源蛋白质”或“外源基因”是指来自非天然来源或位置并且已经通过人工方式提供至生物系统中的蛋白质或基因。

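As used herein, the term "nucleotide change" can refer to, e.g., nucleotide substitution, deletion, and/or insertion, as is well understood in the art. For example, mutations contain alterations that produce silent substitutions, additions, or deletions but do not alter the properties or activities of the encoded protein or how the protein is made.

As used herein, the term "protein modification" can refer to, e.g., amino acid substitution, amino acid modification, deletion, and/or insertion, as is well understood in the art.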

如本文所用,术语核酸或多肽的“至少一部分”或“片段”可以意指具有这类序列的最小尺寸特征的部分,或全长分子的任何更大的片段,最多是并且包括全长分子。本公开的多核苷酸片段可以编码基因调控元件的生物活性部分。基因调控元件的生物活性部分可以通过分离本公开的一种多核苷酸中包含基因调控元件的部分并且如本文中所描述评估活性来制备。类似地,多肽的一部分可以是4个氨基酸、5个氨基酸、6个氨基酸、7个氨基酸等,最多是全长多肽。待使用的所述部分的长度将取决于具体应用。适用作杂交探针的核酸部分可短至12个核苷酸;在一些实施例中,其是20个核苷酸。适用作表位的多肽部分可以短至4个氨基酸。发挥全长多肽功能的多肽部分通常将长于4个氨基酸。

变异多核苷酸还涵盖可以来源于诱变和诱重组程序(如DNA改组)的序列。这类DNA改组的策略在所属领域中已知。参见例如施特默尔(Stemmer)(1994)《美国国家科学院院刊(PNAS)》91:10747-10751;施特默尔(1994)《自然(Nature)》370:389-391;凯默瑞(Crameri)等人(1997)《自然生物技术(Nature Biotech.)》15:436-438;穆尔(Moore)等人(1997)《分子生物学杂志(J.Mol.Biol.)》272:336-347;张(Zhang)等人(1997)《美国国家科学院院刊》94:4504-4509;凯默瑞等人(1998)《自然》391:288-291;以及美国专利第5,605,793号和第5,837,458号。

对于本文所公开的多核苷酸的PCR扩增,可以设计用于PCR反应中的寡核苷酸引物以由从所关注的任何生物体提取的cDNA或基因组DNA扩增相应的DNA序列。用于设计PCR引物和PCR克隆的方法在所属领域中通常已知并且公开于萨布鲁克(Sambrook)等人(2001),《分子克隆:实验指南(Molecular Cloning:A Laboratory Manual)》(第3版,冷泉港实验室出版社(Cold Spring Harbor Laboratory Press),纽约普莱恩维尤(Plainview,NewYork))。还参见英尼斯(Innis)等人编(1990)《PCR方案:方法和应用指导(PCR Protocols:AGuide to Methods and Applications)》(学术出版社(Academic Press),纽约(NewYork));英尼斯和吉尔凡(Gelfand)编(1995)《PCR策略(PCR Strategies)》(学术出版社,纽约);以及英尼斯和吉尔凡编(1999)《PCR方法手册(PCR Methods Manual)》(学术出版社,纽约)。已知的PCR方法可以包括(但不限于)使用成对引物、巢式引物、单特异性引物、简并引物、基因特异性引物、载体特异性引物、部分不匹配引物等的方法。

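As used herein, the term "primer" can refer to an oligonucleotide that is capable of annealing to the amplification target, allowing a DNA polymerase to attach, and thereby serving as a point of initiation of DNA synthesis when placed under conditions in which synthesis of a primer extension product is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase, and at a suitable temperature and pH. The (amplification) primer is preferably single-stranded for maximum efficiency in amplification. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact length of the primer will depend on many factors, including temperature and the composition (A/T versus G/C content) of the primer. A pair of bi-directional primers consists of one forward and one reverse primer, as commonly used in the art of DNA amplification, such as in PCR amplification.

As used herein, "promoter" or "promoter polynucleotide" can refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often being referred to as enhancers. Accordingly, an "enhancer" can be a DNA sequence that stimulates promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter. Promoters may be derived in their entirety from a native gene, be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. Those skilled in the art understand that different promoters may direct the expression of a gene in different tissues or cell types, at different stages of development, or in response to different environmental conditions. It is further recognized that, because the exact boundaries of regulatory sequences have in most cases not been completely defined, DNA fragments of some variation may have identical promoter activity.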

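As used herein, the phrases "recombinant construct," "expression construct," "chimeric construct," "construct," and "recombinant DNA construct" are used interchangeably. A recombinant construct comprises an artificial combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not found together in nature. For example, a chimeric construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source but arranged in a manner different from that found in nature. In some cases, a chimeric construct can be a recombinant construct comprising multiple regulatory (e.g., promoter) and coding sequences (e.g., gapA/transhydrogenase/gdh, asd, dapB, and/or ddh genes). Each coding sequence in a chimeric construct comprising multiple coding sequences can be controlled by, or functionally linked to, a separate regulatory sequence. Such constructs described herein may be used by themselves or in conjunction with a vector. If a vector is used, the choice of vector may depend on the method that will be used to transform host cells, as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select, and propagate host cells comprising any of the isolated nucleic acid fragments of the disclosure. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al., (1985), EMBO J. 4:2411-2418; De Almeida et al., (1989), Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by, among other methods, Southern analysis of DNA, Northern analysis of mRNA expression, immunoblotting analysis of protein expression, or phenotypic analysis. Vectors can be plasmids, viruses, bacteriophages, pro-viruses, phagemids, transposons, artificial chromosomes, and the like, that replicate autonomously or can integrate into a chromosome of a host cell. A vector can also be a naked RNA polynucleotide, a naked DNA polynucleotide, a polynucleotide composed of both DNA and RNA within the same strand, a poly-lysine-conjugated DNA or RNA, a peptide-conjugated DNA or RNA, a liposome-conjugated DNA, or the like, that is not autonomously replicating. As used herein, the term "expression" refers to the production of a functional end product, e.g., an mRNA or a protein (precursor or mature).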

“可操作地连接”或“功能性地连接”在此背景下可以意指根据本公开的启动子多核苷酸与另一寡核苷酸或多核苷酸(例如gapA/转氢酶/gdh、asd、dapB和/或ddh基因)的依序排列,引起所述另一多核苷酸(例如gapA/转氢酶/gdh、asd、dapB和/或ddh基因)的转录。换句话说,“可操作地连接”或“功能性地连接”可以意指启动子控制与所述启动子相邻或处于其下游或3'的基因(例如gapA/转氢酶/gdh、asd、dapB和/或ddh基因)的转录。

如本文所用,术语“所关注产物”或“生物分子”是指由原料中的微生物产生的任何产物。在一些情况下,所关注的产物可以是小分子、酶、肽、氨基酸、有机酸、合成化合物、燃料、乙醇等。举例来说,所关注的产物或生物分子可以是任何初级或次级细胞外代谢物。初级代谢物尤其可以是乙醇、柠檬酸、乳酸、谷氨酸、谷氨酸酯、赖氨酸、苏氨酸、色氨酸和其它氨基酸、维生素、多糖等。次级代谢物尤其可以是抗生素化合物,如青霉素,或免疫抑制剂,如环孢菌素A(cyclosporin A);植物激素,如赤霉素;他汀类药物,如洛伐他汀(lovastatin);杀真菌剂,如灰黄霉素(griseofulvin)等。所关注的产物或生物分子也可以是微生物产生的任何细胞内组分,如:微生物酶,包括:催化酶、淀粉酶、蛋白酶、果胶酶、葡萄糖异构酶、纤维素酶、半纤维素酶、脂肪酶、乳糖酶、链激酶和其它多种。细胞内组分还可以包括重组蛋白,如:胰岛素、B型肝炎疫苗、干扰素、粒细胞群落刺激因子、链激酶等等。

术语“碳源”通常可以指适用作供细胞生长用的碳源的物质。碳源包括(但不限于)生物质水解产物、淀粉、蔗糖、纤维素、半纤维素、木糖和木质素,以及这些底物的单体组分。碳源可以包含各种形式的各种有机化合物,包括(但不限于)聚合物、碳水化合物、酸、醇、醛、酮、氨基酸、肽等。这些包括例如各种单糖,如葡萄糖、右旋糖(D-葡萄糖)、麦芽糖、寡糖、多糖、饱和或不饱和脂肪酸、丁二酸酯、乳酸酯、乙酸酯、乙醇等,或其混合物。光合成生物体可以另外产生光合成产物形式的碳源。在一些实施例中,碳源可以选自生物质水解产物和葡萄糖。

术语“原料”可以定义为供应给微生物或发酵工艺,由此能够制备其它产物的原材料或原材料混合物。举例来说,碳源,如生物质或来源于生物质的碳化合物,可以是供微生物在发酵工艺中产生所关注产物(例如小分子、肽、合成化合物、燃料、乙醇等)的原料。然而,原料可以含有除碳源外的营养物。

术语“体积生产率”或“生产速率”可以定义为每单位时间每体积培养基形成的产物的量。体积生产率可以用克/升/小时(g/L/h)报告。

术语“比生产率”定义为产物的形成速率。为了描述生产率作为微生物的固有参数而非发酵工艺的固有参数,可以在本文中将生产率进一步定义为比生产率,单位为每小时每克细胞干重(CDW)的产物克数(g/g CDW/h)。对既定微生物使用CDW与OD600的关系,比生产率还可以用每小时每升培养基每600nm培养液光学密度(OD)的产物克数(g/L/h/OD)表示。

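The term "yield" can be defined as the amount of product obtained per unit weight of raw material and may be expressed as grams of product per gram of substrate (g/g). Yield may also be expressed as a percentage of the theoretical yield. "Theoretical yield" is defined as the maximum amount of product that can be generated from a given amount of substrate, as dictated by the stoichiometry of the metabolic pathway used to make the product.

The term "titer" can be defined as the strength of a solution or the concentration of a substance in solution. For example, the titer of a product of interest (e.g., a small molecule, peptide, synthetic compound, fuel, or ethanol) in a fermentation broth can be described as grams of product of interest in solution per liter of fermentation broth (g/L).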
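The performance metrics defined above (volumetric productivity, specific productivity, yield, percentage of theoretical yield, and titer) are simple ratios, and the following Python sketch writes them out explicitly. All function names and input values are illustrative assumptions; in particular, the theoretical yield of 0.75 g/g used in the example is an arbitrary placeholder, not a value from this disclosure.

```python
def volumetric_productivity(product_g, volume_l, hours):
    """Product formed per volume of medium per unit time (g/L/h)."""
    return product_g / volume_l / hours


def specific_productivity(product_g, cell_dry_weight_g, hours):
    """Product formed per gram of cell dry weight per hour (g/g CDW/h)."""
    return product_g / cell_dry_weight_g / hours


def product_yield(product_g, substrate_g):
    """Product obtained per gram of substrate consumed (g/g)."""
    return product_g / substrate_g


def percent_of_theoretical(actual_yield, theoretical_yield):
    """Actual yield expressed as a percentage of the theoretical yield."""
    return 100.0 * actual_yield / theoretical_yield


def titer(product_g, volume_l):
    """Concentration of product in the broth (g/L)."""
    return product_g / volume_l


# Hypothetical run: 12 g product, 2 L broth, 24 h, 8 g CDW, 40 g substrate.
print(volumetric_productivity(12.0, 2.0, 24.0))  # 0.25
print(specific_productivity(12.0, 8.0, 24.0))    # 0.0625
print(titer(12.0, 2.0))                          # 6.0
run_yield = product_yield(12.0, 40.0)
print(round(percent_of_theoretical(run_yield, 0.75), 1))  # 40.0
```

Keeping the units in the function docstrings helps avoid the common mistake of comparing a volumetric productivity (g/L/h) with a specific productivity (g/g CDW/h).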

术语“总效价”可以定义为工艺中所产生的全部所关注产物的总和,所述所关注产物包括(但不限于)溶液中的所关注产物、气相(如果适用)中的所关注产物以及从工艺中去除并且相对于工艺中的初始体积或工艺中的操作体积所回收的任何所关注产物。

如本文所用,术语“HTP基因设计文库”或“文库”是指根据本公开的基因扰动的集合。在一些实施例中,本公开的文库可以表现为i)数据库或其它计算机文件中的序列信息的集合;ii)编码前述系列的基因元件的基因构建体的集合;或iii)包含所述基因元件的宿主细胞菌株。

产生基因多样性池供增加NADPH的基因设计和HTP微生物工程平台使用

在一些实施例中,本公开的方法的特征为基因设计。如本文所用,术语基因设计是指通过鉴别和选择具体基因的最佳变体、基因的一部分、启动子、终止密码子、5'UTR、3'UTR或其它DNA序列来重建或改变宿主生物体的基因组,以设计和产生新的优良宿主细胞。

在一些实施例中,本公开的基因设计方法中的第一步骤是获得具有多种序列变异的初始基因多样性池群体,由此群体可以重建新的宿主基因组。

利用来自现有野生型菌株的多样性池

在一些实施例中,本公开教示了用于鉴别既定野生型群体的微生物间所存在的序列多样性的方法。因此,多样性池可以是分析所用的既定数目n种野生型微生物,其中所述微生物基因组代表“多样性池”。

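In some embodiments, the diversity pool can be the result of existing diversity present in the natural genetic variation among the wild-type microorganisms. This variation may result from strain variants of a given host cell or from microorganisms that are different species entirely. Genetic variation can include any difference in the genetic sequences of the strains, whether naturally occurring or not. In aspects, the present disclosure makes use of a proprietary library of microorganisms to obtain novel threonine aldolases. As will be seen, the present application teaches how to use this library of threonine aldolases to optimize strain production of this useful amino acid.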

利用来自现有工业菌株变体的多样性池

在本公开的其它实施例中,多样性池是在传统菌株改良过程中所产生的菌株变体(例如经由随机突变而产生并且选用于多年来提高产量的一或多种宿主生物体菌株)。因此,在一些实施例中,多样性池或宿主生物体可以包含历史性生产菌株的集合。

在具体方面,多样性池可以是原始亲代微生物菌株(S1),其在具体时间点具有“基线”基因序列(S1Gen1);然后是任何数目个后续子代菌株(S2、S3、S4、S5等,可归纳为S2-n),其衍生/发展自所述S1菌株,并且相对于S1的基线基因组,具有不同基因组(S2-nGen2-n)。

通过诱变来产生多样性池

在一些实施例中,既定多样性池细胞群体中的所关注突变能够利用使菌株发生突变的任何方式(包括诱变化学品或辐射)人工产生。术语“诱变”在本文中用于指一种诱导细胞核酸材料发生一或多种基因修饰的方法。

术语“基因修饰”是指DNA的任何改变。代表性基因修饰包括核苷酸***、缺失、取代以及其组合,并且可以小至单个碱基或大至数万个碱基。因此,术语“基因修饰”涵盖核苷酸序列的倒位和其它染色体重排,借此改变构成染色体区域的DNA的位置或取向。染色体重排可以包含染色体内重排或染色体间重排。

在一个实施例中,本发明所要求的主题中所用的诱变方法基本上是随机的,以使得基因修饰能够在待诱变的核酸材料内的任何可利用核苷酸位置处发生。换句话说,在一个实施例中,诱变未显示在具体核苷酸序列处发生的偏好或频率增加。

本公开的方法可以使用任何诱变剂,包括(但不限于):紫外光、X射线辐射、γ辐射、N-乙基-N-亚硝基脲(ENU)、甲基亚硝基脲(MNU)、丙卡巴肼(procarbazine,PRC)、三亚乙基三聚氰胺(TEM)、丙烯酰胺单体(AA)、苯丁酸氮芥(chlorambucil,CHL)、美法仑(melphalan,MLP)、环磷酰胺(cyclophosphamide,CPP)、硫酸二乙酯(DES)、甲烷磺酸乙酯(EMS)、甲烷磺酸甲酯(MMS)、6-巯基嘌呤(6-mercaptopurine,6-MP)、丝裂霉素-C(mitomycin-C,MMC)、N-甲基-N'-硝基-N-亚硝基胍(MNNG)、3H2O和氨基甲酸酯(UR)(参见例如林奇克(Rinchik),1991;马克(Marker)等人,1997;和拉塞尔(Russell),1990)。其它诱变剂已为所属领域中的技术人员所熟知,包括http://www.iephb.nw.ru/~spirov/hazard/mutagen_lst.html中所述的那些。

术语“诱变”还涵盖了用于改变(例如通过靶向突变)或调节细胞功能,借此增强诱变速率、品质或程度的方法。举例来说,可以改变或调节细胞,借此使其在DNA修复、诱变剂代谢、诱变剂敏感性、基因组稳定性或其组合方面出现功能异常或缺陷。因此,通常维持基因组稳定性的基因功能的干扰可以用于增强诱变。干扰的代表性目标包括(但不限于)DNA连接酶I(本特雷(Bentley)等人,2002)和酪蛋白激酶I(美国专利第6,060,296号)。

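In some embodiments, site-directed mutagenesis (e.g., primer-directed mutagenesis using a commercially available kit such as the Transformer Site-Directed Mutagenesis Kit (Clontech)) is used to make multiple changes throughout a nucleic acid sequence in order to generate nucleic acids encoding a lyase of the present disclosure.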

暴露于一或多种诱变剂后发生基因修饰的频率可以通过改变处理剂量和/或重复次数来调节,并且可以根据具体应用来定制。

因此,在一些实施例中,如本文所用,“诱变”包含所属领域中已知的用于诱导突变的所有技术,包括易错PCR诱变、寡核苷酸定向诱变、定点诱变以及利用本文所述的任何技术进行的迭代序列重组。

增加NADPH的基因设计的概述

本公开提供了一种用于产生能够增加所关注的生物分子或产物的产生的微生物(例如细菌)的方法。一般来说,用于产生供产生如本文所提供的任何生物分子用的微生物的方法可能需要通过以下来对宿主微生物进行基因修饰:将一或多个目标基因引入所述宿主微生物中,以产生所述微生物的基因组工程化菌株;在适合于产生所关注的生物分子或产物的条件下培养所述工程化菌株;以及如果所述工程化菌株产生增加量的所关注的生物分子或产物,那么选择所述工程化菌株。所述增加量可以与宿主微生物的野生型菌株相比。增加量可以与不含目标基因文库的成员的宿主微生物菌株相比。目标基因可以在载体中包含单个目标基因,或在相同载体上包含多个目标基因。

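An exemplary workflow of one embodiment of the present disclosure entails identifying a target gene, acquiring or synthesizing nucleic acid (e.g., DNA) for the target gene, and cloning the acquired or synthesized target gene into a suitable vector. Any method known in the art and/or provided herein can be used to assemble or clone the target gene into a suitable vector. The vector can be any vector known in the art and/or provided herein that is compatible with the host microorganism to be used. Once the vector comprising the target gene is assembled, it can be introduced into the host microorganism. The vector can be introduced using any method known in the art and/or provided herein. The host microorganism can be any host microorganism provided herein. Once the vector has been introduced, genetically modified hosts can be selected and the insertion of the target gene can be evaluated. The target gene can be engineered to be inserted into a specific location of the host microorganism's genome. In some cases, the target gene is inserted into a neutral site of the genome that facilitates expression of the target gene without perturbing unintended pathways or processes within the host microorganism. In some cases, the target gene replaces a specific gene within the host microorganism. The specific gene can be the homologous target gene normally present in the host microorganism. Integration sites, such as neutral integration sites, can be determined empirically, such that multiple sites can be tested and a site that permits expression of the integrated target gene without being detrimental to the host cell can be chosen. Integration into a desired site (e.g., a neutral site) can be facilitated by cloning the target gene into a vector comprising portions of sequence homologous to the desired integration site (i.e., homology arms), followed by a recombination event in the host cell. The target gene can be inserted between the portions of homologous sequence. In certain embodiments, the vector comprises about 2 kb of sequence homologous to the desired integration site. The sequence homologous to the desired site can flank the glyceraldehyde-3-phosphate dehydrogenase (gapA), glutamate dehydrogenase (gdh), aspartate semialdehyde dehydrogenase (asd), dihydropicolinate reductase (dapB), and/or meso-diaminopimelate dehydrogenase (ddh) gene insert, such that a first portion of the sequence is upstream (i.e., 5') of the gene insert and a second portion of the sequence is downstream (i.e., 3') of the gene insert. In other embodiments, the vector comprises about 4 kb of sequence homologous to the desired integration site. In this embodiment, the vector comprises about 2 kb of sequence homologous to the desired integration site upstream (i.e., 5') of the gapA, gdh, asd, dapB, and/or ddh gene insert and about 2 kb of sequence homologous to the desired integration site downstream (i.e., 3') of the gapA, gdh, asd, dapB, and/or ddh gene insert. In some embodiments, integration is performed by a single-crossover integration followed by loop-out of the plasmid backbone, the loop-out being driven by counter-selection on a marker present in the vector backbone. In some embodiments, the target gene is any gapA gene known in the art and/or provided herein. In other embodiments, the target gene is any nicotinamide nucleotide transhydrogenase gene known in the art and/or provided herein. In other embodiments, the target gene is any gdh, asd, dapB, and/or ddh gene known in the art and/or provided herein. In some embodiments, the target gene is any gapA gene known in the art and/or provided herein, and/or any nicotinamide nucleotide transhydrogenase gene known in the art and/or provided herein, and/or any gdh, asd, dapB, and ddh gene known in the art and/or provided herein. In other embodiments, the target gene is any thrA, thrB, thrC, and/or ltaE gene known in the art and/or provided herein. In other embodiments, the target gene is any pyc gene known in the art and/or provided herein.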
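The homology-arm layout described in this workflow (an upstream arm, the gene insert, and a downstream arm, each arm homologous to the genome on either side of the integration site) can be sketched as simple string slicing. This is only an in-silico illustration under stated assumptions: the function name `build_integration_cassette`, the toy genome, and the 8-nucleotide arms are hypothetical, whereas the text describes arms of about 2 kb each. It does not model vector backbones, markers, or the recombination event itself.

```python
def build_integration_cassette(genome, site, arm_length, insert):
    """Return upstream arm + insert + downstream arm for a chosen site.

    `site` is the 0-based index in `genome` where the insert should land.
    """
    if site - arm_length < 0 or site + arm_length > len(genome):
        raise ValueError("integration site is too close to the sequence end")
    upstream_arm = genome[site - arm_length:site]      # 5' of the insert
    downstream_arm = genome[site:site + arm_length]    # 3' of the insert
    return upstream_arm + insert + downstream_arm


# Toy example with short placeholder sequences.
toy_genome = "ACGT" * 10           # 40 nt stand-in for a genomic region
toy_insert = "GGGCCC"              # stand-in for the target gene
cassette = build_integration_cassette(toy_genome, site=20, arm_length=8,
                                      insert=toy_insert)
print(cassette)  # ACGTACGTGGGCCCACGTACGT
```

In practice, the arms would be taken from the host genome sequence around an empirically chosen neutral site, and the same slicing logic applies with `arm_length` set to roughly 2000.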

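Insertion can be assessed using any method known in the art, for example by amplifying and/or sequencing the genome of the genetically modified microorganism or a portion thereof. In some cases, the methods provided herein also entail removing or looping out the selection marker by counter-selection as described herein. Loop-out can be performed using any of the methods provided herein.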

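After insertion of the target gene has been assessed and, optionally, the selection marker has been removed, the genetically modified strain can be evaluated for its ability to produce the biomolecule or product of interest. An optional step before evaluation is to expand the strain. Expansion can entail culturing the genetically modified strain on plates or in wells of a multi-well plate in growth medium suitable for expansion. The evaluation step can entail culturing the genetically modified strain on plates or in wells of a multi-well plate under growth medium and conditions designed to mimic the actual conditions for producing the biomolecule or product of interest. In some cases, the growth medium in this step is suitable for producing a biomolecule or product of interest derived from the metabolic processing of glucose. If, as determined from the evaluation step, the genetically modified strain has or is predicted to have a desired or threshold production rate or yield of the biomolecule or product of interest, the strain can be selected and placed in cold storage. The prediction can be based on measuring the amounts of product of interest and biomass formed at multiple time points during culture of the strain, and using those measurements to predict how the strain will perform under expanded or larger-scale conditions (e.g., fermentation conditions). In one embodiment, the prediction is based on a linear regression analysis of strain performance during the evaluation method.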
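The linear-regression prediction mentioned above can be illustrated with an ordinary least-squares fit. This is a minimal sketch, not the method of the disclosure: the text does not specify which variables are regressed, so the sketch assumes a single plate-scale measurement is used to predict a larger-scale value, and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted larger-scale performance for a new plate-scale value x."""
    return intercept + slope * x
```

In use, `xs` would hold plate-scale measurements for strains whose larger-scale performance `ys` is already known, and `predict` would then be applied to a newly built strain before deciding whether to advance it.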

在一些情况下,将具有或预测产生所需或临限生产速率或产率的所关注的生物分子或产物的经过基因修饰的菌株转移至较大培养物中用于产生所关注的生物分子或产物的条件(例如发酵条件)下或在其中生长。此步骤可以用以确定所选菌株是否可以如在用于产生所关注的生物分子或产物的实际条件下预测般表现。在一些情况下,本文提供的用于引入和评估来自目标基因文库(例如本文提供的目标基因文库)的每个目标基因的步骤针对来自文库的每个目标基因重复,以选择产生所需或临限产率和/或生产速率的所关注的生物分子或产物的经过基因修饰的微生物的一或多个菌株。

在一些实施例中,所关注的生物分子或产物通过微生物来源于葡萄糖和其代谢加工,使得本文所提供的方法需要产生如下微生物菌株,其可以产生增加量的来源于菌株对葡萄糖的代谢加工的所关注的生物分子或产物。在某些实施例中,本文所提供的方法需要引入一或多个与赖氨酸生物合成有关的目标基因。在其它实施例中,本文所提供的方法需要引入一或多个与宿主细胞中的NADPH产生有关的目标基因。在其它实施例中,本文所提供的方法需要引入一或多个与减少宿主细胞的NADPH利用有关的目标基因。在一些实施例中,目标基因是gapA基因,从而在本文所提供的方法中将gapA基因引入宿主微生物中。gapA基因可以是宿主微生物中的异源基因。

在其它实施例中,目标基因是烟酰胺核苷酸转氢酶基因,从而在本文所提供的方法中将烟酰胺核苷酸转氢酶基因引入宿主微生物中。

在许多生物体中,三羧酸(tricarboxylic acid,TCA)循环中间物可以直接从丙酮酸中再生。举例来说,在一些细菌中发现但在大肠杆菌中未发现的丙酮酸羧化酶(pyc)通过利用羧基生物素将丙酮酸进行羧基化来介导草酰乙酸的形成。

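In other embodiments, the target gene is a pyruvate carboxylase gene, such that a pyruvate carboxylase (pyc) gene is introduced into the host microorganism in the methods provided herein. The pyc gene can be heterologous to the host microorganism. In certain embodiments, the pyc gene is selected from the sequences disclosed in U.S. Patent No. 6,171,833. In one embodiment, the pyc gene is derived from Rhizobium etli (R. etli). In one embodiment, the pyc gene is derived from the genus Corynebacterium. In one embodiment, the target organism is E. coli. In one embodiment, the target organism is of the genus Corynebacterium. In one embodiment, a heterologous variant of pyc is expressed in a host cell lacking endogenous pyc. In one embodiment, a heterologous variant of pyc is expressed in a host cell having endogenous pyc. In one embodiment, expression of endogenous pyc is increased by genetically modifying the locus comprising pyc to include a strong promoter operably linked to pyc. In some embodiments, pyc expression is modulated by selecting a promoter from a promoter ladder. In one embodiment, expression of native PYC is modulated by inserting a promoter element operably linked to the native pyc gene. In one embodiment, expression of native PYC is tuned by inserting each of several promoter elements from a promoter ladder operably linked to the native pyc gene. In one embodiment, PYC expression is increased by overexpressing a heterologous pyc gene. In one embodiment, the heterologous pyc gene is the C. glutamicum pyc gene. In one embodiment, C. glutamicum pyc is operably linked to a strong promoter. In one embodiment, C. glutamicum pyc is operably linked to each of several promoter elements from a promoter ladder, and PYC expression is tuned by selecting the promoter element that yields the highest amount of the desired product, such as threonine.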

在其它实施例中,目标基因是gdh、asd、dapB或ddh基因中的一或多种,从而在本文所提供的方法中将gdh、asd、dapB或ddh基因引入宿主微生物中。gdh、asd、dapB或ddh基因中的一或多种可以是宿主微生物中的异源基因。在某些实施例中,在本文所提供的方法中将gdh、asd、dapB和ddh所有四种基因引入宿主微生物中。

在某些实施例中,在本文所提供的方法中gapA基因与烟酰胺核苷酸转氢酶基因都引入宿主微生物中。

在其它实施例中,在本文所提供的方法中gapA基因以及选自gdh、asd、dapB和ddh中的一或多种基因引入宿主微生物中。在其它实施例中,在本文所提供的方法中烟酰胺核苷酸转氢酶基因以及选自gdh、asd、dapB和ddh中的一或多种基因引入宿主微生物中。

在其它实施例中,在本文所提供的方法中gapA、烟酰胺核苷酸转氢酶基因以及选自gdh、asd、dapB和ddh中的一或多种基因同时引入宿主微生物中。

在一个实施例中,gapA基因和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因引入宿主微生物中增加宿主微生物中NADPH的量。在某些方面,宿主微生物中NADPH的产生增加。在其它方面,宿主微生物中NADPH的利用减少。在某些实施例中,宿主微生物中增加量的NADPH用以增加所关注的生物分子或产物的合成。本文所提供的方法产生的所关注的生物分子或产物可以是葡萄糖产生的任何商品。在一些情况下,所关注的生物分子或产物是小分子、氨基酸、有机酸或醇。氨基酸可以是酪氨酸、苯丙氨酸、色氨酸、天冬氨酸、天冬酰胺、苏氨酸、异亮氨酸、甲硫氨酸或赖氨酸。有机酸可以是丁二酸、乳酸或丙酮酸。醇可以是乙醇或异丁醇。在特定实施例中,所关注的生物分子或产物是氨基酸。在特定方面,氨基酸是赖氨酸。在某些方面,赖氨酸是L-赖氨酸。在特定方面,氨基酸是苏氨酸。在某些方面,苏氨酸是L-苏氨酸。

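In one embodiment, the host strain is a bacterial strain modified by insertion of a thrLABC regulon (e.g., the thrLABC regulon of E. coli K-12 strain W3110 (SEQ ID NO: 76)). In one embodiment, the host strain is a bacterial strain modified by insertion of a thrABC regulon (e.g., the thrLABC regulon of E. coli K-12 strain W3110 modified by deletion of the thrL leader sequence (SEQ ID NO: 77)). In one embodiment, the host strain is a bacterial strain modified by deletion of a region of the bacterial genome encoding L-threonine 3-dehydrogenase (tdh) or a homolog thereof.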

利用增加NADPH的文库进行微生物基因工程化

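In one embodiment, the disclosed microbial genome engineering methods use libraries of gapA genes and/or nicotinamide nucleotide transhydrogenase genes and/or one or more genes selected from gdh, asd, dapB, and ddh and/or TA genes and/or pyc genes. gapA genes can be selected on the basis of their ability to use NADP as a cofactor. In certain embodiments, the coenzyme specificity of gapA is broadened. Thus, in some aspects, gapA has dual specificity for NADP and NAD. In some aspects, gapA uses NADP in preference to NAD. In other aspects, gapA has equal preference for NADP and NAD. Nicotinamide nucleotide transhydrogenase genes can be selected on the basis of their ability to convert NADH to NADPH. gdh, asd, dapB, or ddh can be selected on the basis of their ability to use NADH as a cofactor. In certain embodiments, the coenzyme specificity of gdh, asd, dapB, and/or ddh is broadened. Thus, in some aspects, gdh, asd, dapB, and/or ddh has dual specificity for NADH and NADPH. In some aspects, gdh, asd, dapB, and/or ddh uses NADH in preference to NADPH. In other aspects, gdh, asd, dapB, and/or ddh has equal preference for NADH and NADPH. TA genes can be selected on the basis of their ability to metabolize threonine more slowly or to produce threonine. In certain embodiments, the substrate specificity of TA is broadened. Thus, in some aspects, TA has dual specificity for glycine and serine. In some aspects, TA uses serine in preference to glycine. In other aspects, TA has equal preference for serine and glycine. pyc can be selected on the basis of its ability to convert pyruvate to oxaloacetate.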

在一些情况下,利用gapA文库、烟酰胺核苷酸转氢酶文库、gdh、asd、dapB和/或ddh和/或TA文库和/或pyc文库或这些文库的任何组合对微生物进行工程化。在一些实施例中,文库含有多个嵌合构建体***物,使得文库中的每个***物包含gapA基因、烟酰胺核苷酸转氢酶基因和选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因。在工程化后,可以针对所得结果,例如如本文所提供的产物从葡萄糖中的产生有效地筛选或评估微生物。利用本文提供的文库界定具体基因组改变并且然后测试/筛选具有所述改变的宿主微生物基因组的此方法可以依照有效和迭代的方式进行,并且可以用于鉴别gapA和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因的特定组合,所述特定组合在宿主细胞中的表达从葡萄糖产生所需或临限水平的所关注的生物分子或产物。
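The idea of testing particular combinations of library members can be made concrete by enumerating the design space. The sketch below assumes each library is a plain list of variant names and that any gene may be left out of a given design; the function name `enumerate_designs` and the variant labels used in the usage note are hypothetical, not taken from the disclosure.

```python
from itertools import product


def enumerate_designs(libraries, allow_absent=True):
    """List every combination of one variant per gene.

    libraries    : dict mapping a gene name to a list of variant names
    allow_absent : if True, each gene may also be omitted from a design
    Returns a list of dicts; the empty design is skipped.
    """
    genes = sorted(libraries)
    options = []
    for gene in genes:
        choices = list(libraries[gene])
        if allow_absent:
            choices = [None] + choices   # None means "gene not included"
        options.append(choices)
    designs = []
    for combo in product(*options):
        design = {g: v for g, v in zip(genes, combo) if v is not None}
        if design:
            designs.append(design)
    return designs
```

For example, `enumerate_designs({"gapA": ["gapA_v1", "gapA_v2"], "pyc": ["pyc_v1"]})` yields five designs. The count grows multiplicatively with library size, which is one reason the text stresses efficient, iterative screening.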

在某些实施例中,如本文所提供的用于本文所提供的方法的每个gapA基因或烟酰胺核苷酸转氢酶基因或选自gdh、asd、dapB和ddh的一或多种基因处于原生启动子或本文提供的任一启动子多核苷酸的控制下或功能性地连接至其。“启动子多核苷酸”或“启动子”或“具有启动子活性的多核苷酸”可以意指当功能性地连接至待转录的多核苷酸时决定编码多核苷酸(例如gapA基因或烟酰胺核苷酸转氢酶基因或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因)的转录起始点和频率,由此实现受影响的控制多核苷酸的表达强度的多核苷酸、优选脱氧核糖核苷酸,或核酸、优选脱氧核糖核酸(DNA)。在一些实施例中,包含gapA基因和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因的文库中的每个gapA基因和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因处于相同或同一启动子的控制下。在其它实施例中,包含葡萄糖gapA基因和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因的文库中的每个gapA基因和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因处于独立或不同启动子的控制下。在其它实施例中,包含目标基因的嵌合构建体的文库中的嵌合构建体中的每个目标基因处于相同或同一启动子的控制下。在另外的实施例中,包含目标基因的嵌合构建体的文库中的嵌合构建体中的每个目标基因处于独立或不同启动子的控制下。

启动子梯

在一些实施例中,本公开教示了选择具有最佳表达特性以调节宿主微生物中一或多种酶的表达并对总宿主菌株生产率产生有益作用的启动子的方法。

启动子调控基因转录速率并且可以通过多种方式影响转录。举例来说,不论内部或外部细胞条件如何,组成性启动子均引导其关联基因按恒定速率转录,而可调控启动子则取决于内部和/或外部细胞条件,例如生长速率、温度、对特定环境化学物质的响应等增加或降低基因转录的速率。启动子可以从其正常细胞情境中分离出来且进行工程化以调控几乎任何基因的表达,从而能够有效改变细胞生长、产物产量和/或所关注的其它表型。

在一些实施例中,本公开教示了用于产生启动子梯文库以供下游基因设计方法使用的方法。举例来说,在一些实施例中,本公开教示了鉴别一或多种启动子和/或在宿主细胞内产生一或多种启动子的变体的方法,所述启动子和/或变体展现了一系列表达强度或优良的调控特性。已经鉴别和/或产生的这些启动子的特定组合可以一起分组成启动子梯,下文更详细地解释此。

在一些实施例中,本公开教示了启动子梯的使用。在一些实施例中,本公开的启动子梯包含展现连续系列的表达谱的启动子。举例来说,在一些实施例中,启动子梯通过鉴别响应于刺激而展现一系列表达强度的天然、原生或野生型启动子或通过组成性表达来产生。已经鉴别的这些启动子可以一起分组成启动子梯。

在其它实施例中,本公开教示了展现跨越不同条件的一系列表达谱的启动子梯的产生。举例来说,在一些实施例中,本公开教示了具有在整个发酵的不同阶段的表达峰分布的启动子梯的产生。在其它实施例中,本公开教示了具有响应于特定刺激的不同表达峰动力学的启动子梯的产生。所属领域的技术人员将认识到,本公开的调控性启动子梯可以代表任一或多种调控概况。

在一些实施例中,本公开的启动子梯被设计成以可预测的方式跨越一系列连续响应扰动基因表达。在一些实施例中,启动子梯的连续性质赋予菌株改良程序额外的预测能力。举例来说,在一些实施例中,所选代谢途径的交换启动子或终止序列可以产生宿主细胞性能曲线,此鉴别最佳表达率或表达谱;产生如下菌株,其中靶向基因不再是具体反应或基因级联的限制因素,同时还避免了在不适当情形下发生不必要的过表达或错误表达。在一些实施例中,启动子梯通过鉴别展现所期望概况的天然、原生或野生型启动子来产生。在其它实施例中,通过使天然存在的启动子发生突变以衍生多种突变启动子序列来产生启动子梯。测试这些突变启动子中的每一种对目标基因表达的影响。在一些实施例中,测试所编辑的启动子在多种条件下的表达活性,从而记录/表征/标注每种启动子变体的活性并存储于数据库中。随后将所得到的经编辑的启动子变体组织成启动子梯,所述启动子梯基于其表达强度进行排列(例如高表达性变体靠近顶部,且减弱的表达靠近底部,因此产生术语“梯”)。
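The ranking step described above, in which characterized promoter variants are ordered by expression strength with strong variants at the top, can be sketched as follows. The strengths are assumed to be measured values supplied by the user; the function name `build_ladder` and the choice to pick rungs evenly by rank (rather than by strength) are assumptions made for illustration.

```python
def build_ladder(strengths, rungs):
    """Pick `rungs` promoters spanning the measured strength range.

    strengths : dict mapping promoter name to measured expression strength
    rungs     : number of ladder positions wanted (at least 2)
    Returns (name, strength) pairs ordered from strongest to weakest.
    """
    if rungs < 2:
        raise ValueError("a ladder needs at least two rungs")
    ranked = sorted(strengths.items(), key=lambda kv: kv[1], reverse=True)
    if rungs >= len(ranked):
        return ranked
    # Spread the chosen ranks evenly between the strongest and the weakest.
    step = (len(ranked) - 1) / (rungs - 1)
    indices = [round(i * step) for i in range(rungs)]
    return [ranked[i] for i in indices]
```

Picking by rank always keeps the strongest and weakest variants; a ladder intended to sample expression levels evenly might instead choose rungs by spacing on a log-strength scale.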

在一些实施例中,本公开教示了作为已鉴别的天然存在的启动子与突变变体启动子的组合的启动子梯。

在一些实施例中,本公开教示了鉴别符合以下两种标准的天然、原生或野生型启动子的方法:1)代表组成性启动子梯;以及2)可以由短DNA序列编码,理想地,小于100个碱基对。在一些实施例中,本公开的组成性启动子展现跨越两种所选生长条件的恒定基因表达(典型地在工业培育期间所经历的条件间进行比较)。在一些实施例中,本公开的启动子将由约60个碱基对核心启动子和长度在26个碱基对与40个碱基对之间的5'UTR组成。

在一些实施例中,选择前述已鉴别的天然存在的启动子序列中的一或多种用于基因编辑。在一些实施例中,通过上文所述的任一种突变方法编辑天然启动子。在其它实施例中,本公开的启动子通过合成具有所期望序列的新启动子变体来编辑。

以下申请的整个公开内容以引入的方式并入本文中:美国申请第15/396,230号(美国公开第US 2017/0159045 A1号);PCT/US2016/065465(WO 2017/100377 A1);美国申请第15/140,296号(US 2017/0316353 A1);PCT/US2017/029725(WO 2017/189784 A1);PCT/US2016/065464(WO 2017/100376 A2);美国临时申请第62/431,409号;美国临时申请第62/264,232号;以及美国临时申请第62/368,786号。

本公开的启动子的非详尽性清单提供于下表1中。启动子序列中之每一个都可以称为异源启动子或异源启动子多核苷酸。

Table 1. Selected promoter sequences of the present disclosure.

SEQ ID No.   Short name   Promoter name
59           P1           Pcg0007_lib_39
60           P2           Pcg0007
61           P3           Pcg1860
62           P4           Pcg0755
63           P5           Pcg0007_265
64           P6           Pcg3381
65           P7           Pcg0007_119
66           P8           Pcg3121

在一些实施例中,本公开的启动子展现与来自上表的启动子至少100%、99%、98%、97%、96%、95%、94%、93%、92%、91%、90%、89%、88%、87%、86%、85%、84%、83%、82%、81%、80%、79%、78%、77%、76%或75%的序列同一性。
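A percent-identity threshold such as the 75% floor above only has meaning once the identity calculation is fixed. The sketch below assumes the two sequences have already been aligned to equal length with `-` as the gap character, and counts identity over all alignment columns that are not gaps in both sequences; other conventions exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")   # ignore columns that are all gap
    ]
    if not columns:
        raise ValueError("no comparable alignment columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=75.0):
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```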

在一些情况下,启动子梯可以用于选自gapA文库、烟酰胺核苷酸转氢酶文库、gdh、asd、dapB和/或ddh和/或TA文库和/或pyc文库或这些文库的任何组合的基因前面。在一些实施例中,启动子梯的用途包含调节选自gapA文库、烟酰胺核苷酸转氢酶文库、gdh、asd、dapB和/或ddh和/或TA文库和/或pyc文库或这些文库的任何组合的基因的表达。在一些实施例中,启动子梯的用途包含微调选自gapA文库、烟酰胺核苷酸转氢酶文库、gdh、asd、dapB和/或ddh和/或TA文库和/或pyc文库或这些文库的任何组合的基因的表达。在工程化后,可以针对所得结果,例如如本文所提供的产物从葡萄糖中的产生有效地筛选或评估微生物。利用本文提供的启动子梯来产生其中基因实现具体表达水平的宿主并且然后测试/筛选具有所述改变的宿主微生物基因组的此方法可以依照有效和迭代的方式执行,并且可以用于鉴别对于gapA和/或烟酰胺核苷酸转氢酶基因和/或选自gdh、asd、dapB和ddh的一或多种基因和/或TA基因和/或pyc基因来说最佳的特定基因表达水平,由此在宿主细胞中所述基因表达水平下的表达从葡萄糖产生所需或临限水平的所关注的生物分子或产物。

甘油醛-3-磷酸脱氢酶文库

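In certain embodiments, provided herein are libraries of gapA genes for use in the methods provided herein. A library of gapA genes can comprise one or more gapA genes. Each gapA gene in the library can be a native form or a mutated form of a gapA gene. The mutated form can comprise one or more mutations selected from an insertion, a deletion, a single nucleotide polymorphism (SNP), or a translocation. The gapA gene can be any gapA gene from a prokaryotic cell (i.e., bacteria and/or archaea) known in the art. The gapA gene can be any gapA gene from a eukaryotic cell (e.g., a fungus) known in the art. gapA can be regarded as any protein comprising NAD- and/or NADP-dependent GAPDH activity. For example, gapA as used herein can be any enzyme that converts glyceraldehyde-3-phosphate to glycerate-1,3-bisphosphate. The host cell can be any host cell provided herein. In some embodiments, the library of gapA genes comprises gapA genes from any strain, species, or subspecies of the following: Mycobacterium (e.g., Mycobacterium smegmatis), Streptomyces (e.g., Streptomyces coelicolor), Zymomonas (e.g., Zymomonas mobilis), Synechocystis (e.g., Synechocystis sp. PCC6803), Bifidobacterium (e.g., Bifidobacterium longum), Escherichia (e.g., E. coli), Bacillus (e.g., Bacillus subtilis), Corynebacterium (e.g., C. glutamicum), Saccharomyces (e.g., S. cerevisiae), or a combination thereof.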

在一些实施例中,本公开的gapA酶展现与本文提供的gapA酶至少100%、99%、98%、97%、96%、95%、94%、93%、92%、91%、90%、89%、88%、87%、86%、85%、84%、83%、82%、81%、80%、79%、78%、77%、76%或75%的序列同一性。

文库中的每个gapA基因可以功能性地连接至其原生启动子或其原生启动子的突变形式或处于所述启动子的控制下。文库中的每个gapA基因可以功能性地连接至本文提供的任何启动子或受所述启动子控制。gapA基因的文库中的每个gapA基因可以存在于嵌合构建体中,使得基因可以侧接一或多个调控序列和/或与宿主细胞基因组中存在的序列同源的序列。与宿主细胞中存在的序列同源的序列可以促进gapA基因整合至宿主细胞基因组的包含互补序列的位点或基因座中。整合可以经由重组事件进行。调控序列可以是所属领域中已知或本文提供的任何调控序列,例如宿主细胞的遗传机制所用的启动子、起始、终止、信号、分泌和/或终止序列。

烟酰胺核苷酸转氢酶文库

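In certain embodiments, provided herein are libraries of nicotinamide nucleotide transhydrogenase genes for use in the methods provided herein. A library of nicotinamide nucleotide transhydrogenase genes can comprise one or more nicotinamide nucleotide transhydrogenase genes. Each nicotinamide nucleotide transhydrogenase gene in the library can be a native form or a mutated form of a transhydrogenase gene. The mutated form can comprise one or more mutations selected from an insertion, a deletion, a single nucleotide polymorphism (SNP), or a translocation. The nicotinamide nucleotide transhydrogenase gene can be any transhydrogenase gene from a prokaryotic cell (i.e., bacteria and/or archaea) known in the art. The nicotinamide nucleotide transhydrogenase gene can be any transhydrogenase gene from a eukaryotic cell (e.g., a fungus) known in the art. A nicotinamide nucleotide transhydrogenase can be any enzyme that converts NADH to NADPH. The host cell can be any host cell provided herein. In some embodiments, the library of nicotinamide nucleotide transhydrogenase genes comprises transhydrogenase genes from any strain, species, or subspecies of the following: Mycobacterium (e.g., Mycobacterium smegmatis), Streptomyces (e.g., Streptomyces coelicolor), Zymomonas (e.g., Zymomonas mobilis), Synechocystis (e.g., Synechocystis sp. PCC6803), Bifidobacterium (e.g., Bifidobacterium longum), Escherichia (e.g., E. coli), Bacillus (e.g., Bacillus subtilis), Corynebacterium (e.g., C. glutamicum), Saccharomyces (e.g., S. cerevisiae), or a combination thereof.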

在一些实施例中,本公开的烟酰胺核苷酸转氢酶展现与本文提供的转氢酶至少100%、99%、98%、97%、96%、95%、94%、93%、92%、91%、90%、89%、88%、87%、86%、85%、84%、83%、82%、81%、80%、79%、78%、77%、76%或75%的序列同一性。

文库中的每个烟酰胺核苷酸转氢酶基因可以功能性地连接至其原生启动子或其原生启动子的突变形式或处于所述启动子的控制下。文库中的每个烟酰胺核苷酸转氢酶基因可以功能性地连接至本文提供的任何启动子或受所述启动子控制。烟酰胺核苷酸转氢酶基因的文库中的每个烟酰胺核苷酸转氢酶基因可以存在于嵌合构建体中,使得基因可以侧接一或多个调控序列和/或与宿主细胞基因组中存在的序列同源的序列。与宿主细胞中存在的序列同源的序列可以促进烟酰胺核苷酸转氢酶基因整合至宿主细胞基因组的包含互补序列的位点或基因座中。整合可以经由重组事件进行。调控序列可以是所属领域中已知或本文提供的任何调控序列,例如宿主细胞的遗传机制所用的启动子、起始、终止、信号、分泌和/或终止序列。

gdh、asd、dapB和/或ddh文库

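In certain embodiments, provided herein are libraries of gdh, asd, dapB, and ddh genes for use in the methods provided herein. A library of gdh, asd, dapB, and ddh genes can comprise one or more gdh, asd, dapB, and ddh genes. Each gdh, asd, dapB, or ddh gene in the library can be a native form or a mutated form of a gdh, asd, dapB, or ddh gene, respectively. The mutated form can comprise one or more mutations selected from an insertion, a deletion, a single nucleotide polymorphism (SNP), or a translocation. The gdh, asd, dapB, or ddh gene can be any gdh, asd, dapB, or ddh gene from a prokaryotic cell (i.e., bacteria and/or archaea) known in the art. The asd, dapB, or ddh gene can be any asd, dapB, or ddh gene from a eukaryotic cell (e.g., a fungus) known in the art. gdh can be regarded as any protein comprising NADPH- and/or NADH-dependent glutamate dehydrogenase activity. For example, gdh as used herein can be any enzyme that converts 2-oxoglutarate (alpha-ketoglutarate) to glutamate. asd can be regarded as any protein comprising NADPH- and/or NADH-dependent aspartate semialdehyde dehydrogenase activity. For example, asd as used herein can be any enzyme that converts aspartyl phosphate to aspartate semialdehyde. dapB can be regarded as any protein comprising NADPH- and/or NADH-dependent dihydrodipicolinate reductase activity. For example, dapB as used herein can be any enzyme that converts dihydrodipicolinate to tetrahydrodipicolinate. ddh can be regarded as any protein comprising NADPH- and/or NADH-dependent meso-diaminopimelate dehydrogenase activity. For example, ddh as used herein can be any enzyme that catalyzes the direct conversion of tetrahydrodipicolinate to meso-diaminopimelate.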

宿主细胞可以是本文提供的任何宿主细胞。在一些实施例中,asd、dapB或ddh基因的文库分别包含来自以下各菌的任何菌株/物种/亚种的asd、dapB或ddh基因:分枝杆菌属(例如耻垢分支杆菌)、链霉菌(例如天蓝色链霉菌)、发酵单胞菌属(例如运动发酵单胞菌)、集胞藻属(例如集胞藻属PCC6803)、双歧杆菌属(例如长双岐杆菌)、埃希氏杆菌属(例如大肠杆菌)、芽孢杆菌属(例如枯草芽孢杆菌)、棒状杆菌属(例如谷氨酸棒状杆菌)、酵母菌属(例如酿酒酵母)或其组合。

在一些实施例中,本公开的asd、dapB或ddh酶分别展现与本文提供的asd、dapB或ddh酶至少100%、99%、98%、97%、96%、95%、94%、93%、92%、91%、90%、89%、88%、87%、86%、85%、84%、83%、82%、81%、80%、79%、78%、77%、76%或75%的序列同一性。

文库中的每个asd、dapB或ddh基因可以功能性地连接至其原生启动子或其原生启动子的突变形式或处于所述启动子的控制下。文库中的每个asd、dapB或ddh基因可以功能性地连接至本文提供的任何启动子或受所述启动子控制。asd、dapB和/或ddh基因的文库中的每个asd、dapB和/或ddh基因可以存在于嵌合构建体中,使得基因可以侧接一或多个调控序列和/或与宿主细胞基因组中存在的序列同源的序列。与宿主细胞中存在的序列同源的序列可以促进asd、dapB或ddh基因整合至宿主细胞基因组的包含互补序列的位点或基因座中。整合可以经由重组事件进行。调控序列可以是所属领域中已知或本文提供的任何调控序列,例如宿主细胞的遗传机制所用的启动子、起始、终止、信号、分泌和/或终止序列。

TA文库

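In certain embodiments, provided herein are libraries of TA genes for use in the methods provided herein. A library of TA genes can comprise one or more TA genes. Each TA gene in the library can be a native form or a mutated form of a TA gene. The mutated form can comprise one or more mutations selected from an insertion, a deletion, a single nucleotide polymorphism (SNP), or a translocation. The TA gene can be any TA gene from a prokaryotic cell (i.e., bacteria and/or archaea) known in the art. The TA gene can be any TA gene from a eukaryotic cell (e.g., a fungus) known in the art. TA can be regarded as any protein comprising threonine aldolase activity. For example, TA as used herein can be any enzyme that converts threonine to acetaldehyde and glycine. In one embodiment, the TA converts threonine to acetaldehyde and glycine at a slower rate than the endogenous TA. In one embodiment, the TA converts acetaldehyde and glycine to threonine.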

宿主细胞可以是本文提供的任何宿主细胞。在一些实施例中,TA基因的文库分别包含来自以下各菌的任何菌株/物种/亚种的TA基因:分枝杆菌属(例如耻垢分支杆菌)、链霉菌(例如天蓝色链霉菌)、发酵单胞菌属(例如运动发酵单胞菌)、集胞藻属(例如集胞藻属PCC6803)、双歧杆菌属(例如长双岐杆菌)、埃希氏杆菌属(例如大肠杆菌)、芽孢杆菌属(例如枯草芽孢杆菌)、棒状杆菌属(例如谷氨酸棒状杆菌)、酵母菌属(例如酿酒酵母)或其组合。

在一些实施例中,本公开的TA酶展现与本文提供的TA酶至少100%、99%、98%、97%、96%、95%、94%、93%、92%、91%、90%、89%、88%、87%、86%、85%、84%、83%、82%、81%、80%、79%、78%、77%、76%或75%的序列同一性。

文库中的每个TA基因可以功能性地连接至其原生启动子或其原生启动子的突变形式或处于所述启动子的控制下。文库中的每个TA基因可以功能性地连接至本文提供的任何启动子或受所述启动子控制。TA基因的文库中的每个TA基因可以存在于嵌合构建体中,使得基因可以侧接一或多个调控序列和/或与宿主细胞基因组中存在的序列同源的序列。与宿主细胞中存在的序列同源的序列可以促进TA基因整合至宿主细胞基因组的包含互补序列的位点或基因座中。整合可以经由重组事件进行。调控序列可以是所属领域中已知或本文提供的任何调控序列,例如宿主细胞的遗传机制所用的启动子、起始、终止、信号、分泌和/或终止序列。

pyc文库

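In certain embodiments, provided herein are libraries of pyc genes for use in the methods provided herein. A library of pyc genes can comprise one or more pyc genes. Each pyc gene in the library can be a native form or a mutated form of a pyc gene. The mutated form can comprise one or more mutations selected from an insertion, a deletion, a single nucleotide polymorphism (SNP), or a translocation. The pyc gene can be any pyc gene from a prokaryotic cell (i.e., bacteria and/or archaea) known in the art. The pyc gene can be any pyc gene from a eukaryotic cell (e.g., a fungus) known in the art. pyc can be regarded as any protein comprising pyruvate carboxylase activity. For example, pyc as used herein can be any enzyme that converts pyruvate to oxaloacetate.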

宿主细胞可以是本文提供的任何宿主细胞。在一些实施例中,pyc基因的文库分别包含来自以下各菌的任何菌株/物种/亚种的pyc基因:分枝杆菌属(例如耻垢分支杆菌)、链霉菌(例如天蓝色链霉菌)、发酵单胞菌属(例如运动发酵单胞菌)、集胞藻属(例如集胞藻属PCC6803)、双歧杆菌属(例如长双岐杆菌)、芽孢杆菌属(例如枯草芽孢杆菌)、棒状杆菌属(例如谷氨酸棒状杆菌)、酵母菌属(例如酿酒酵母)或其组合。

在一些实施例中,本公开的pyc酶展现与本文提供的分别pyc酶至少100%、99%、98%、97%、96%、95%、94%、93%、92%、91%、90%、89%、88%、87%、86%、85%、84%、83%、82%、81%、80%、79%、78%、77%、76%或75%的序列同一性。

文库中的每个pyc基因可以功能性地连接至其原生启动子或其原生启动子的突变形式或处于所述启动子的控制下。文库中的每个pyc基因可以功能性地连接至本文提供的任何启动子或受所述启动子控制。pyc基因的文库中的每个pyc基因可以存在于嵌合构建体中,使得基因可以侧接一或多个调控序列和/或与宿主细胞基因组中存在的序列同源的序列。与宿主细胞中存在的序列同源的序列可以促进pyc基因整合至宿主细胞基因组的包含互补序列的位点或基因座中。整合可以经由重组事件进行。调控序列可以是所属领域中已知或本文提供的任何调控序列,例如宿主细胞的遗传机制所用的启动子、起始、终止、信号、分泌和/或终止序列。

产生gapA基因的突变形式

如本文所提供,用于本文所提供的方法中的gapA基因可以是其源自的基因的突变形式。突变基因可以按所属领域中已知或本文提供的任何方式突变。

在一些实施例中,本公开教示了通过引入、缺失或置换基因组DNA的所选部分来使细胞群体发生突变。因此,在一些实施例中,本公开教示了用于靶向特定基因座(例如gapA)进行突变的方法。在其它实施例中,本公开教示了利用如ZFN、TALENS或CRISPR等基因编辑技术选择性地编辑目标DNA区域。在细胞群体突变后,靶向突变可以从细胞分离并且随后用于产生gapA基因的文库。

在一些实施例中,本公开教示了在宿主生物体外使所选DNA区域(例如gapA基因)突变。举例来说,在一些实施例中,本公开教示了使原生gapA基因突变。

在一些实施例中,DNA的所选区域是在体外通过天然变体的基因改组或用合成寡核苷酸改组、质粒-质粒重组、病毒质粒重组或病毒-病毒重组来产生。在其它实施例中,基因组区域经由易错PCR或定点诱变产生。

在一些实施例中,在含有gapA基因的所选基因区域中产生突变是利用“再装配PCR”完成。简单来说,合成寡核苷酸引物(寡核苷酸)用于所关注的核酸序列(例如gapA基因)的区段的PCR扩增,使得寡核苷酸的序列与两个区段的接合点重叠。重叠区域的长度典型地是约10至100个核苷酸。所述区段各自用一组这样的引物扩增。接着根据装配方案“再装配”PCR产物。简单来说,在装配方案中,首先通过例如凝胶电泳或尺寸排阻色谱,纯化PCR产物以不含引物。将纯化的产物混合在一起并且经历约1-10个循环的在聚合酶和三磷酸脱氧核苷(dNTP)和适当缓冲盐存在下在缺乏额外引物下(“自引发”)的变性、再粘接和延伸。利用引物侧接基因的后续PCR扩增完整再装配和改组的基因的产量。

在本公开的一些实施例中,例如上文所论述的突变gapA DNA区域富集突变序列,以便更有效地对多种突变谱,即突变的可能组合进行取样。在一些实施例中,通过mutS蛋白质亲和基质(瓦格纳(Wagner)等人,《核酸研究(Nucleic Acids Res.)》23(19):3944-3948(1995);苏(Su)等人,《美国国家科学院院刊(Proc.Natl.Acad.Sci.(U.S.A.))》,83:5057-5061(1986))鉴别突变序列,其中优选在装配反应前进行体外扩增亲和纯化物质的步骤。然后将此扩增物质安放至装配或再装配PCR反应中。

在一些实施例中,在自然界中发现突变gapA DNA区域。

产生烟酰胺核苷酸转氢酶基因的突变形式

如本文所提供,用于本文所提供的方法中的烟酰胺核苷酸转氢酶基因可以是其源自的基因的突变形式。突变基因可以按所属领域中已知或本文提供的任何方式突变。

在一些实施例中,本公开教示了通过引入、缺失或置换基因组DNA的所选部分来使细胞群体发生突变。因此,在一些实施例中,本公开教示了用于靶向特定基因座(例如烟酰胺核苷酸转氢酶)进行突变的方法。在其它实施例中,本公开教示了利用如ZFN、TALENS或CRISPR等基因编辑技术选择性地编辑目标DNA区域。在细胞群体突变后,靶向突变可以从细胞分离并且随后用于产生烟酰胺核苷酸转氢酶基因的文库。

在一些实施例中,本公开教示了在宿主生物体外使所选DNA区域(例如烟酰胺核苷酸转氢酶基因)突变。举例来说,在一些实施例中,本公开教示了使原生烟酰胺核苷酸转氢酶基因突变。

在一些实施例中,DNA的所选区域是在体外通过天然变体的基因改组或用合成寡核苷酸改组、质粒-质粒重组、病毒质粒重组或病毒-病毒重组来产生。在其它实施例中,基因组区域经由易错PCR或定点诱变产生。

在某些实施例中,在含有烟酰胺核苷酸转氢酶基因的所选基因区域中产生突变是利用“再装配PCR”完成。

在一些实施例中,例如上文所论述的突变烟酰胺核苷酸转氢酶DNA区域富集突变序列,以便更有效地对多种突变谱,即突变的可能组合进行取样。在一些实施例中,通过mutS蛋白质亲和基质鉴别突变序列,其中优选在装配反应前进行体外扩增亲和纯化物质的步骤。然后将此扩增物质安放至装配或再装配PCR反应中。

在一些实施例中,在自然界中发现突变烟酰胺核苷酸转氢酶DNA区域。

产生gdh、asd、dapB和/或ddh基因的突变形式

如本文所提供,用于本文所提供的方法中的gdh、asd、dapB或ddh基因可以是其源自的基因的突变形式。突变基因可以按所属领域中已知或本文提供的任何方式突变。

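In some embodiments, the present disclosure teaches mutating cell populations by introducing, deleting, or replacing selected portions of genomic DNA. Thus, in some embodiments, the present disclosure teaches methods for targeting mutations to a specific locus (e.g., gdh, asd, dapB, or ddh). In other embodiments, the present disclosure teaches the use of gene editing technologies such as ZFNs, TALENs, or CRISPR to selectively edit target DNA regions. After mutation of the cell population, the targeted mutations can be isolated from the cells and subsequently used to generate a library of gdh, asd, dapB, and/or ddh genes.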

在一些实施例中,本公开教示了在宿主生物体外使所选DNA区域(例如gdh、asd、dapB或ddh基因)突变。举例来说,在一些实施例中,本公开教示了使原生gdh、asd、dapB或ddh基因突变。

在一些实施例中,DNA的所选区域是在体外通过天然变体的基因改组或用合成寡核苷酸改组、质粒-质粒重组、病毒质粒重组或病毒-病毒重组来产生。在其它实施例中,基因组区域经由易错PCR或定点诱变产生。

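In certain embodiments, generating mutations in selected genetic regions containing a gdh, asd, dapB, and/or ddh gene is accomplished with "reassembly PCR".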

在一些实施例中,例如上文所论述的突变gdh、asd、dapB和/或ddh DNA区域富集突变序列,以便更有效地对多种突变谱,即突变的可能组合进行取样。在一些实施例中,通过mutS蛋白质亲和基质鉴别突变序列,其中优选在装配反应前进行体外扩增亲和纯化物质的步骤。然后将此扩增物质安放至装配或再装配PCR反应中。

在一些实施例中,在自然界中发现突变或变异gdh、asd、dapB和/或ddh DNA区域。在某些实施例中,在包括(但不限于)以下的细菌中发现谷氨酸棒状杆菌ddh的天然存在的变体:口腔放线菌(A.oris)、超嗜热古菌(H.archaeon)、粪芽孢菌属(coprobacillus)、竹节状甲烷鬃毛菌(M.harundinacea)、微核巨球形菌(M.micronuciformis)、反硝化无色杆菌(A.denitrificans)、藤黄微球菌(M.luteus)、粪短杆菌(B.faecium)和肉食杆菌属(carnobacterium)。在某些实施例中,在包括(但不限于)以下的细菌中发现谷氨酸棒状杆菌asd的天然存在的变体:詹氏甲烷球菌(M.jannaschii)、普通索利氏菌(S.usitatus)、内部盐碱湖菌(N.innermongolicus)、嗜热光合绿曲菌(C.aurantiacus)、敏捷乳杆菌(L.agilis)、小鸡双歧杆菌(B.pullorum)、细菌双歧杆菌(B.bacterium)、汉氏粘球菌(M.hansupus)和固氮类芽孢杆菌(P.sabinae)。在一些实施例中,在包括(但不限于)共生梭菌(C.symbiosum)的细菌中发现谷氨酸棒状杆菌gdh的天然存在的变体。在一些实施例中,在包括(但不限于)大肠杆菌的细菌中发现谷氨酸棒状杆菌dapB的天然存在的变体。在某些实施例中,通过在生物体(例如细菌)中进行全基因组同源性搜索,发现谷氨酸棒状杆菌gdh、asd、dapB和/或ddh的天然存在的变体。

产生TA基因的突变形式

如本文所提供,用于本文所提供的方法中的TA基因可以是其源自的基因的突变形式。突变基因可以按所属领域中已知或本文提供的任何方式突变。

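In some embodiments, the present disclosure teaches mutating cell populations by introducing, deleting, or replacing selected portions of genomic DNA. Thus, in some embodiments, the present disclosure teaches methods for targeting mutations to a specific locus (e.g., TA). In other embodiments, the present disclosure teaches the use of gene editing technologies such as ZFNs, TALENs, or CRISPR to selectively edit target DNA regions. After mutation of the cell population, the targeted mutations can be isolated from the cells and subsequently used to generate a library of TA genes.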

在一些实施例中,本公开教示了在宿主生物体外使所选DNA区域(例如TA基因)突变。举例来说,在一些实施例中,本公开教示了使原生TA基因突变。

在一些实施例中,DNA的所选区域是在体外通过天然变体的基因改组或用合成寡核苷酸改组、质粒-质粒重组、病毒质粒重组或病毒-病毒重组来产生。在其它实施例中,基因组区域经由易错PCR或定点诱变产生。

In certain embodiments, generating mutations in selected genetic regions containing a TA gene is accomplished with "reassembly PCR".

在一些实施例中,突变TA DNA区域(例如上文所论述者)富集突变序列,以便更有效地对多种突变谱、即突变的可能组合进行取样。在一些实施例中,通过mutS蛋白质亲和基质鉴别突变序列,其中优选在装配反应前进行体外扩增亲和纯化物质的步骤。然后将此扩增物质安放至装配或再装配PCR反应中。

在一些实施例中,在自然界中发现突变或变异TA DNA区域。在某些实施例中,在包括(但不限于)以下的细菌中发现谷氨酸棒状杆菌TA的天然存在的变体:口腔放线菌、超嗜热古菌、粪芽孢菌属、竹节状甲烷鬃毛菌、微核巨球形菌、反硝化无色杆菌、藤黄微球菌、粪短杆菌和肉食杆菌属。在某些实施例中,通过在生物体(例如细菌)中进行全基因组同源性搜索,发现谷氨酸棒状杆菌TA的天然存在的变体。

产生pyc基因的突变形式

如本文所提供,用于本文所提供的方法中的pyc基因可以是其源自的基因的突变形式。突变基因可以按所属领域中已知或本文提供的任何方式突变。

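In some embodiments, the present disclosure teaches mutating cell populations by introducing, deleting, or replacing selected portions of genomic DNA. Thus, in some embodiments, the present disclosure teaches methods for targeting mutations to a specific locus (e.g., pyc). In other embodiments, the present disclosure teaches the use of gene editing technologies such as ZFNs, TALENs, or CRISPR to selectively edit target DNA regions. After mutation of the cell population, the targeted mutations can be isolated from the cells and subsequently used to generate a library of pyc genes.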

在一些实施例中,本公开教示了在宿主生物体外使所选DNA区域(例如pyc基因)突变。举例来说,在一些实施例中,本公开教示了使原生pyc基因突变。

在一些实施例中,DNA的所选区域是在体外通过天然变体的基因改组或用合成寡核苷酸改组、质粒-质粒重组、病毒质粒重组或病毒-病毒重组来产生。在其它实施例中,基因组区域经由易错PCR或定点诱变产生。

In certain embodiments, generating mutations in selected genetic regions containing a pyc gene is accomplished with "reassembly PCR".

在一些实施例中,例如上文所论述的突变pyc DNA区域富集突变序列,以便更有效地对多种突变谱,即突变的可能组合进行取样。在一些实施例中,通过mutS蛋白质亲和基质鉴别突变序列,其中优选在装配反应前进行体外扩增亲和纯化物质的步骤。然后将此扩增物质安放至装配或再装配PCR反应中。

在一些实施例中,在自然界中发现突变或变异pyc DNA区域。在某些实施例中,在包括(但不限于)以下的细菌中发现谷氨酸棒状杆菌pyc的天然存在的变体:口腔放线菌、超嗜热古菌、粪芽孢菌属、竹节状甲烷鬃毛菌、微核巨球形菌、反硝化无色杆菌、藤黄微球菌、粪短杆菌和肉食杆菌属。在某些实施例中,通过在生物体(例如细菌)中进行全基因组同源性搜索,发现谷氨酸棒状杆菌pyc的天然存在的变体。

包含gapA基因的文库的产生

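In some embodiments, the present disclosure teaches inserting and/or replacing and/or deleting a DNA segment of the host organism comprising a gapA gene. In some aspects, the methods taught herein involve building an oligonucleotide of interest (i.e., a gapA segment) that can be incorporated into the genome of the host organism. In some embodiments, the gapA DNA segments of the present disclosure can be obtained by any method known in the art, including copying or cutting from a known template, mutation, or DNA synthesis. In some embodiments, the present disclosure is compatible with commercially available gene synthesis products for producing DNA sequences (e.g., GeneArt™, GeneMaker™, GenScript™, Anagen™, Blue Heron™, Entelechon™, GeNOsys, Inc., or Qiagen™).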

在一些实施例中,gapA DNA区段被设计成将葡萄糖gapA DNA区段并入宿主生物体的所选DNA区域中(例如添加有用GAPDH活性)。在某些实施例中,所选DNA区域是中性整合位点。在其它实施例中,gapA DNA区段被设计成从宿主生物体的DNA去除原生gapA基因(例如去除原生GAPDH活性)。

在一些实施例中,本发明方法中所用的gapA基因可以使用所属领域中已知的任何酶促或化学合成方法分段合成为寡核苷酸。寡核苷酸可以在固体载体上合成,所述固体载体如可控微孔玻璃(controlled pore glass,CPG)、聚苯乙烯珠粒或由可以含有CPG的热塑性聚合物组成的膜。寡核苷酸还能够在阵列上、在并行的微米尺度上使用微流体(田(Tian)等人,《分子生物系统(Mol.BioSyst.)》,5,714-722(2009))或提供两者组合的已知技术(参见雅各布森(Jacobsen)等人,美国专利申请第2011/0172127号)合成。

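Synthesis on arrays or by microfluidics has an advantage over conventional solid-support synthesis in that it lowers cost by reducing reagent use. The scale required for gene synthesis is low, so the scale of oligonucleotide product synthesized on arrays or by microfluidics is acceptable. However, the synthesized oligonucleotides are of lower quality than those made by solid-support synthesis (see Tian, supra; see also Staehler et al., U.S. Patent Application No. 2010/0216648).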

自从二十世纪八十年代首次描述了传统的四步亚磷酰胺化学方法以来,其已经取得了大量的进步(参见例如丝兹查勒(Sierzchala)等人,《美国化学学会杂志(J.Am.Chem.Soc.)》,125,13427-13441(2003),其使用过氧基阴离子脱除保护基;早川(Hayakawa)等人,美国专利第6,040,439号,其是关于替代保护基;阿杂叶维(Azhayev)等人,《四面体(Tetrahedron)》57,4977-4986(2001),其是关于通用载体;考兹洛维(Kozlov)等人,《核苷、核苷酸和核酸(Nucleosides,Nucleotides,and Nucleic Acids)》,24(5-7),1037-1041(2005),其是关于通过使用大孔隙CPG改良较长寡核苷酸的合成;以及丹哈(Damha)等人,《核酸研究(NAR)》,18,3813-3821(1990),其是关于改良衍生化)。

不论合成的类型如何,所得寡核苷酸接着都可以形成较小的建构嵌段用于较长的多核苷酸(即,gapA基因)。在一些实施例中,较小的寡核苷酸可以使用所属领域中已知的方案连接在一起,如聚合酶链装配(PCA)、连接酶链反应(LCR)和热力学平衡由内而外合成法(TBIO)(参见兹阿尔(Czar)等人,《生物技术趋势(Trends in Biotechnology)》,27,63-71(2009))。在PCA中,在多个循环(典型地约55个循环)中使跨越所期望较长产物的整个长度的寡核苷酸粘接且延长,最终获得全长产物。LCR使用连接酶将两个寡核苷酸连接,所述两个寡核苷酸均粘接到第三寡核苷酸。TBIO合成始于所期望产物的中心并且通过使用重叠寡核苷酸而在两个方向上逐渐地延长,所述重叠寡核苷酸与位于基因的5'端的正向链同源并且与位于基因的3'端的反向链相反。

另一种合成较大双链DNA片段的方法是通过顶链PCR(top-strand PCR,TSP)组合较小寡核苷酸。在此方法中,多个寡核苷酸跨越所期望产物的整个长度并且含有与相邻寡核苷酸重叠的区域。可以使用通用正向和反向引物执行扩增,并且通过多个循环的扩增来形成全长双链DNA产物。此产物接着可以经历任选的差错校正和进一步的扩增,产生所期望的双链DNA片段最终产物。

在TSP的一种方法中,将要组合形成所期望全长产物的较小寡核苷酸集合具有40-200个之间的碱基长度并且彼此重叠至少约15-20个碱基。就实用目的来说,重叠区域的最小长度应该足以确保寡核苷酸的特异性粘接并且具有足够高的解链温度(Tm),以便在所用反应温度下粘接。重叠可以延伸到既定寡核苷酸被相邻寡核苷酸完全叠覆的点。重叠的量似乎对最终产物的品质无任何影响。装配体中的第一个和最后一个寡核苷酸建构嵌段应该含有正向和反向扩增引物的结合位点。在一个实施例中,第一个和最后一个寡核苷酸的末端序列含有互补的相同序列以允许使用通用引物。
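The oligonucleotide layout for top-strand PCR described above (oligos of 40 to 200 bases overlapping their neighbours by at least about 15 to 20 bases) can be sketched as a simple tiling. This is an illustration only: the function name `tile_oligos` and its defaults are hypothetical, all oligos are taken from the top strand, and the sketch does not check melting temperature or add primer-binding sites to the terminal oligos.

```python
def tile_oligos(target, oligo_len=60, overlap=20):
    """Split `target` into top-strand oligos that overlap their neighbours.

    oligo_len : length of each oligo (40 to 200 bases per the text)
    overlap   : minimum overlap between adjacent oligos (15 or more bases)
    The last oligo is shifted so that it ends flush with the target.
    """
    if not 40 <= oligo_len <= 200:
        raise ValueError("oligo_len should be between 40 and 200 bases")
    if not 15 <= overlap < oligo_len:
        raise ValueError("overlap should be at least 15 and below oligo_len")
    if len(target) <= oligo_len:
        return [target]
    step = oligo_len - overlap
    starts = list(range(0, len(target) - oligo_len, step))
    starts.append(len(target) - oligo_len)   # final oligo covers the 3' end
    return [target[s:s + oligo_len] for s in starts]
```

Because the final oligo is shifted back to end at the last base, its overlap with the preceding oligo can exceed the requested minimum, which is consistent with the statement that the amount of overlap does not appear to affect product quality.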

包含烟酰胺核苷酸转氢酶基因的文库的产生

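In some embodiments, the present disclosure teaches inserting and/or replacing and/or deleting a DNA segment of the host organism comprising a nicotinamide nucleotide transhydrogenase gene. In some aspects, the methods taught herein involve building an oligonucleotide of interest (i.e., a nicotinamide nucleotide transhydrogenase segment) that can be incorporated into the genome of the host organism. In some embodiments, the nicotinamide nucleotide transhydrogenase DNA segments of the present disclosure can be obtained by any method known in the art, including copying or cutting from a known template, mutation, or DNA synthesis. In some embodiments, the present disclosure is compatible with commercially available gene synthesis products for producing DNA sequences (e.g., GeneArt™, GeneMaker™, GenScript™, Anagen™, Blue Heron™, Entelechon™, GeNOsys, Inc., or Qiagen™).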

在一些实施例中,烟酰胺核苷酸转氢酶DNA区段被设计成将烟酰胺核苷酸转氢酶DNA区段并入宿主生物体的所选DNA区域中(例如添加有用转氢酶活性)。在某些实施例中,所选DNA区域是中性整合位点。在其它实施例中,烟酰胺核苷酸转氢酶DNA区段被设计成从宿主生物体的DNA去除原生烟酰胺核苷酸转氢酶基因(例如去除原生转氢酶活性)。

在一些实施例中,本发明方法中所用的烟酰胺核苷酸转氢酶基因可以使用所属领域中已知的任何酶促或化学合成方法分段合成为寡核苷酸。寡核苷酸可以在固体载体上合成,所述固体载体如可控微孔玻璃(CPG)、聚苯乙烯珠粒或由可以含有CPG的热塑性聚合物组成的膜。寡核苷酸还能够在阵列上、在并行的微米尺度上使用微流体或提供两者组合的已知技术合成。

在阵列上或通过微流体的合成优于常规固体载体合成之处在于通过减少试剂的使用降低了成本。基因合成所需的规模低,因此由阵列或通过微流体合成的寡核苷酸产物的规模是可接受的。然而,所合成的寡核苷酸的品质低于使用固体载体合成时。

自从二十世纪八十年代首次描述了传统的四步亚磷酰胺化学方法以来,其已经取得了大量的进步(参见例如丝兹查勒等人,《美国化学学会杂志》,125,13427-13441(2003),其使用过氧基阴离子脱除保护基;早川等人,美国专利第6,040,439号,其是关于替代保护基;阿杂叶维等人,《四面体》57,4977-4986(2001),其是关于通用载体;考兹洛维等人,《核苷、核苷酸和核酸》,24(5-7),1037-1041(2005),其是关于通过使用大孔隙CPG改良较长寡核苷酸的合成;以及丹哈等人,《核酸研究》,18,3813-3821(1990),其是关于改良衍生化)。

不论合成的类型如何,所得寡核苷酸接着都可以形成较小的建构嵌段用于较长的多核苷酸(即,烟酰胺核苷酸转氢酶基因)。在一些实施例中,较小的寡核苷酸可以使用所属领域中已知的方案连接在一起,如聚合酶链装配(PCA)、连接酶链反应(LCR)和热力学平衡由内而外合成法(TBIO)。

另一种合成较大双链DNA片段的方法是通过顶链PCR(TSP)组合较小寡核苷酸。在TSP的一种方法中,将要组合形成所期望全长产物的较小寡核苷酸集合具有40-200个之间的碱基长度并且彼此重叠至少约15-20个碱基。就实用目的来说,重叠区域的最小长度应该足以确保寡核苷酸的特异性粘接并且具有足够高的解链温度(Tm),以便在所用反应温度下粘接。重叠可以延伸到既定寡核苷酸被相邻寡核苷酸完全叠覆的点。重叠的量似乎对最终产物的品质无任何影响。装配体中的第一个和最后一个寡核苷酸建构嵌段应该含有正向和反向扩增引物的结合位点。在一个实施例中,第一个和最后一个寡核苷酸的末端序列含有互补的相同序列以允许使用通用引物。

调节丙酮酸羧化酶

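Pyruvate carboxylase can be expressed in the host cell from an expression vector containing a nucleic acid fragment comprising a nucleotide sequence encoding pyruvate carboxylase. Alternatively, the nucleic acid fragment comprising the nucleotide sequence encoding pyruvate carboxylase can be integrated into the host chromosome. Nucleic acid sequences, whether heterologous or endogenous to the host cell, can be introduced into the bacterial chromosome using, for example, homologous recombination. First, the gene of interest and a gene encoding a drug resistance marker are inserted into a plasmid containing a piece of DNA homologous to the chromosomal region into which the gene of interest is to be inserted. This recombinogenic DNA is then introduced into the bacteria, and clones are selected in which the DNA fragment containing the gene of interest and the drug resistance marker has recombined into the chromosome at the desired location. The gene and drug resistance marker can be introduced into the bacteria by transformation, either as a linearized piece of DNA prepared from any cloning vector or as part of a specialized recombinant insertion vector that cannot replicate in the bacterial host. In the case of linearized DNA, a recD- host can be used to increase the frequency with which the desired recombinants are obtained. Clones are then verified using PCR with primers that amplify DNA across the insertion region. PCR products from non-recombinant clones are smaller and contain only the chromosomal region where the insertion event was to take place, whereas PCR products from recombinant clones are larger and contain that chromosomal region plus the inserted gene and drug resistance marker.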
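The PCR check described above reduces to comparing an observed band size with two expected sizes. The sketch below is a hedged illustration: the function names, the 10% size tolerance, and any lengths passed in are assumptions, not values from the disclosure.

```python
def expected_amplicons(region_bp, gene_bp, marker_bp):
    """Expected PCR product sizes for primers spanning the insertion site.

    region_bp : length of the chromosomal region between the primers
    gene_bp   : length of the inserted gene of interest
    marker_bp : length of the inserted drug resistance marker
    """
    return {
        "non_recombinant": region_bp,
        "recombinant": region_bp + gene_bp + marker_bp,
    }


def classify_clone(observed_bp, expected, tolerance=0.10):
    """Label a clone by which expected size the observed band matches.

    Returns "ambiguous" if the band matches neither or both sizes.
    """
    hits = [
        label
        for label, size in expected.items()
        if abs(observed_bp - size) <= tolerance * size
    ]
    return hits[0] if len(hits) == 1 else "ambiguous"
```

A band estimated from a gel carries some error, so the tolerance should reflect the resolution of the gel; clones labeled ambiguous would normally be resolved by sequencing the insertion site.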

在一个优选实施例中,宿主细胞、优选大肠杆菌、谷氨酸棒状杆菌、黄色短杆菌(B.flavum)或乳糖发酵短杆菌(B.lactofermentum)经包含丙酮酸羧化酶基因、优选从菜豆根瘤菌(R.etli)或荧光假单胞菌(P.fluorescens)分离的基因、更优选来自菜豆根瘤菌的pyc基因的核酸片段转化,使得基因在宿主细胞中转录和表达,以相对于可比较的野生型细胞增加草酰乙酸的产生,因此增加所关注的下游代谢物的产生。

本公开的代谢工程细胞过表达丙酮酸羧化酶。换句话说,代谢工程细胞以高于可比较的野生型细胞中表达的丙酮酸羧化酶水平的水平表达丙酮酸羧化酶。此比较可以通过所属领域的技术人员以许多方式进行,并且在可比较的生长条件下进行。举例来说,丙酮酸羧化酶活性可以使用派恩(Payne)和莫里斯(Morris)的方法(《普通微生物学杂志(J.Gen.Microbiol.)》,59,97-101(1969))定量和比较。在此分析中过表达丙酮酸羧化酶的代谢工程细胞将产生比野生型细胞更大的活性。另外或可替代地,可以通过以下来定量和比较丙酮酸羧化酶的量:从细胞制备蛋白质提取物;使其进行SDS-PAGE;将其转移至蛋白质印迹法,随后使用检测试剂盒检测生物素化丙酮酸羧化酶蛋白质,所述试剂盒可购自例如皮尔斯化学公司(Pierce Chemical Company)(伊利诺斯州罗克福德(Rockford,Ill.))、西格玛化学公司(Sigma Chemical Company)(密苏里州圣路易斯(St.Louis,Mo.))或宝灵曼(Boehringer Mannheim)(印第安纳州印第安纳波利斯(Indianapolis,Ind.)),其用于观测蛋白质印迹上的生物素化蛋白质。在一些合适宿主细胞中,非工程化的野生型细胞中的丙酮酸羧化酶表达可以低于可检测水平。

包含gdh、asd、dapB和/或ddh基因的文库的产生

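In some embodiments, the present disclosure teaches inserting and/or replacing and/or deleting a DNA segment of the host organism comprising a gdh, asd, dapB, and/or ddh gene. In some aspects, the methods taught herein involve building an oligonucleotide of interest (i.e., a gdh, asd, dapB, and/or ddh segment) that can be incorporated into the genome of the host organism. In some embodiments, the gdh, asd, dapB, and/or ddh DNA segments of the present disclosure can be obtained by any method known in the art, including copying or cutting from a known template, mutation, or DNA synthesis. In some embodiments, the present disclosure is compatible with commercially available gene synthesis products for producing DNA sequences (e.g., GeneArt™, GeneMaker™, GenScript™, Anagen™, Blue Heron™, Entelechon™, GeNOsys, Inc., or Qiagen™).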

在一些实施例中,gdh、asd、dapB和/或ddh DNA区段被设计成将一或多个葡萄糖gdh、asd、dapB和/或ddh DNA区段并入宿主生物体的所选DNA区域中(例如添加一或多种有用谷氨酸脱氢酶、天冬氨酸半醛脱氢酶、二氢吡啶甲酸还原酶和/或内消旋-二氨基庚二酸脱氢酶活性)。在某些实施例中,所选DNA区域是中性整合位点。在其它实施例中,gdh、asd、dapB和/或ddh DNA区段被设计成从宿主生物体的DNA去除一或多种原生gdh、asd、dapB和/或ddh基因(例如去除一或多种原生谷氨酸脱氢酶、天冬氨酸半醛脱氢酶、二氢吡啶甲酸还原酶和/或内消旋-二氨基庚二酸脱氢酶活性)。

在一些实施例中,本发明方法中所用的gdh、asd、dapB和/或ddh基因可以使用所属领域中已知的任何酶促或化学合成方法分段合成为寡核苷酸。寡核苷酸可以在固体载体上合成,所述固体载体如可控微孔玻璃(CPG)、聚苯乙烯珠粒或由可以含有CPG的热塑性聚合物组成的膜。寡核苷酸还能够在阵列上、在并行的微米尺度上使用微流体或提供两者组合的已知技术合成。

在阵列上或通过微流体的合成优于常规固体载体合成之处在于通过减少试剂的使用降低了成本。基因合成所需的规模低,因此由阵列或通过微流体合成的寡核苷酸产物的规模是可接受的。然而,所合成的寡核苷酸的品质低于使用固体载体合成时。

自从二十世纪八十年代首次描述了传统的四步亚磷酰胺化学方法以来,其已经取得了大量的进步(参见例如丝兹查勒等人,《美国化学学会杂志》,125,13427-13441(2003),其使用过氧基阴离子脱除保护基;早川等人,美国专利第6,040,439号,其是关于替代保护基;阿杂叶维等人,《四面体》57,4977-4986(2001),其是关于通用载体;考兹洛维等人,《核苷、核苷酸和核酸》,24(5-7),1037-1041(2005),其是关于通过使用大孔隙CPG改良较长寡核苷酸的合成;以及丹哈等人,《核酸研究》,18,3813-3821(1990),其是关于改良衍生化)。

不论合成的类型如何,所得寡核苷酸接着都可以形成较小的建构嵌段用于较长的多核苷酸(即,gdh、asd、dapB和/或ddh基因)。在一些实施例中,较小的寡核苷酸可以使用所属领域中已知的方案连接在一起,如聚合酶链装配(PCA)、连接酶链反应(LCR)和热力学平衡由内而外合成法(TBIO)。

另一种合成较大双链DNA片段的方法是通过顶链PCR(TSP)组合较小寡核苷酸。在TSP的一种方法中,将要组合形成所期望全长产物的较小寡核苷酸集合具有40-200个之间的碱基长度并且彼此重叠至少约15-20个碱基。就实用目的来说,重叠区域的最小长度应该足以确保寡核苷酸的特异性粘接并且具有足够高的解链温度(Tm),以便在所用反应温度下粘接。重叠可以延伸到既定寡核苷酸被相邻寡核苷酸完全叠覆的点。重叠的量似乎对最终产物的品质无任何影响。装配体中的第一个和最后一个寡核苷酸建构嵌段应该含有正向和反向扩增引物的结合位点。在一个实施例中,第一个和最后一个寡核苷酸的末端序列含有互补的相同序列以允许使用通用引物。

包含苏氨酸醛缩酶(TA)基因的文库的产生

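In some embodiments, the present disclosure teaches inserting and/or replacing and/or deleting a DNA segment of the host organism comprising a TA gene. In some aspects, the methods taught herein involve building an oligonucleotide of interest (i.e., a TA segment) that can be incorporated into the genome of the host organism. In some embodiments, the TA DNA segments of the present disclosure can be obtained by any method known in the art, including copying or cutting from a known template, mutation, or DNA synthesis. In some embodiments, the present disclosure is compatible with commercially available gene synthesis products for producing DNA sequences (e.g., GeneArt™, GeneMaker™, GenScript™, Anagen™, Blue Heron™, Entelechon™, GeNOsys, Inc., or Qiagen™).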

在一些实施例中,TA DNA区段被设计成将一或多个TA DNA区段并入宿主生物体的所选DNA区域中(例如添加有用苏氨酸醛缩酶活性)。在某些实施例中,所选DNA区域是中性整合位点。在其它实施例中,TA DNA区段被设计成从宿主生物体的DNA去除一或多个原生TA基因(例如去除一或多个具有苏氨酸醛缩酶活性的基因)。

在一些实施例中,本发明方法中所用的TA基因可以使用所属领域中已知的任何酶促或化学合成方法分段合成为寡核苷酸。寡核苷酸可以在固体载体上合成,所述固体载体如可控微孔玻璃(CPG)、聚苯乙烯珠粒或由可以含有CPG的热塑性聚合物组成的膜。寡核苷酸还能够在阵列上、在并行的微米尺度上使用微流体或提供两者组合的已知技术合成。

在阵列上或通过微流体的合成优于常规固体载体合成之处在于通过减少试剂的使用降低了成本。基因合成所需的规模低,因此由阵列或通过微流体合成的寡核苷酸产物的规模是可接受的。然而,所合成的寡核苷酸的品质低于使用固体载体合成时。

自从二十世纪八十年代首次描述了传统的四步亚磷酰胺化学方法以来,其已经取得了大量的进步(参见例如丝兹查勒等人,《美国化学学会杂志》,125,13427-13441(2003),其使用过氧基阴离子脱除保护基;早川等人,美国专利第6,040,439号,其是关于替代保护基;阿杂叶维等人,《四面体》57,4977-4986(2001),其是关于通用载体;考兹洛维等人,《核苷、核苷酸和核酸》,24(5-7),1037-1041(2005),其是关于通过使用大孔隙CPG改良较长寡核苷酸的合成;以及丹哈等人,《核酸研究》,18,3813-3821(1990),其是关于改良衍生化)。

不论合成的类型如何,所得寡核苷酸接着都可以形成较小的建构嵌段用于较长的多核苷酸(即,TA基因)。在一些实施例中,较小的寡核苷酸可以使用所属领域中已知的方案连接在一起,如聚合酶链装配(PCA)、连接酶链反应(LCR)和热力学平衡由内而外合成法(TBIO)。

另一种合成较大双链DNA片段的方法是通过顶链PCR(TSP)组合较小寡核苷酸。在TSP的一种方法中,将要组合形成所期望全长产物的较小寡核苷酸集合具有40-200个之间的碱基长度并且彼此重叠至少约15-20个碱基。就实用目的来说,重叠区域的最小长度应该足以确保寡核苷酸的特异性粘接并且具有足够高的解链温度(Tm),以便在所用反应温度下粘接。重叠可以延伸到既定寡核苷酸被相邻寡核苷酸完全叠覆的点。重叠的量似乎对最终产物的品质无任何影响。装配体中的第一个和最后一个寡核苷酸建构嵌段应该含有正向和反向扩增引物的结合位点。在一个实施例中,第一个和最后一个寡核苷酸的末端序列含有互补的相同序列以允许使用通用引物。

包含pyc基因的文库的产生

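In some embodiments, the present disclosure teaches inserting and/or replacing and/or deleting a DNA segment of the host organism comprising a pyc gene. In some aspects, the methods taught herein involve building an oligonucleotide of interest (i.e., a pyc segment) that can be incorporated into the genome of the host organism. In some embodiments, the pyc DNA segments of the present disclosure can be obtained by any method known in the art, including copying or cutting from a known template, mutation, or DNA synthesis. In some embodiments, the present disclosure is compatible with commercially available gene synthesis products for producing DNA sequences (e.g., GeneArt™, GeneMaker™, GenScript™, Anagen™, Blue Heron™, Entelechon™, GeNOsys, Inc., or Qiagen™).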

在一些实施例中,pyc DNA区段被设计成将一或多个葡萄糖pyc DNA区段并入宿主生物体的所选DNA区域中(例如添加具有丙酮酸羧化酶活性的一或多个有用基因)。在某些实施例中,所选DNA区域是中性整合位点。在其它实施例中,pyc DNA区段被设计成从宿主生物体的DNA去除一或多个原生pyc基因(例如去除一或多个具有丙酮酸羧化酶活性的原生基因)。

在一些实施例中,本发明方法中所用的pyc基因可以使用所属领域中已知的任何酶促或化学合成方法分段合成为寡核苷酸。寡核苷酸可以在固体载体上合成,所述固体载体如可控微孔玻璃(CPG)、聚苯乙烯珠粒或由可以含有CPG的热塑性聚合物组成的膜。寡核苷酸还能够在阵列上、在并行的微米尺度上使用微流体或提供两者组合的已知技术合成。

在阵列上或通过微流体的合成优于常规固体载体合成之处在于通过减少试剂的使用降低了成本。基因合成所需的规模低,因此由阵列或通过微流体合成的寡核苷酸产物的规模是可接受的。然而,所合成的寡核苷酸的品质低于使用固体载体合成时。

自从二十世纪八十年代首次描述了传统的四步亚磷酰胺化学方法以来,其已经取得了大量的进步(参见例如丝兹查勒等人,《美国化学学会杂志》,125,13427-13441(2003),其使用过氧基阴离子脱除保护基;早川等人,美国专利第6,040,439号,其是关于替代保护基;阿杂叶维等人,《四面体》57,4977-4986(2001),其是关于通用载体;考兹洛维等人,《核苷、核苷酸和核酸》,24(5-7),1037-1041(2005),其是关于通过使用大孔隙CPG改良较长寡核苷酸的合成;以及丹哈等人,《核酸研究》,18,3813-3821(1990),其是关于改良衍生化)。

不论合成的类型如何,所得寡核苷酸接着都可以形成较小的建构嵌段用于较长的多核苷酸(即,pyc基因)。在一些实施例中,较小的寡核苷酸可以使用所属领域中已知的方案连接在一起,如聚合酶链装配(PCA)、连接酶链反应(LCR)和热力学平衡由内而外合成法(TBIO)。

另一种合成较大双链DNA片段的方法是通过顶链PCR(TSP)组合较小寡核苷酸。在TSP的一种方法中,将要组合形成所期望全长产物的较小寡核苷酸集合具有40-200个之间的碱基长度并且彼此重叠至少约15-20个碱基。就实用目的来说,重叠区域的最小长度应该足以确保寡核苷酸的特异性粘接并且具有足够高的解链温度(Tm),以便在所用反应温度下粘接。重叠可以延伸到既定寡核苷酸被相邻寡核苷酸完全叠覆的点。重叠的量似乎对最终产物的品质无任何影响。装配体中的第一个和最后一个寡核苷酸建构嵌段应该含有正向和反向扩增引物的结合位点。在一个实施例中,第一个和最后一个寡核苷酸的末端序列含有互补的相同序列以允许使用通用引物。

装配/克隆质粒

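In some embodiments, the present disclosure teaches methods for constructing vectors capable of inserting desired gapA gene and/or nicotinamide nucleotide transhydrogenase and/or gdh, asd, dapB, and/or ddh gene and/or TA gene and/or pyc gene DNA segments into the genome of the host organism. In some embodiments, the present disclosure teaches methods of cloning vectors comprising the insert DNA (e.g., a gapA gene and/or nicotinamide nucleotide transhydrogenase and/or gdh, asd, dapB, and/or ddh gene and/or TA gene and/or pyc gene), homology arms, and at least one selection marker (see FIG. 6).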

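In some embodiments, the present disclosure is compatible with any vector suited for transformation into the host organism. In some embodiments, the present disclosure teaches the use of shuttle vectors compatible with a host cell. In one embodiment, the shuttle vector used in the methods provided herein is compatible with E. coli and/or Corynebacterium host cells. The shuttle vector can comprise a marker for selection and/or counter-selection as described herein. The marker can be any marker known in the art and/or provided herein. The shuttle vector can further comprise any regulatory sequences and/or sequences useful in the assembly of the shuttle vector, as known in the art. The shuttle vector can further comprise any origins of replication needed for propagation in a host cell as provided herein, such as E. coli or C. glutamicum. The regulatory sequence can be any regulatory sequence known in the art or provided herein, such as a promoter, start, stop, signal, secretion, and/or termination sequence used by the genetic machinery of the host cell. In certain instances, the target DNA can be inserted into vectors, constructs, or plasmids obtainable from any repository or catalogue product, such as a commercial vector (see, e.g., DNA2.0 custom vectors or [vendor name shown only as an image in the source] vectors).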

In some embodiments, the assembly/cloning methods of the present disclosure can employ at least one of the following assembly strategies: i) type II conventional cloning; ii) type IIS-mediated or "Golden Gate" cloning (see, e.g., Engler, C., R. Kandzia, and S. Marillonnet, 2008, "A one pot, one step, precision cloning method with high throughput capability," PLoS One 3:e3647; Kotera, I., and T. Nagai, 2008, "A high-throughput and single-tube recombination of crude PCR products using a DNA polymerase inhibitor and type IIS restriction enzyme," J Biotechnol 137:1-7; Weber, E., R. Gruetzner, S. Werner, C. Engler, and S. Marillonnet, 2011, "Assembly of Designer TAL Effectors by Golden Gate Cloning," PLoS One 6:e19722); iii) [system name shown only as an image in the source] recombination; iv) cloning, exonuclease-mediated assembly (Aslanidis and de Jong, 1990, "Ligation-independent cloning of PCR products (LIC-PCR)," Nucleic Acids Research, Vol. 18, No. 20, 6069); v) homologous recombination; vi) non-homologous end joining; or a combination thereof. Modular type IIS-based assembly strategies are disclosed in PCT Publication WO 2011/154147, the disclosure of which is incorporated herein by reference.

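In some embodiments, the present disclosure teaches cloning vectors with at least one selection marker. Various selection marker genes are known in the art; they typically encode an antibiotic resistance function for selection under selective pressure in prokaryotic cells (e.g., against ampicillin, kanamycin, tetracycline, chloramphenicol, zeocin, spectinomycin/streptomycin) or eukaryotic cells (e.g., geneticin, neomycin, hygromycin, puromycin, blasticidin, zeocin). Other marker systems allow screening and identification of wanted or unwanted cells, such as the well-known blue/white screening system used in bacteria to select positive clones in the presence of X-gal, or fluorescent reporters such as green or red fluorescent proteins expressed in successfully transduced host cells. Another class of selection markers, most of which are functional only in prokaryotic systems, comprises counter-selectable marker genes, often also referred to as "death genes", which express toxic gene products that kill the producer cells. Examples of such genes include sacB, rpsL (strA), tetAR, pheS, thyA, gata-1, or ccdB, the functions of which are described in Reyrat et al., 1998, "Counterselectable Markers: Untapped Tools for Bacterial Genetics and Pathogenesis," Infect Immun. 66(9):4011-4017.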

在一些实施例中,其中克隆目标DNA区段的载体包含启动子。启动子多核苷酸可用于在宿主微生物中过表达或低表达gapA和/或烟酰胺核苷酸转氢酶和/或gdh、asd、dapB和/或ddh和/或TA和/或pyc。

在一些实施例中,所产生的包含异源gapA基因和/或烟酰胺核苷酸转氢酶基因和/或gdh、asd、dapB和ddh基因中的一或多种和/或TA基因的每个菌株进行培养并根据本公开的一或多个准则(例如所关注的生物分子或产物的生产率)分析。来自所分析的每个宿主菌株的数据与具体的gapA基因或烟酰胺核苷酸转氢酶基因或gdh、asd、dapB和/或ddh基因和/或TA基因和/或pyc基因或gapA/烟酰胺核苷酸转氢酶/gdh、asd、dapB和/或ddh基因/TA/pyc组合关联/相关,并记录下来供将来使用。因此,本公开能够产生大且高度注释的基因多样性文库/保藏处,其鉴别gapA基因或烟酰胺核苷酸转氢酶基因或gdh、asd、dapB和/或ddh基因和/或TA基因和/或pyc基因或者gapA或烟酰胺核苷酸转氢酶基因或gdh、asd、dapB和/或ddh/TA/pyc基因组合对所关注的许多基因或表型特性的作用。

在一些实施例中,多样性池内的菌株是参照“参考菌株”测定的。在一些实施例中,参考菌株是野生型菌株。在其它实施例中,参考菌株是经历任何基因组工程化之前的原始工业菌株。参考菌株可以由从业者定义并且不一定是原始野生型菌株或原始工业菌株。基础菌株仅仅代表被视为“基础”、“参考”或原始基因背景的菌株,与由所述参考菌株衍生或开发的后续菌株与之进行比较。

值得留意的构思是亲代菌株与参考菌株之间的差异。亲代菌株是用于当前一轮基因组工程化的背景。参考菌株是在每个平板中用于促进比较,尤其是平板之间的比较的对照菌株,且典型地是如上文所提及的“基础菌株”。但是由于所述基础菌株(例如用于对总体性能进行基准测试的野生型或工业菌株)在所指定一轮的菌株改良中是诱变目标的意义上不一定是“基础”,因此更具描述性的术语是“参考菌株”。

总之,基础/参考菌株通常是用于对所建构菌株的性能进行基准测试,而亲代菌株是用于对相关基因背景下的特定基因变化的性能进行基准测试。

在一些实施例中,本公开教示了载体的用途,其用于在起始和/或终止密码子变体下克隆gapA基因和/或烟酰胺核苷酸转氢酶和/或gdh、asd、dapB和/或ddh基因和/或TA基因和/或pyc基因,使得所克隆的基因利用起始和/或终止密码子变体。举例来说,酿酒酵母和哺乳动物的典型终止密码子分别是UAA和UGA。单子叶植物的典型终止密码子是UGA,而昆虫和大肠杆菌通常使用UAA作为终止密码子(达尔芬(Dalphin)等人(1996),核酸研究(Nucl.Acids Res.)24:216-218)。

密码子优化

在一个实施例中,所提供的公开的方法包含对宿主生物体所表达的一或多种基因进行密码子优化。用于优化密码子以改善各种宿主中的表达的方法在所属领域中已知且描述于文献(参见美国专利申请公开第2007/0292918号,所述申请以全文引用的方式并入本文中)中。可以制备含有具体原核生物或真核生物宿主优选的密码子的优化编码序列(也参见莫雷(Murray)等人(1989),《核酸研究(Nucl.Acids Res.)》17:477-508),以例如提高翻译速率或产生具有期望特性的重组RNA转录物,如半衰期比由非优化序列产生的转录物长。

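In some embodiments, the gapA/nicotinamide nucleotide transhydrogenase/gdh, asd, dapB, and/or ddh/TA/pyc genes or polynucleotides provided herein comprise codons optimized for translation in a host cell provided herein, such as E. coli and/or C. glutamicum. The gene or polynucleotide can be an isolated, synthetic, or recombinant nucleic acid. The codon-optimized gapA/nicotinamide nucleotide transhydrogenase/gdh, asd, dapB, and/or ddh/TA/pyc gene or polynucleotide can be selected from SEQ ID NOs: 1-50, 67-74, 79-231, and 232. The codon-optimized genes or polynucleotides provided herein can be generated using methods known in the art for producing codon-optimized polynucleotides, such as GenScript's OptimumGene™ gene design system or DNA2.0's [product name shown only as an image in the source] expression optimization technology.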

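Protein expression is governed by a host of factors, including those that affect transcription, mRNA processing, and the stability and initiation of translation. Optimization can thus address any of a number of sequence features of any particular gene. As a specific example, a rare codon-induced translational pause can result in reduced protein expression. Such a pause arises when the polynucleotide of interest contains codons that are rarely used in the host organism; because the corresponding tRNAs are scarce in the available tRNA pool, these codons can negatively affect protein translation.

Alternate translational initiation can also result in reduced heterologous protein expression. Alternate translational initiation can involve a synthetic polynucleotide sequence that inadvertently contains motifs capable of functioning as a ribosome binding site (RBS). These sites can initiate translation of a truncated protein from a site internal to the gene. One method of reducing the possibility of producing a truncated protein, which can be difficult to remove during purification, includes eliminating putative internal RBS sequences from the optimized polynucleotide sequence.

Repeat-induced polymerase slippage can result in reduced heterologous protein expression. Repeat-induced polymerase slippage involves nucleotide sequence repeats that have been shown to cause slippage or stuttering of DNA polymerase, which can result in frameshift mutations. Such repeats can also cause slippage of RNA polymerase. In an organism with a high G+C content bias, there can be a higher degree of repeats composed of G or C nucleotides. Therefore, one method of reducing the possibility of inducing RNA polymerase slippage includes altering extended repeats of G or C nucleotides.

Interfering secondary structures can also result in reduced heterologous protein expression. Secondary structures can sequester the RBS sequence or initiation codon and have been correlated with a reduction in protein expression. Stem-loop structures can also be involved in transcriptional pausing and attenuation. An optimized polynucleotide sequence can contain minimal secondary structures in the RBS and gene coding regions of the nucleotide sequence to allow for improved transcription and translation.

For example, the optimization process can begin by identifying the desired amino acid sequence to be expressed by the host. From the amino acid sequence, a candidate polynucleotide or DNA sequence can be designed. During the design of the synthetic DNA sequence, the frequency of codon usage can be compared to the codon usage of the host expression organism, and rare host codons can be removed from the synthetic sequence. Additionally, the synthetic candidate DNA sequence can be modified to remove undesirable restriction enzyme sites and to add or remove any desired signal sequences, linkers or untranslated regions. The synthetic DNA sequence can be analyzed for the presence of secondary structures that may interfere with the translation process, such as G/C repeats and stem-loop structures.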
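Two of the checks described above, flagging rare host codons and flagging extended G or C repeats, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the codon-usage values in `HOST_USAGE`, the 0.1 rarity threshold and the minimum run length of 6 are illustrative placeholders, not values from the disclosure or from any real host codon table.

```python
import re

# Hypothetical relative codon-usage frequencies for an imaginary host.
HOST_USAGE = {"ATG": 1.00, "AGG": 0.03, "CTG": 0.50, "AAA": 0.60, "TAA": 0.60}


def split_codons(cds):
    """Split a coding sequence into consecutive codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_positions(cds, usage, threshold):
    """Return 1-based codon indices whose host usage is below the threshold.
    Codons missing from the usage table are treated as rare."""
    return [
        index + 1
        for index, codon in enumerate(split_codons(cds.upper()))
        if usage.get(codon, 0.0) < threshold
    ]


def gc_runs(cds, min_len=6):
    """Return (0-based start, run) for each run of G or of C of min_len or more."""
    pattern = "G{%d,}|C{%d,}" % (min_len, min_len)
    return [(m.start(), m.group()) for m in re.finditer(pattern, cds.upper())]


rare = rare_codon_positions("ATGAGGCTGAAATAA", HOST_USAGE, threshold=0.1)
runs = gc_runs("ATGGGGGGGCTG")
```

A fuller optimizer would go on to replace each flagged codon with a synonymous, more frequently used one and would re-check the sequence for internal RBS motifs and secondary structure; those steps are omitted here.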

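Transformation of Host Cells

In some embodiments, the vectors of the present disclosure may be introduced into host cells using any of a variety of techniques, including transformation, transfection, transduction, viral infection, gene guns, or Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, DEAE-dextran mediated transfection, lipofection, or electroporation (Davis, L., Dibner, M., Battey, I., 1986, "Basic Methods in Molecular Biology"). Other methods of transformation include, for example, lithium acetate transformation and electroporation. See, e.g., Gietz et al., Nucleic Acids Res. 27:69-74 (1992); Ito et al., J. Bacteriol. 153:163-168 (1983); and Becker and Guarente, Methods in Enzymology 194:182-187 (1991). In some embodiments, transformed host cells are referred to as recombinant host strains.

In some embodiments, the present disclosure teaches high-throughput transformation of cells using 96-well plate robotics platforms and liquid handling machines known in the art.

In some embodiments, the present disclosure teaches screening transformed cells with one or more selection markers. In one such embodiment, cells transformed with a vector comprising a kanamycin resistance marker (KanR) are plated on media containing an effective amount of the kanamycin antibiotic. Colony forming units visible on kanamycin-laced media are presumed to have incorporated the vector cassette into their genome. Insertion of the desired sequences can be confirmed via PCR, restriction enzyme analysis, and/or sequencing of the relevant insertion site.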

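Looping Out of Selected Sequences

In some embodiments, the present disclosure teaches methods of looping out selected regions of DNA from the host organisms. The looping out method can be as described in Nakashima et al., 2014, "Bacterial Cellular Engineering by Genome Editing and Gene Silencing," Int. J. Mol. Sci. 15(2), 2773-2793. In some embodiments, the present disclosure teaches looping out selection markers from positive transformants. Looping out deletion techniques are known in the art and are described in Tear et al., 2014, "Excision of Unstable Artificial Gene-Specific Inverted Repeats Mediates Scar-Free Gene Deletions in Escherichia coli," Appl. Biochem. Biotech. 175:1858-1867. The looping out methods used in the methods provided herein can be performed using single-crossover homologous recombination or double-crossover homologous recombination. In certain embodiments, looping out of selected regions as described herein can entail using single-crossover homologous recombination as described herein.

First, loop-out vectors are inserted into selected target regions within the genome of the host organism (e.g., via homologous recombination, CRISPR, or other gene editing techniques). In one embodiment, single-crossover homologous recombination is used between a circular plasmid or vector and the host cell genome in order to loop in the circular plasmid or vector, as depicted in FIG. 6. The inserted vector can be designed with a sequence that is a direct repeat of an existing or introduced nearby host sequence, such that the direct repeats flank the region of DNA slated for looping out and deletion. Once inserted, cells containing the loop-out plasmid or vector can be counter-selected for deletion of the selection region (see, e.g., FIG. 7; lack of resistance to the selection gene).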

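Host Microorganisms

The genome engineering methods provided herein are exemplified with industrial microbial cell cultures, but can be applied to any organism in which desired traits can be identified in a population of genetic mutants.

Thus, as used herein, the term "microorganism" should be taken broadly. It includes, but is not limited to, the two prokaryotic domains, Bacteria and Archaea, as well as certain eukaryotic fungi and protists. However, in certain aspects, "higher" eukaryotic organisms such as insects, plants, and animals can be utilized in the methods taught herein.

Suitable host cells include, but are not limited to: bacterial cells, algal cells, plant cells, fungal cells, insect cells, and mammalian cells. In one illustrative embodiment, suitable host cells include E. coli (e.g., SHuffle™ competent E. coli available from New England BioLabs, Ipswich, Mass.).

Other suitable host organisms of the present disclosure include microorganisms of the genus Corynebacterium. In some embodiments, preferred Corynebacterium strains/species include: C. efficiens, with the deposited type strain being DSM44549; C. glutamicum, with the deposited type strain being ATCC13032; and C. ammoniagenes, with the deposited type strain being ATCC6871. In some embodiments, the preferred host of the present disclosure is C. glutamicum. In some embodiments, the present disclosure teaches host cells of the genus Shigella, including Shigella flexneri, Shigella dysenteriae, Shigella boydii, and Shigella sonnei.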

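Suitable host strains of the genus Corynebacterium, in particular of the species Corynebacterium glutamicum, are in particular the known wild-type strains: Corynebacterium glutamicum ATCC13032, Corynebacterium acetoglutamicum ATCC15806, Corynebacterium acetoacidophilum ATCC13870, Corynebacterium melassecola ATCC17965, Corynebacterium thermoaminogenes FERM BP-1539, Brevibacterium flavum ATCC14067, Brevibacterium lactofermentum ATCC13869, and Brevibacterium divaricatum ATCC14020; and L-amino acid-producing mutants or strains prepared therefrom, such as, for example, the L-lysine-producing strains: Corynebacterium glutamicum FERM-P 1709, Brevibacterium flavum FERM-P 1708, Brevibacterium lactofermentum FERM-P 1712, Corynebacterium glutamicum FERM-P 6463, Corynebacterium glutamicum FERM-P 6464, Corynebacterium glutamicum DM58-1, Corynebacterium glutamicum DG52-5, Corynebacterium glutamicum DSM5714, and Corynebacterium glutamicum DSM12866.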

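The term "Micrococcus glutamicus" has also been in use for C. glutamicum. Some representatives of the species C. efficiens have also been referred to as C. thermoaminogenes in the prior art, for example the strain FERM BP-1539.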

在一些实施例中,本公开的宿主细胞是真核细胞。适合的真核生物宿主细胞包括(但不限于):真菌细胞、藻类细胞、昆虫细胞、动物细胞和植物细胞。适合的真菌宿主细胞包括(但不限于):子囊菌门(Ascomycota)、担子菌门(Basidiomycota)、半知菌门(Deuteromycota)、接合菌门(Zygomycota)、不完全菌类(Fungi imperfecti)。某些优选的真菌宿主细胞包括酵母细胞和丝状真菌细胞。适合的丝状真菌宿主细胞包括例如真菌门(Eumycotina)和卵菌门(Oomycota)亚门的任何丝状形式。(参见例如霍克索斯(Hawksworth)等人,安·贝氏真菌词典(Ainsworth and Bisby's Dictionary of TheFungi),第8版,1995年,CAB国际,大学出版社,英国剑桥(CAB International,UniversityPress,Cambridge,UK),其以引用的方式并入本文中)。丝状真菌的特征是营养菌丝体,其细胞壁由甲壳素、纤维素和其它复杂多糖组成。丝状真菌宿主细胞在形态上不同于酵母。

在某些说明性但非限制性的实施例中,丝状真菌宿主细胞可以是以下物种的细胞:棉霉属(Achlya)、枝顶孢属(Acremonium)、曲霉属(Aspergillus)、短梗霉属(Aureobasidium)、烟管霉属(Bjerkandera)、拟蜡菌属(Ceriporiopsis)、头孢霉属(Cephalosporium)、金孢霉属(Chrysosporium)、旋孢腔菌属(Cochliobolus)、棒囊壳属(Corynascus)、隐丛赤壳属(Cryphonectria)、隐球菌属(Cryptococcus)、鬼伞属(Coprinus)、革盖菌属(Coriolus)、色二孢属(Diplodia)、内斯菌属(Endothis)、镰孢菌属(Fusarium)、赤霉属(Gibberella)、胶霉属(Gliocladium)、腐殖菌属(Humicola)、肉座菌属(Hypocrea)、毁丝霉属(Myceliophthora)(例如嗜热毁丝霉(Myceliophthorathermophila))、白霉菌属(Mucor)、脉孢菌属(Neurospora)、青霉属(Penicillium)、柄孢壳属(Podospora)、射脉菌属(Phlebia)、瘤胃壶菌属(Piromyces)、梨胞霉属(Pyricularia)、根毛霉属(Rhizomucor)、根霉菌属(Rhizopus)、裂殖菌属(Schizophyllum)、革节孢属(Scytalidium)、孢子丝菌属(Sporotrichum)、踝节菌属(Talaromyces)、嗜热子囊菌属(Thermoascus)、梭孢壳霉属(Thielavia)、栓菌属(Tramates)、弯颈霉菌属(Tolypocladium)、木霉属(Trichoderma)、轮枝孢属(Verticillium)、小包脚菇属(Volvariella),或其有性型或无性型,以及其同义词或分类同等物。

适合的酵母宿主细胞包括(但不限于):念珠菌属(Candida)、汉逊酵母属(Hansenula)、酵母菌属(Saccharomyces)、裂殖酵母属(Schizosaccharomyces)、毕赤酵母属(Pichia)、克鲁维酵母属(Kluyveromyces)和耶氏酵母属(Yarrowia)。在一些实施例中,酵母细胞是多形汉逊酵母(Hansenula polymorpha)、酿酒酵母(Saccharomycescerevisiae)、卡氏酵母(Saccaromyces carlsbergensis)、糖化酵母(Saccharomycesdiastaticus)、洛本酵母(Saccharomyces norbensis)、克鲁维酵母(Saccharomyceskluyveri)、粟酒裂殖酵母(Schizosaccharomyces pombe)、巴斯德毕赤酵母(Pichiapastoris)、芬兰毕赤酵母(Pichia finlandica)、嗜海藻糖毕赤酵母(Pichiatrehalophila)、考达毕赤酵母(Pichia kodamae)、膜醭毕赤酵母(Pichiamembranaefaciens)、幸运毕赤酵母(Pichia opuntiae)、耐热毕赤酵母(Pichiathermotolerans)、萨利毕赤酵母(Pichia salictaria)、松栎毕赤酵母(Pichiaquercuum)、皮吉毕赤酵母(Pichia pijperi)、树干毕赤酵母(Pichia stipitis)、嗜甲醇毕赤酵母(Pichia methanolica)、安格斯毕赤酵母(Pichia angusta)、乳酸克鲁维酵母(Kluyveromyces lactis)、白色念珠菌(Candida albicans)或解脂耶氏酵母(Yarrowialipolytica)。

在某些实施例中,宿主细胞是藻类,如衣藻属(Chlamydomonas)(例如莱茵衣藻(C.Reinhardtii))和席藻属(Phormidium)(席藻种ATCC29409)。

在其它实施例中,宿主细胞是原核细胞。适合的原核生物细胞包括革兰氏阳性、革兰氏阴性和革兰氏变异性细菌细胞。宿主细胞可以是(但不限于)以下物种:土壤杆菌属(Agrobacterium)、脂环杆菌属(Alicyclobacillus)、念珠藻属(Anabaena)、倒囊藻属(Anacystis)、不动杆菌属(Acinetobacter)、酸热菌属(Acidothermus)、节杆菌属(Arthrobacter)、固氮菌属(Azobacter)、芽孢杆菌属(Bacillus)、双歧杆菌属(Bifidobacterium)、短杆菌属(Brevibacterium)、丁酸弧菌属(Butyrivibrio)、布赫纳氏菌属(Buchnera)、平原菟丝子(Campestris)、弯曲杆菌属(Camplyobacter)、梭菌属(Clostridium)、棒状杆菌属、红色硫黃细菌属(Chromatium)、粪球菌属(Coprococcus)、埃希氏杆菌属(Escherichia)、肠球菌属(Enterococcus)、肠杆菌属(Enterobacter)、欧文菌属(Erwinia)、梭杆菌属(Fusobacterium)、粪栖杆菌属(Faecalibacterium)、弗朗西斯氏菌属(Francisella)、黄杆菌属(Flavobacterium)、土芽孢杆菌属(Geobacillus)、嗜血杆菌属(Haemophilus)、螺旋杆菌属(Helicobacter)、克雷伯氏菌属(Klebsiella)、乳杆菌属(Lactobacillus)、乳球菌属(Lactococcus)、泥杆菌属(Ilyobacter)、微球菌属(Micrococcus)、微杆菌属(Microbacterium)、中间根瘤菌属(Mesorhizobium)、甲基杆菌属(Methylobacterium)、甲基杆菌属、分枝杆菌属(Mycobacterium)、奈瑟菌属(Neisseria)、泛菌属(Pantoea)、假单胞菌属(Pseudomonas)、原绿球藻属(Prochlorococcus)、红细菌属(Rhodobacter)、红假单胞菌属(Rhodopseudomonas)、红假单胞菌属、罗斯氏菌属(Roseburia)、红螺菌属(Rhodospirillum)、红球菌属(Rhodococcus)、栅列藻属(Scenedesmus)、链霉菌属(Streptomyces)、链球菌属(Streptococcus)、聚球藻属(Synecoccus)、糖单孢菌属(Saccharomonospora)、葡萄球菌属(Staphylococcus)、沙雷氏菌属(Serratia)、沙门氏菌属(Salmonella)、志贺杆菌属(Shigella)、嗜热厌氧杆菌属(Thermoanaerobacterium)、养障体(Tropheryma)、土拉热(Tularensis)、蒂梅丘拉(Temecula)、嗜热聚球藻属(Thermosynechococcus)、热球菌属(Thermococcus)、脲原体属(Ureaplasma)、黄单胞菌属(Xanthomonas)、木杆菌属(Xylella)、耶尔森氏菌属(Yersinia)和发酵单胞菌属(Zymomonas)。在一些实施例中,宿主细胞是谷氨酸棒状杆菌。

In some embodiments, the host strain is a bacterial host strain. In some embodiments, the bacterial host strain is an industrial strain. Numerous bacterial industrial strains are known and are suitable for use in the methods and compositions described herein.

In some embodiments, the bacterial host cell is of an Agrobacterium species (e.g., A. radiobacter, A. rhizogenes, A. rubi), an Arthrobacter species (e.g., A. aurescens, A. citreus, A. globiformis, A. hydrocarboglutamicus, A. mysorens, A. nicotianae, A. paraffineus, A. protophormiae, A. roseoparaffinus, A. sulfureus, A. ureafaciens), or a Bacillus species (e.g., B. thuringiensis, B. anthracis, B. megaterium, B. subtilis, B. lentus, B. circulans, B. pumilus, B. lautus, B. coagulans, B. brevis, B. firmus, B. alkalophilus, B. licheniformis, B. clausii, B. stearothermophilus, B. halodurans, and B. amyloliquefaciens). In particular embodiments, the host cell is an industrial Bacillus strain including, but not limited to, B. subtilis, B. pumilus, B. licheniformis, B. megaterium, B. clausii, B. stearothermophilus, and B. amyloliquefaciens. In some embodiments, the host cell is an industrial Clostridium species (e.g., C. acetobutylicum, C. tetani E88, C. lituseburense, C. saccharobutylicum, C. perfringens, C. beijerinckii). In some embodiments, the host cell is an industrial Corynebacterium species (e.g., C. glutamicum, C. acetoacidophilum). In some embodiments, the host cell is an industrial Escherichia species (e.g., E. coli). In some embodiments, the host cell is an industrial Erwinia species (e.g., E. uredovora, E. carotovora, E. ananas, E. herbicola, E. punctata, E. terreus). In some embodiments, the host cell is an industrial Pantoea species (e.g., P. citrea, P. agglomerans). In some embodiments, the host cell is an industrial Pseudomonas species (e.g., P. putida, P. aeruginosa, P. mevalonii). In some embodiments, the host cell is an industrial Streptococcus species (e.g., S. equisimilis, S. pyogenes, S. uberis). In some embodiments, the host cell is an industrial Streptomyces species (e.g., S. ambofaciens, S. achromogenes, S. avermitilis, S. coelicolor, S. aureofaciens, S. aureus, S. fungicidicus, S. griseus, S. lividans). In some embodiments, the host cell is an industrial Zymomonas species (e.g., Z. mobilis, Z. lipolytica), and the like.

In various embodiments, strains that may be used in the practice of the disclosure, including both prokaryotic and eukaryotic strains, are readily accessible to the public from a number of culture collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and the Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL).

在一些实施例中,本公开的方法也适用于多细胞生物体。举例来说,所述平台可以用于改良农作物的性能。生物体可以包含多种植物,如禾本亚目(Gramineae)、非突亚科(Fetucoideae)、颇考亚科(Poacoideae)、剪股颖属(Agrostis)、梯牧草属(Phleum)、鸡脚茅属(Dactylis)、高粱属(Sorgum)、狗尾草属(Setaria)、玉蜀黍属(Zea)、稻属(Oryza)、小麦属(Triticum)、黑麦属(Secale)、燕麦属(Avena)、大麦属(Hordeum)、蔗属(Saccharum)、早熟禾属(Poa)、羊茅属(Festuca)、钝叶草属(Stenotaphrum)、狗牙根属(Cynodon)、薏苡属(Coix)、莪利竹族(Olyreae)、原禾族(Phareae)、菊科(Compositae)或豆科(Leguminosae)。举例来说,植物可以是玉米、稻米、大豆、棉花、小麦、黑麦、燕麦、大麦、豌豆、菜豆、小扁豆、花生、地瓜、豇豆、绒毛豆、三叶草、苜蓿、羽扇豆、野豌豆、莲藕、草木樨、紫藤、香豌豆、高粱、小米、葵花、芥花或其类似物。类似地,生物体可以包括多种动物,如非人类哺乳动物、鱼、昆虫等。

E. coli Host Cells

As mentioned above, E. coli host cells can be used in embodiments of the present disclosure.

For example, suitable host strains of the E. coli species comprise: Enterotoxigenic E. coli (ETEC), Enteropathogenic E. coli (EPEC), Enteroinvasive E. coli (EIEC), Enterohemorrhagic E. coli (EHEC), Uropathogenic E. coli (UPEC), Verotoxin-producing E. coli, E. coli O157:H7, E. coli O104:H4, E. coli O121, E. coli O104:H21, E. coli K1, and E. coli NC101. In some embodiments, the present disclosure teaches genomic engineering of E. coli K12, E. coli B, and E. coli C.

在一些实施例中,本公开教示了以下大肠杆菌菌株的基因组工程化:NCTC 12757、NCTC 12779、NCTC 12790、NCTC 12796、NCTC 12811、ATCC 11229、ATCC 25922、ATCC 8739、DSM 30083、BC 5849、BC 8265、BC 8267、BC 8268、BC 8270、BC 8271、BC 8272、BC 8273、BC8276、BC 8277、BC 8278、BC 8279、BC 8312、BC 8317、BC 8319、BC 8320、BC 8321、BC 8322、BC 8326、BC 8327、BC 8331、BC 8335、BC 8338、BC 8341、BC 8344、BC 8345、BC 8346、BC8347、BC 8348、BC 8863和BC 8864。

在一些实施例中,本公开教示了维罗毒素致病性大肠杆菌(VTEC),例如菌株BC4734(O26:H11)、BC 4735(O157:H-)、BC 4736、BC 4737(n.d.)、BC 4738(O157:H7)、BC 4945(O26:H-)、BC 4946(O157:H7)、BC 4947(O111:H-)、BC 4948(O157:H)、BC 4949(O5)、BC5579(O157:H7)、BC 5580(O157:H7)、BC 5582(O3:H)、BC 5643(O2:H5)、BC 5644(O128)、BC5645(O55:H-)、BC 5646(O69:H-)、BC 5647(O101:H9)、BC 5648(O103:H2)、BC 5850(O22:H8)、BC 5851(O55:H-)、BC 5852(O48:H21)、BC 5853(O26:H11)、BC 5854(O157:H7)、BC5855(O157:H-)、BC 5856(O26:H-)、BC 5857(O103:H2)、BC 5858(O26:H11)、BC 7832、BC7833(O原始形式:H-)、BC 7834(ONT:H-)、BC 7835(O103:H2)、BC 7836(O57:H-)、BC 7837(ONT:H-)、BC 7838、BC 7839(O128:H2)、BC 7840(O157:H-)、BC 7841(O23:H-)、BC 7842(O157:H-)、BC 7843、BC 7844(O157:H-)、BC 7845(O103:H2)、BC 7846(O26:H11)、BC 7847(O145:H-)、BC 7848(O157:H-)、BC 7849(O156:H47)、BC 7850、BC 7851(O157:H-)、BC 7852(O157:H-)、BC 7853(O5:H-)、BC 7854(O157:H7)、BC 7855(O157:H7)、BC 7856(O26:H-)、BC7857、BC 7858、BC 7859(ONT:H-)、BC 7860(O129:H-)、BC 7861、BC 7862(O103:H2)、BC7863、BC 7864(O原始形式:H-)、BC 7865、BC 7866(O26:H-)、BC 7867(O原始形式:H-)、BC7868、BC 7869(ONT:H-)、BC 7870(O113:H-)、BC 7871(ONT:H-)、BC 7872(ONT:H-)、BC7873、BC 7874(O原始形式:H-)、BC 7875(O157:H-)、BC 7876(O111:H-)、BC 7877(O146:H21)、BC 7878(O145:H-)、BC 7879(O22:H8)、BC 7880(O原始形式:H-)、BC 7881(O145:H-)、BC 8275(O157:H7)、BC 8318(O55:K-:H-)、BC 8325(O157:H7)和BC 8332(ONT)、BC 8333。

在一些实施例中,本公开教示了肠侵袭性大肠杆菌(EIEC),例如菌株BC 8246(O152:K-:H-)、BC 8247(O124:K(72):H3)、BC 8248(O124)、BC 8249(O112)、BC 8250(O136:K(78):H-)、BC 8251(O124:H-)、BC 8252(O144:K-:H-)、BC 8253(O143:K:H-)、BC 8254(O143)、BC 8255(O112)、BC 8256(O28a.e)、BC 8257(O124:H-)、BC 8258(O143)、BC 8259(O167:K-:H5)、BC 8260(O128a.c.:H35)、BC 8261(O164)、BC 8262(O164:K-:H-)、BC 8263(O164)和BC 8264(O124)。

在一些实施例中,本公开教示了产肠毒素大肠杆菌(ETEC),例如菌株BC 5581(O78:H11)、BC 5583(O2:K1)、BC 8221(O118)、BC 8222(O148:H-)、BC 8223(O111)、BC 8224(O110:H-)、BC 8225(O148)、BC 8226(O118)、BC 8227(O25:H42)、BC 8229(O6)、BC 8231(O153:H45)、BC 8232(O9)、BC 8233(O148)、BC 8234(O128)、BC 8235(O118)、BC 8237(O111)、BC 8238(O110:H17)、BC 8240(O148)、BC 8241(O6H16)、BC 8243(O153)、BC 8244(O15:H-)、BC 8245(O20)、BC 8269(O125a.c:H-)、BC 8313(O6:H6)、BC 8315(O153:H-)、BC8329、BC 8334(O118:H12)和BC 8339。

在一些实施例中,本公开教示了肠病原体大肠杆菌(EPEC),例如菌株BC 7567(O86)、BC 7568(O128)、BC 7571(O114)、BC 7572(O119)、BC 7573(O125)、BC 7574(O124)、BC 7576(O127a)、BC 7577(O126)、BC 7578(O142)、BC 7579(O26)、BC 7580(OK26)、BC 7581(O142)、BC 7582(O55)、BC 7583(O158)、BC 7584(O-)、BC 7585(O-)、BC 7586(O-)、BC8330、BC 8550(O26)、BC 8551(O55)、BC 8552(O158)、BC 8553(O26)、BC 8554(O158)、BC8555(O86)、BC 8556(O128)、BC 8557(OK26)、BC 8558(O55)、BC 8560(O158)、BC 8561(O158)、BC 8562(O114)、BC 8563(O86)、BC 8564(O128)、BC 8565(O158)、BC 8566(O158)、BC 8567(O158)、BC 8568(O111)、BC 8569(O128)、BC 8570(O114)、BC 8571(O128)、BC 8572(O128)、BC 8573(O158)、BC 8574(O158)、BC 8575(O158)、BC 8576(O158)、BC 8577(O158)、BC 8578(O158)、BC 8581(O158)、BC 8583(O128)、BC 8584(O158)、BC 8585(O128)、BC 8586(O158)、BC 8588(O26)、BC 8589(O86)、BC 8590(O127)、BC 8591(O128)、BC 8592(O114)、BC8593(O114)、BC 8594(O114)、BC 8595(O125)、BC 8596(O158)、BC 8597(O26)、BC 8598(O26)、BC 8599(O158)、BC 8605(O158)、BC 8606(O158)、BC 8607(O158)、BC 8608(O128)、BC 8609(O55)、BC 8610(O114)、BC 8615(O158)、BC 8616(O128)、BC 8617(O26)、BC 8618(O86)、BC 8619、BC 8620、BC 8621、BC 8622、BC 8623、BC 8624(O158)和BC 8625(O158)。

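Cell Fermentation and Culture

Microorganisms of the present disclosure, including those genetically engineered as described herein, can be cultured in conventional nutrient media modified as appropriate for any desired biosynthetic reactions or selections. In some embodiments, the present disclosure teaches culture in inducing media for activating promoters. In some embodiments, the present disclosure teaches media with selection agents, including selection agents for transformants (e.g., antibiotics) or for selection of organisms suited to grow under inhibiting conditions (e.g., high ethanol conditions). In some embodiments, the present disclosure teaches growing cell cultures in media optimized for cell growth. In other embodiments, the present disclosure teaches growing cell cultures in media optimized for product yield, for example of a product or biomolecule of interest derived from the metabolic processing of glucose. In some embodiments, the present disclosure teaches growing cultures in media capable of inducing cell growth and also containing the precursors necessary for final product production (e.g., high levels of sugars for ethanol production).

The biomolecule or product of interest produced by the methods provided herein can be any commercial product produced from glucose. In some cases, the biomolecule or product of interest is a small molecule, an amino acid, an organic acid, or an alcohol. The amino acid can be, without limitation, tyrosine, phenylalanine, tryptophan, aspartic acid, asparagine, threonine, isoleucine, methionine, or lysine. In specific embodiments, the amino acid is lysine. In certain aspects, the lysine is L-lysine. The organic acid can be, without limitation, succinate, lactate or pyruvate. The alcohol can be, without limitation, ethanol or isobutanol.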

培养条件(如温度、pH值等)是适合与选择用于表达的宿主细胞联合使用的那些条件,并且是所属领域的技术人员所显而易见的。如所提及,许多参考文献可供用于培养和产生许多细胞,包括细菌、植物、动物(包括哺乳动物)和古细菌来源的细胞。参见例如萨布鲁克(Sambrook),奥斯贝(Ausubel)(所有均见上文)以及伯杰(Berger),《分子克隆技术指南(Guide to Molecular Cloning Techniques)》,《酶学方法(Methods in Enzymology)》,第152卷,学术出版社有限公司(Academic Press,Inc.),加利福尼亚州圣地亚哥(San Diego,CA);以及弗瑞旭尼(Freshney)(1994),《动物细胞的培养:基本技术手册(Culture ofAnimal Cells,a Manual of Basic Technique)》,第三版,纽约威立-利斯(Wiley-Liss,New York)和其中引用的参考文献;多伊尔(Doyle)和格里菲思(Griffiths)(1997),《哺乳动物细胞培养:基本技术(Mammalian Cell Culture:Essential Techniques)》,约翰·威利父子出版公司(John Wiley and Sons),纽约州(NY);忽玛逊(Humason)(1979),《动物组织技术(Animal Tissue Techniques)》,第四版,W.H.弗里曼公司(W.H.Freeman andCompany);以及里奇埃德尔(Ricciardelle)等人,(1989),体外细胞发育生物学(In VitroCell Dev.Biol.)25:1016-1024,所有文献均以引用的方式并入本文中。关于植物细胞培养和再生,参见派恩(Payne)等人(1992),《液体系统中的植物细胞和组织培养(Plant Celland Tissue Culture in Liquid Systems)》,约翰·威利父子公司(John Wiley&Sons,Inc.),纽约州纽约市(New York,N.Y.);冈堡(Gamborg)和菲利浦(Phillips)(编)(1995),《植物细胞、组织和器官培养:基本方法(Plant Cell,Tissue and Organ Culture;Fundamental Methods)》,施普林格实验室手册(Springer Lab Manual),施普林格出版社(Springer-Verlag)(柏林海德堡,纽约);琼斯(Jones)编(1984),《植物基因转移和表达方案(Plant Gene Transfer and Expression Protocols)》,胡马纳出版社(Humana Press),新泽西州特图瓦市(Totowa,N.J.)以及《植物分子生物学(Plant Molecular Biology)》(1993)R.R.D.克洛(R.R.D.Croy)编,生物科学出版社(Bios Scientific Publishers),英国牛津(Oxford,U.K.)ISBN 0 12 198370 6,所有文献均以引用的方式并入本文中。细胞培养基一般性地阐述于阿特拉斯(Atlas)和帕克斯(Parks)(编),《微生物培养基手册(TheHandbook of Microbiological Media)》(1993)CRC出版社,佛罗里达州波卡拉顿(BocaRaton,Fla.),所述文献以引用的方式并入本文中。用于细胞培养的额外信息见于可获得的商业文献中,如得自西格玛-奥德里奇公司(Sigma-Aldrich,Inc)(密苏里州圣路易(StLouis,Mo.))的《生命科学研究细胞培养目录(Life Science Research Cell CultureCatalogue)》(“西格马-LSRCCC”)以及例如也得自西格玛-奥德里奇公司(密苏里州圣路易)的《植物培养目录和增刊(The Plant Culture Catalogue and supplement)》(“西格马-PCCS”),所述文献都以引用的方式并入本文中。

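The culture medium or fermentation medium to be used must suitably satisfy the demands of the respective strains. Descriptions of culture media for various microorganisms are given in the "Manual of Methods for General Bacteriology" of the American Society for Bacteriology (Washington D.C., USA, 1981). The terms culture medium and fermentation medium are interchangeable.

In some embodiments, the present disclosure teaches that the microorganisms produced may be cultured continuously, as described, for example, in WO 05/021772, or discontinuously in a batch process (batch cultivation) or in a fed-batch or repeated fed-batch process, for the purpose of producing the desired organic chemical compound. A general summary of known cultivation methods is available in the textbook by Chmiel (Bioprozeßtechnik 1: Einführung in die Bioverfahrenstechnik (Gustav Fischer Verlag, Stuttgart, 1991)) or in the textbook by Storhas (Bioreaktoren und periphere Einrichtungen (Vieweg Verlag, Braunschweig/Wiesbaden, 1994)).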

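In some embodiments, the cells of the present disclosure are grown under batch or continuous fermentation conditions. Classical batch fermentation is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. A variation of the batch system is fed-batch fermentation, which also finds use in the present disclosure. In this variation, the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is likely to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Batch and fed-batch fermentations are common and well known in the art. Continuous fermentation is a system in which a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing and harvesting of the desired proteins. In some embodiments, continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth. In some embodiments, continuous fermentation generally maintains the cultures at stationary or late log/stationary phase growth. Continuous fermentation systems strive to maintain steady-state growth conditions.

Methods for modulating nutrients and growth factors for continuous fermentation processes, as well as techniques for maximizing the rate of product formation, are well known in the art of industrial microbiology.

For example, a non-limiting list of carbon sources for the cultures of the present disclosure includes sugars and carbohydrates such as glucose, sucrose, lactose, fructose, maltose, molasses, sucrose-containing solutions from sugar beet or sugar cane processing, starch, starch hydrolysate, and cellulose; oils and fats such as soybean oil, sunflower oil, groundnut oil and coconut fat; fatty acids such as palmitic acid, stearic acid, and linoleic acid; alcohols such as glycerol, methanol, and ethanol; and organic acids such as acetic acid or lactic acid.

A non-limiting list of nitrogen sources for the cultures of the present disclosure includes organic nitrogen-containing compounds such as peptones, yeast extract, meat extract, malt extract, corn steep liquor, soybean flour, and urea; or inorganic compounds such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate, and ammonium nitrate. The nitrogen sources can be used individually or as a mixture.

A non-limiting list of possible phosphorus sources for the cultures of the present disclosure includes phosphoric acid, potassium dihydrogen phosphate or dipotassium hydrogen phosphate, or the corresponding sodium-containing salts. The culture medium may additionally comprise salts required for growth, for example in the form of chlorides or sulfates of metals such as sodium, potassium, magnesium, calcium and iron, e.g., magnesium sulfate or iron sulfate. Finally, essential growth factors such as amino acids, for example homoserine, and vitamins, for example thiamine, biotin or pantothenic acid, may be employed in addition to the abovementioned substances.

In some embodiments, the pH of the culture can be controlled in a suitable manner by any acid, base or buffer salt, including, but not limited to, sodium hydroxide, potassium hydroxide, ammonia, or aqueous ammonia, or acidic compounds such as phosphoric acid or sulfuric acid. In some embodiments, the pH is generally adjusted to a value of from 6.0 to 8.5, preferably 6.5 to 8.

In some embodiments, the cultures of the present disclosure can include an anti-foaming agent such as, for example, fatty acid polyglycol esters. In some embodiments, the cultures of the present disclosure are modified by adding suitable selective substances such as, for example, antibiotics, in order to stabilize the plasmids in the cultures.

In some embodiments, the culture is carried out under aerobic conditions. In order to maintain these conditions, oxygen or oxygen-containing gas mixtures such as, for example, air are introduced into the culture. It is likewise possible to use liquids enriched with hydrogen peroxide. The fermentation is carried out, where appropriate, at elevated pressure, for example at an elevated pressure of from 0.03 to 0.2 MPa. The temperature of the culture is normally from 20°C to 45°C, preferably from 25°C to 40°C, and particularly preferably from 30°C to 37°C. In batch or fed-batch processes, the cultivation is preferably continued until an amount of the desired product of interest (e.g., an organic chemical compound) sufficient for recovery has formed. This aim can normally be achieved within 10 hours to 160 hours. In continuous processes, longer cultivation times are possible. The activity of the microorganisms results in a concentration (accumulation) of the product of interest in the fermentation medium and/or in the cells of said microorganisms.

In some embodiments, the culture is carried out under anaerobic conditions.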

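Screening

In some embodiments, the present disclosure teaches high-throughput initial screenings. In other embodiments, the present disclosure also teaches robust tank-based validations of performance data.

In some embodiments, the high-throughput screening process is designed to predict the performance of strains in bioreactors. As previously described, culture conditions are selected to be suitable for the organism and reflective of bioreactor conditions. Individual colonies are picked, transferred into 96-well plates and incubated for a suitable amount of time. Cells are subsequently transferred to new 96-well plates for additional seed cultures, or to production cultures. Cultures are incubated for varying lengths of time, during which multiple measurements may be made. These may include measurements of product, biomass or other characteristics that predict the performance of strains in bioreactors. High-throughput culture results are used to predict bioreactor performance.

In some embodiments, tank-based performance validation is used to confirm the performance of strains isolated by high-throughput screening. Fermentation processes/conditions are designed to replicate commercial reactor conditions. Candidate strains are screened using bench-scale fermentation reactors for relevant strain performance characteristics such as productivity or yield.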

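Product Recovery and Quantification

Methods for screening for the production of products of interest are known to those of skill in the art and are discussed throughout the present specification. Such methods may be employed when screening the strains of the disclosure. The biomolecule or product of interest produced by the methods provided herein can be any commercial product produced from glucose. In some cases, the biomolecule or product of interest is an amino acid, an organic acid, or an alcohol. The amino acid can be, without limitation, tyrosine, phenylalanine, tryptophan, aspartic acid, asparagine, threonine, isoleucine, methionine, or lysine. In specific embodiments, the amino acid is lysine. In certain aspects, the lysine is L-lysine. The organic acid can be, without limitation, succinate, lactate or pyruvate. The alcohol can be, without limitation, ethanol or isobutanol.

In some embodiments, the present disclosure teaches methods of improving strains designed to produce non-secreted intracellular products. For example, the present disclosure teaches methods of improving the robustness, yield, efficiency, or overall desirability of cell cultures producing intracellular enzymes, oils, pharmaceuticals, or other valuable small molecules or peptides. The recovery or isolation of non-secreted intracellular products can be achieved by lysis and recovery techniques that are well known in the art, including those described herein.

For example, in some embodiments, cells of the present disclosure can be harvested by centrifugation, filtration, settling, or other methods. Harvested cells are then disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, use of cell lysing agents, or other methods well known to those skilled in the art.

The resulting product of interest, e.g., a polypeptide, may be recovered/isolated and optionally purified by any of a number of methods known in the art. For example, a product polypeptide may be isolated from the nutrient medium by conventional procedures including, but not limited to: centrifugation, filtration, extraction, spray-drying, evaporation, chromatography (e.g., ion exchange, affinity, hydrophobic interaction, chromatofocusing, and size exclusion), or precipitation. Finally, high performance liquid chromatography (HPLC) can be employed in the final purification steps. (See, for example, purification of intracellular protein as described in Parry et al., 2001, Biochem. J. 353:117, and Hong et al., 2007, Appl. Microbiol. Biotechnol. 73:1331, both incorporated herein by reference.)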

除上文提及的参考文献之外,多种纯化方法在所属领域中是众所周知的,包括例如以下文献中所述的纯化方法:桑德纳(Sandana)(1997),《蛋白质的生物分离(Bioseparation of Proteins)》,学术出版社有限公司;博拉格(Bollag)等人(1996),《蛋白质方法(Protein Methods)》第2版,纽约州威立-利斯;沃克(Walker)(1996),《蛋白质方案手册(The Protein Protocols Handbook)》,胡马纳出版社,新泽西州;哈里斯(Harris)和安格尔(Angal)(1990),《蛋白质纯化应用:实用方法》(Protein PurificationApplications:A Practical Approach),牛津IRL出版社,英国牛津;哈里斯和安格尔,蛋白质纯化方法:实用方法,牛津IRL出版社,英国牛津;斯科普斯(Scopes)(1993),《蛋白质纯化:原理和实践(Protein Purification:Principles and Practice)》第3版,斯普林格出版社,纽约州;詹森(Janson)和赖登(Ryden)(1998),《蛋白质纯化:原理、高分辨率方法和应用(Protein Purification:Principles,High Resolution Methods andApplications)》,第二版,威立-VCH,纽约州;以及沃克(Walker)(1998),《CD-ROM上的蛋白质方案(Protein Protocols on CD-ROM)》,胡马纳出版社,新泽西州,所有文献以引用的方式并入本文中。

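In some embodiments, the present disclosure teaches methods of improving strains designed to produce secreted products. For example, the present disclosure teaches methods of improving the robustness, yield, efficiency, or overall desirability of cell cultures producing valuable small molecules or peptides.

In some embodiments, immunological methods may be used to detect and/or purify secreted or non-secreted products produced by the cells of the present disclosure. In one example approach, antibody raised against a product molecule (e.g., against an insulin polypeptide or an immunogenic fragment thereof) using conventional methods is immobilized on beads, mixed with cell culture media under conditions in which the endoglucanase is bound, and precipitated. In some embodiments, the present disclosure teaches the use of enzyme-linked immunosorbent assays (ELISA).

In other related embodiments, immunochromatography is used, as disclosed in U.S. Pat. No. 5,591,645, U.S. Pat. No. 4,855,240, U.S. Pat. No. 4,435,504, U.S. Pat. No. 4,980,298, and Se-Hwan Paek et al., "Development of rapid One-Step Immunochromatographic assay," Methods, 22, 53-60, 2000, each of which is incorporated by reference herein. General immunochromatography detects a specimen by using two antibodies. A first antibody exists in a test solution, or at a portion at one end of an approximately rectangular test piece made from a porous membrane, where the test solution is dropped. This antibody is labeled with latex particles or gold colloidal particles (this antibody is hereinafter called the labeled antibody). When the dropped test solution includes a specimen to be detected, the labeled antibody recognizes the specimen and binds to it. The complex of the specimen and the labeled antibody flows by capillarity toward an absorber, which is made from filter paper and attached to the end opposite the end that contained the labeled antibody. During the flow, the complex of the specimen and the labeled antibody is recognized and caught by a second antibody (hereinafter called the tapping antibody) located at the middle of the porous membrane; as a result, the complex appears at a detection part on the porous membrane as a visible signal and is detected.

In some embodiments, the screening methods of the present disclosure are based on photometric detection techniques (absorption, fluorescence). For example, in some embodiments, detection may be based on the presence of a fluorophore detector such as GFP bound to an antibody. In other embodiments, the photometric detection may be based on the accumulation of the desired product from the cell culture. In some embodiments, the product may be detectable via UV of the culture or of extracts from said culture.

In some embodiments, the product recovery methods allow for the quantitative determination of the effect on performance of each candidate gapA/transhydrogenase/gdh, asd, dapB and/or ddh gene. In some embodiments, the product recovery methods allow for the quantitative determination of the effect on performance of each candidate gapA/transhydrogenase/gdh, asd, dapB and/or ddh gene combination, allowing for comparison of each combination and selection of the optimal combination.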

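A non-limiting list of the products produced and recovered via the methods and organisms of the present disclosure is provided in Table 2.

Table 2: Products of the present disclosure (referred to as products of interest, compounds produced, and the like)

[Table 2 appears as an image in the source publication and is not reproduced here.]

Persons having skill in the art will recognize that the methods of the present disclosure are compatible with host cells producing any desirable biomolecule product of interest.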

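Selection Criteria and Goals

The selection of a particular strain of host cell expressing a heterologous gapA/nicotinamide nucleotide transhydrogenase/threonine aldolase/pyruvate carboxylase/gdh, asd, dapB and/or ddh can be based on specific goals. For example, in some embodiments, the program goal may be to maximize single batch yields of reactions with no immediate time limits. In other embodiments, the program goal may be to rebalance biosynthetic yields to produce a specific product, or to produce a particular ratio of products. In some embodiments, the program goal may be to improve performance characteristics such as yield, titer, productivity, by-product elimination, tolerance to process excursions, optimal growth temperature and growth rate. In some embodiments, the program goal is improved host performance as measured by volumetric productivity, specific productivity, yield or titer of a product of interest produced by a microbe.

In other embodiments, the program goal may be to optimize the synthesis efficiency of a commercial strain in terms of final product yield per quantity of inputs (e.g., total amount of ethanol produced per pound of sucrose). In other embodiments, the program goal may be to optimize synthesis speed, as measured for example in terms of batch completion rates, or yield rates in continuous culturing systems. In one embodiment, the program goal is to optimize the final product yield and/or production rate of a biomolecule or product of interest. The biomolecule or product of interest produced by the methods provided herein can be any commercial product produced from glucose. In some cases, the biomolecule or product of interest is a small molecule, an amino acid, an organic acid, or an alcohol. The amino acid can be, without limitation, tyrosine, phenylalanine, tryptophan, aspartic acid, asparagine, threonine, isoleucine, methionine, or lysine. In specific embodiments, the amino acid is lysine. In certain aspects, the lysine is L-lysine. In certain aspects, the threonine is L-threonine. The organic acid can be, without limitation, succinate, lactate or pyruvate. The alcohol can be, without limitation, ethanol or isobutanol.

Persons having ordinary skill in the art will recognize how to tailor strain selection criteria to meet a particular project goal. For example, selection on a strain's maximum single batch yield at reaction saturation may be appropriate for identifying strains with high single batch yields. Selection based on consistency of yield across a range of temperatures and conditions may be appropriate for identifying strains with increased robustness and reliability.

In some embodiments, the selection criteria for the initial phase and the tank-based validation will be identical. In other embodiments, tank-based selection may operate under additional and/or different selection criteria.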

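Sequencing

In some embodiments, the present disclosure teaches whole-genome sequencing of the organisms described herein. In other embodiments, the present disclosure also teaches sequencing of plasmids, PCR products, and other oligonucleotides as quality controls for the methods of the present disclosure. Sequencing methods for large and small projects are well known to those in the art.

In some embodiments, any high-throughput technique for sequencing nucleic acids can be used in the methods of the disclosure. In some embodiments, the present disclosure teaches whole-genome sequencing. In other embodiments, the present disclosure teaches amplicon sequencing and ultra-deep sequencing to identify genetic variations. In some embodiments, the present disclosure also teaches novel methods for library preparation, including tagmentation (see WO/2016/073690). DNA sequencing techniques include classic dideoxy sequencing reactions (Sanger method) using labeled terminators or primers and gel separation in slab or capillary; sequencing by synthesis using reversibly terminated labeled nucleotides; pyrosequencing; 454 sequencing; allele-specific hybridization to a library of labeled oligonucleotide probes; sequencing by synthesis using allele-specific hybridization to a library of labeled clones followed by ligation; real-time monitoring of the incorporation of labeled nucleotides during a polymerization step; polony sequencing; and SOLiD sequencing.

In one aspect of the disclosure, high-throughput methods of sequencing are employed that comprise a step of spatially isolating individual molecules on a solid surface, where they are sequenced in parallel. Such solid surfaces may include nonporous surfaces (such as in Solexa sequencing, e.g. Bentley et al., Nature, 456:53-59 (2008), or Complete Genomics sequencing, e.g. Drmanac et al., Science, 327:78-81 (2010)); arrays of wells, which may include bead- or particle-bound templates (such as with 454, e.g. Margulies et al., Nature, 437:376-380 (2005), or Ion Torrent sequencing, U.S. patent publication 2010/0137143 or 2010/0304982); micromachined membranes (such as with SMRT sequencing, e.g. Eid et al., Science, 323:133-138 (2009)); or bead arrays (as with SOLiD sequencing or polony sequencing, e.g. Kim et al., Science, 316:1481-1414 (2007)).

In another embodiment, the methods of the present disclosure comprise amplifying the isolated molecules either before or after they are spatially isolated on a solid surface. Prior amplification may comprise emulsion-based amplification, such as emulsion PCR, or rolling circle amplification. Also taught is Solexa-based sequencing, in which individual template molecules are spatially isolated on a solid surface, after which they are amplified in parallel by bridge PCR to form separate clonal populations, or clusters, and then sequenced, as described in Bentley et al. (cited above) and in the manufacturer's instructions (e.g. TruSeq™ Sample Preparation Kit and Data Sheet, Illumina, Inc., San Diego, Calif., 2010), and further in the following references: U.S. Pat. Nos. 6,090,592; 6,300,070; 7,115,400; and EP0972081B1, which are incorporated herein by reference.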

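In one embodiment, individual molecules disposed and amplified on a solid surface form clusters at a density of at least 10^5 clusters per cm², or at a density of at least 5×10^5 per cm², or at a density of at least 10^6 clusters per cm². In one embodiment, sequencing chemistries having relatively high error rates are employed. In such embodiments, the average quality scores produced by such chemistries are monotonically declining functions of sequence read length. In one embodiment, such decline corresponds to 0.5 percent of sequence reads having at least one error in positions 1-75; 1 percent of sequence reads having at least one error in positions 76-100; and 2 percent of sequence reads having at least one error in positions 101-125.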
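To give a feel for what these window error rates imply, the short Python sketch below estimates the fraction of reads expected to be error-free over a given read length. It assumes, purely for illustration, that errors in the three position windows occur independently; the disclosure does not state this, and the function name `error_free_fraction` is hypothetical.

```python
# Fraction of reads with at least one error in each position window,
# as stated in the text above: (first position, last position) -> fraction.
WINDOW_ERROR_FRACTION = {(1, 75): 0.005, (76, 100): 0.01, (101, 125): 0.02}


def error_free_fraction(read_length, windows=WINDOW_ERROR_FRACTION):
    """Estimate the fraction of reads with no error up to read_length.

    Assumes errors in different windows are independent (an assumption of
    this sketch). Only windows that lie entirely within the read are counted.
    """
    fraction = 1.0
    for (start, end), p_error in sorted(windows.items()):
        if end <= read_length:
            fraction *= 1.0 - p_error
    return fraction


full_read = error_free_fraction(125)   # 0.995 * 0.99 * 0.98
short_read = error_free_fraction(75)   # first window only
```

Under the independence assumption, roughly 96.5 percent of 125-base reads would carry no error at all, compared with 99.5 percent of reads truncated to 75 bases.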

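Sequence Variants

In some embodiments, the modified GAPDH comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with the amino acid sequence of SEQ ID NO: 58. In some embodiments, the modified GAPDH comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with an amino acid sequence selected from the group consisting of SEQ ID NO: 294, 296, 233, 234, 235, 236, 298 and 300. In some embodiments, the variant of the gdh enzyme comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with the amino acid sequence of SEQ ID NO: 42 or 44. In some embodiments, the variant enzyme of asd comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with the amino acid sequence of SEQ ID NO: 30 or 40. In some embodiments, the variant enzyme of dapB comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with the amino acid sequence of SEQ ID NO: 46 or 48. In some embodiments, the variant enzyme of ddh comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with the amino acid sequence of SEQ ID NO: 4.

In some embodiments, the variant enzyme of gdh comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with an amino acid sequence selected from the group consisting of SEQ ID NO: 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180 and 182. In some embodiments, the variant enzyme of asd comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with an amino acid sequence selected from the group consisting of SEQ ID NO: 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128 and 130. In some embodiments, the variant enzyme of threonine aldolase comprises an amino acid sequence sharing at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with an amino acid sequence selected from the group consisting of SEQ ID NO: 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230 and 232.

In some embodiments, the multi-copy replicating plasmid comprises a sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with the thrABC operon sequence of SEQ ID NO: 77. In some embodiments, the recombinant protein fragment of gapA comprises a sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with an amino acid sequence selected from the group consisting of SEQ ID NO: 233, 234, 235, 236 and 298.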
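The identity thresholds above can be made concrete with a small Python sketch that computes percent identity between two sequences that have already been aligned (gaps written as "-"). The convention used here, identical columns divided by alignment columns, is one common choice and is an assumption of this sketch; the disclosure does not fix a particular alignment program or denominator. The variant fragment is invented for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences have a gap are ignored; a column counts
    as identical only when both sequences carry the same residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a, aligned_b, threshold_percent):
    """True if the pair shares at least threshold_percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


reference = "MTIRVGING"   # first nine residues of SEQ ID NO: 58
variant = "MTIKVGLNG"     # hypothetical variant with two substitutions
identity = percent_identity(reference, variant)   # 7 of 9 columns identical
```

With two substitutions in nine aligned positions, the hypothetical fragment would satisfy an "at least 70%" limitation but not an "at least 80%" limitation.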

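Examples

The following examples are given for the purpose of illustrating various embodiments of the disclosure and are not meant to limit the present disclosure in any fashion. Changes therein and other uses which are encompassed within the spirit of the disclosure, as defined by the scope of the claims, will be recognized by those skilled in the art.

These examples demonstrate methods for increasing the production of a product of interest in a host cell where that production is limited by the availability of NADPH. The methods taught herein can be used to increase the production of any product of interest whose metabolic pathway depends on NADPH availability. For example, the present disclosure provides methods for increasing the production of amino acids such as L-lysine or L-threonine, two products of interest whose production is limited by the availability of NADPH in the cell.

It is well known that NADPH is a limiting factor for L-lysine and L-threonine production in bacteria. These examples therefore illustrate six strategies for overcoming the limitation on NADPH availability in the host cell, thereby increasing L-lysine or L-threonine production. These strategies are: (1) engineering the glycolytic pathway to generate NADPH by broadening the coenzyme specificity of the endogenous glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (gapA) so that the enzyme has dual specificity for NADP and NAD; (2) expressing in the host cell a transhydrogenase that generates NADPH from NADH; (3) reprogramming the synthesis of aspartate semialdehyde (ASA), a precursor of lysine, threonine, isoleucine and methionine, by expressing homologs of the endogenous gdh and/or asd enzymes that use NADH as a cofactor more efficiently than NADPH; (4) reprogramming the DAP pathway for lysine synthesis by expressing homologs of the endogenous dapB and/or ddh enzymes that use NADH as a cofactor more efficiently than NADPH; (5) reprogramming threonine synthesis by expressing homologs of the endogenous threonine aldolase (ltaE) that reduce or reverse the degradation of threonine to glycine; and (6) expressing a heterologous pyruvate carboxylase (pyc) or a homolog thereof to increase the synthesis of oxaloacetate, or increasing the expression of the endogenous pyc. In one embodiment, the target organism is E. coli. In one embodiment, the target organism is Corynebacterium.

A brief table of contents is provided below solely for the purpose of assisting the reader. Nothing in this table of contents is meant to limit the scope of the examples or disclosure of the application.

Table 3 - Table of Contents for the Example Section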

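Example 1: Broadening the Coenzyme Specificity of Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH) - Lysine

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is an enzyme of the central carbon metabolic pathway. The most common form of GAPDH is the NAD-dependent enzyme gapA, found in all organisms studied so far. This enzyme is encoded by the gapA gene and converts glyceraldehyde-3-phosphate to glycerate-1,3-bisphosphate. The amino acid sequence of the gapA enzyme from C. glutamicum is as follows: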

MTIRVGINGFGRIGRNFFRAILERSDDLEVVAVNDLTDNKTLSTLLKFDSIMGRLGQEVEYDDDSITVGGKRIAVYAERDPKNLDWAAHNVDIVIESTGFFTDANAAKAHIEAGAKKVIISAPASNEDATFVYGVNHESYDPENHNVISGASCTTNCLAPMAKVLNDKFGIENGLMTTVHAYTGDQRLHDAPHRDLRRARAAAVNIVPTSTGAAKAVALVLPELKGKLDGYALRVPVITGSATDLTFNTKSEVTVESINAAIKEAAVGEFGETLAYSEEPLVSTDIVHDSHGSIFDAGLTKVSGNTVKVVSWYDNEWGYTCQLLRLTELVASKL(SEQ ID NO:58)。

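As shown in FIG. 1, the gapA enzyme uses NAD as a coenzyme to convert glyceraldehyde-3-phosphate to glycerate-1,3-bisphosphate. NAD is converted to NADH during this process. As further shown in FIG. 1, the glycolytic pathway feeds into the biosynthetic pathway leading to L-lysine production in bacteria. However, as discussed above, a key factor in the biotechnological production of L-lysine in C. glutamicum is a sufficient supply of NADPH. Thus, increasing NADPH production in C. glutamicum should increase the production of L-lysine. One way to achieve this would be to alter the coenzyme specificity of C. glutamicum gapA such that the modified enzyme uses NADP as a cofactor, so that larger amounts of NADPH are generated in the cell. The goal of this experiment was therefore to improve lysine productivity in C. glutamicum by broadening the coenzyme specificity of gapA to include NADP.

Previous studies have shown that the D35G, L36T, T37K and P192S mutations in C. glutamicum gapA alter the coenzyme specificity of the enzyme from NAD to NADP (Bommareddy R.R. et al. (2014), Metab. Eng., 25:30-37). Several strains of C. glutamicum were generated, each expressing a gapA enzyme carrying one or more of the above mutations, as shown in Table 4 below.

The strains were tested for their ability to produce L-lysine in comparison with a reference strain carrying native gapA. The T37K mutation, alone or in combination with the L36T mutation, was found to broaden the coenzyme specificity of C. glutamicum gapA such that the modified enzyme accepts both NAD and NADP, and expression of the modified enzyme in C. glutamicum significantly improved lysine productivity (FIG. 2). Construction of the C. glutamicum gapA mutant strains (T37K and L36T/T37K) is described below.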
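The mutation names used above follow the usual wild-type residue, position, replacement convention, numbered from the initial methionine of SEQ ID NO: 58. As a sanity check on that numbering, the Python sketch below applies such point mutations in silico to the first 200 residues of SEQ ID NO: 58 and refuses a mutation whose stated wild-type residue does not match. The helper name `apply_mutations` is hypothetical; this illustrates the notation only and is not the site-directed mutagenesis procedure itself.

```python
import re

# First 200 residues of the C. glutamicum gapA sequence (SEQ ID NO: 58),
# copied from the sequence given in this example.
GAPA_N_TERMINAL_200 = (
    "MTIRVGINGFGRIGRNFFRAILERSDDLEVVAVNDLTDNKTLSTLLKFDS"
    "IMGRLGQEVEYDDDSITVGGKRIAVYAERDPKNLDWAAHNVDIVIESTGF"
    "FTDANAAKAHIEAGAKKVIISAPASNEDATFVYGVNHESYDPENHNVISG"
    "ASCTTNCLAPMAKVLNDKFGIENGLMTTVHAYTGDQRLHDAPHRDLRRAR"
)


def apply_mutations(protein, mutations):
    """Apply point mutations such as 'T37K' (1-based positions) to a protein."""
    residues = list(protein)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {mutation}")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# The L36T/T37K double mutant described in the text.
double_mutant = apply_mutations(GAPA_N_TERMINAL_200, ["L36T", "T37K"])
```

All four published positions (D35, L36, T37 and P192) match the residues found in SEQ ID NO: 58 under this numbering, which is why the wild-type check passes for each of them.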

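The gapA gene was amplified by PCR using commercially sourced oligonucleotides, with chromosomal DNA of C. glutamicum (ATCC 13032) as the template. The PCR fragment was assembled into a Corynebacterium cloning vector and mutagenized using standard site-directed mutagenesis techniques. The vector was initially transformed into E. coli using standard heat shock transformation techniques in order to identify correctly assembled clones and to amplify vector DNA for Corynebacterium transformation.

Validated clones were transformed into C. glutamicum host cells via electroporation. For each transformation, the number of colony forming units (CFUs) per µg of DNA was determined as a function of insert size. Corynebacterium genome integration was also analyzed as a function of homology arm length, and the results showed that shorter arms had a lower efficiency.

Cultures of Corynebacterium identified as having successful integrations of the insert cassette were cultured on media containing kanamycin to counter-select for loop-outs of the kanamycin resistance selection gene.

In order to further validate loop-out events, colonies exhibiting kanamycin resistance were cultured and analyzed via sequencing.

Several mutant strains were generated by the above method in two different lysine-producing background strains, Parent_2 and Parent_1. Table 4 describes the specific mutations introduced into each parent strain.

Table 4: gapA mutants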

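Each newly created strain and its parent strain was tested for lysine yield in small-scale cultures (e.g., 96-well plates) designed to assess product titer performance. Small-scale cultures were conducted using media from industrial-scale cultures. Product titer was measured optically at carbon exhaustion (i.e., representative of single batch yield) with a standard colorimetric assay. Briefly, a concentrated assay mixture was prepared and added to the fermentation samples such that the final concentrations of reagents were 160 mM sodium phosphate buffer, 0.2 mM Amplex Red, 0.2 U/mL horseradish peroxidase and 0.005 U/mL lysine oxidase. Reactions were allowed to proceed to an end point, and optical density was measured using a Tecan M1000 plate spectrophotometer at a wavelength of 560 nm. The results of the experiment are summarized in FIG. 2.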
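Because the assay mixture is prepared as a concentrate and diluted by the sample, the concentrate must be stronger than the stated final concentrations by the dilution factor. The Python sketch below works this out; the 50 µL + 50 µL mixing volumes are an assumption chosen for illustration, since the disclosure gives only the final concentrations.

```python
# Final reagent concentrations in the lysine assay, as stated in the text.
FINAL_CONCENTRATIONS = {
    "sodium phosphate buffer (mM)": 160.0,
    "Amplex Red (mM)": 0.2,
    "horseradish peroxidase (U/mL)": 0.2,
    "lysine oxidase (U/mL)": 0.005,
}


def concentrate_for(final, mix_volume_ul, sample_volume_ul):
    """Concentrations needed in the assay concentrate so that, after adding
    mix_volume_ul of it to sample_volume_ul of sample, each reagent reaches
    its final concentration (C_stock = C_final * V_total / V_mix)."""
    if mix_volume_ul <= 0 or sample_volume_ul < 0:
        raise ValueError("volumes must be positive")
    factor = (mix_volume_ul + sample_volume_ul) / mix_volume_ul
    return {reagent: value * factor for reagent, value in final.items()}


# Hypothetical 1:1 mixing of concentrate and fermentation sample.
two_fold_mix = concentrate_for(FINAL_CONCENTRATIONS, 50.0, 50.0)
```

For 1:1 mixing the concentrate is simply twofold, for example 320 mM sodium phosphate buffer; other mixing ratios scale the same way.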

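Introduction of GAPDH with certain mutations conferring altered coenzyme specificity toward NADP significantly improved lysine productivity (FIG. 2). Strains 7000182994 and 7000184348 each contain T37K, and each performed better than its respective parent, Parent_1 and Parent_2. Strains 7000182999 and 7000184352 each contain T37K and L36T, and each performed better than its respective parent, Parent_1 and Parent_2. Strains 7000182997 and 7000184349 each contain P192S. Strains 7000182998 and 7000184347 each contain L36T.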

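Example 2: Construction of Threonine-Producing Base Strains from E. coli K-12 Strain W3110

As with lysine (and methionine, isoleucine and glycine), the initial steps of the pathway leading to threonine synthesis include the conversion of oxaloacetate to aspartate, which uses glutamate regenerated from 2-ketoglutarate by glutamate dehydrogenase (gdh). Aspartate is then converted to aspartyl phosphate, which is subsequently reduced to aspartate semialdehyde (ASA) by the enzyme aspartate semialdehyde dehydrogenase (asd). These steps are common to lysine, threonine, isoleucine and methionine biosynthesis. Beyond the conversion of aspartyl phosphate to ASA by asd, threonine formation requires three additional steps: (1) conversion of ASA to homoserine by the bifunctional aspartokinase/homoserine dehydrogenase (thrA); (2) conversion of homoserine to L-homoserine phosphate by homoserine kinase (thrB); and finally (3) conversion of L-homoserine phosphate to threonine by threonine synthase (thrC). These last three steps operate independently of NADP/NADH.

Threonine-producing base strains were first generated from the wild-type E. coli K-12 strain W3110. These threonine base strains were generated in two steps. First, the native E. coli thrLABC regulon (SEQ ID NO: 76) was overexpressed. It consists of thrL (a leader sequence rich in threonine and isoleucine codons, immediately followed by a functional transcription terminator that serves to prevent transcription of the enzyme-encoding genes in the operon); thrA (bifunctional aspartokinase/homoserine dehydrogenase 1); thrB (homoserine kinase); and thrC (threonine synthase). This polynucleotide was amplified from W3110 genomic DNA by PCR using commercially sourced oligonucleotides. The thrLABC operon was inserted into a multi-copy replicating plasmid (a modified pUC19 vector; SEQ ID NO: 78) under the control of the synthetic promoter pMB085 (FIG. 8A; SEQ ID NO: 75). To relieve attenuation of expression, a variant of this plasmid was constructed in which the thrL leader sequence was removed (FIG. 8B; SEQ ID NO: 77). Second, the region of the E. coli W3110 chromosome encoding L-threonine 3-dehydrogenase (tdh), an enzyme that works against threonine production by catalyzing the oxidation of L-threonine to 2-amino-3-ketobutyrate, was deleted.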

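To assess threonine production in the resulting W3110 threonine base strains W3110 pMB085thrLABCΔtdh (THR01; 7000336113) and W3110 pMB085thrABCΔtdh (THR02; 7000341282), each strain and its parent (W3110; 7000284155) was tested for threonine yield in small-scale cultures (e.g., 96-well plates) designed to assess product titer performance. Small-scale (300 µl) cultures were grown in TPM1 medium. TPM1 medium contains, per liter: glucose, 50 g; yeast extract, 2 g; MgSO4·7H2O, 2 g; KH2PO4, 4 g; (NH4)2SO4, 14 g; betaine, 1 g; L-methionine, 0.149 g; L-lysine, 0.164 g; trace metal solution, 5 ml; and CaCO3, 30 g. The trace metal solution contains, per liter: FeSO4·7H2O, 10 g; CaCl2, 1.35 g; ZnSO4·7H2O, 2.25 g; MnSO4·4H2O, 0.5 g; CuSO4·5H2O, 1 g; (NH4)6Mo7O24·4H2O, 0.106 g; Na2B4O7·10H2O, 0.23 g; 35% HCl, 10 ml. The final pH was adjusted to 7.2 by addition of 4N KOH. Chloramphenicol (35 µg/ml), kanamycin (40 µg/ml) and ampicillin (50 µg/ml) were added to the medium when required. Cultures were grown at 37°C for approximately 36 hours in a humidified (80% humidity) INFORS HT Multitron Pro incubator shaker with constant agitation at 1000 rpm.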
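Since the recipe is stated per liter while the cultures are 300 µl, it can help to see the per-well quantities. The following Python sketch scales the TPM1 recipe given above to an arbitrary volume; the function name `tpm1_for_volume` is hypothetical, and in practice the medium would be prepared in bulk and dispensed rather than weighed out per well.

```python
# Solid components of TPM1 medium in grams per liter, as listed in the text.
TPM1_SOLIDS_G_PER_L = {
    "glucose": 50.0,
    "yeast extract": 2.0,
    "MgSO4·7H2O": 2.0,
    "KH2PO4": 4.0,
    "(NH4)2SO4": 14.0,
    "betaine": 1.0,
    "L-methionine": 0.149,
    "L-lysine": 0.164,
    "CaCO3": 30.0,
}
TRACE_METAL_SOLUTION_ML_PER_L = 5.0


def tpm1_for_volume(volume_ml):
    """Return (mg of each solid, µL of trace metal solution) contained in
    volume_ml of TPM1 medium. Note that g/L equals mg/mL and mL/L equals µL/mL."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    solids_mg = {name: g_per_l * volume_ml
                 for name, g_per_l in TPM1_SOLIDS_G_PER_L.items()}
    trace_ul = TRACE_METAL_SOLUTION_ML_PER_L * volume_ml
    return solids_mg, trace_ul


# One 300 µl small-scale culture.
solids_mg, trace_ul = tpm1_for_volume(0.3)
```

A single 300 µl culture therefore contains about 15 mg of glucose, which sets the upper bound on how much threonine a well can produce from the supplied carbon.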

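Threonine titers were determined in samples of cell-free medium using the AccQ·Tag (Waters Corp.) pre-column derivatization and analysis technique for peptide and protein hydrolysate amino acids. Waters AccQ·Fluor reagent was used to derivatize the amino acids present in the samples. The derivatives were then separated by reversed-phase HPLC and quantified by fluorescence detection. Biomass estimates for each sample were determined by measuring optical density (OD) at a wavelength of 660 nm using a Tecan M1000 plate spectrophotometer, and final glucose concentrations were determined by a standard colorimetric assay. Briefly, a concentrated assay mixture was prepared with the following final reagent concentrations: 175 mM sodium phosphate buffer, pH 7.0; 0.2 mM Amplex Red (Chemodex CDX-A0022); 16 U/mL glucose oxidase from Aspergillus niger (Sigma G7141); and 0.2 U/mL horseradish peroxidase (VWR 0417-25000). Reactions were allowed to proceed for 30 minutes at room temperature in the dark, and optical density was measured using a Tecan M1000 plate spectrophotometer at a wavelength of 560 nm. The above culture conditions and measurements were used to calculate titer and to estimate the yield and productivity of the strains described in the following examples.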

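Example 3: Broadening the Coenzyme Specificity of Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH) - Threonine

The base strains described in Example 2 were used for the following example experiments.

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is an enzyme of the central carbon metabolic pathway. The most common form of GAPDH is the NAD-dependent enzyme gapA, found in all organisms studied so far. This enzyme is encoded by the gapA gene and converts glyceraldehyde-3-phosphate to glycerate-1,3-bisphosphate.

As shown in FIG. 9, the gapA enzyme uses NAD as a coenzyme to convert glyceraldehyde-3-phosphate to glycerate-1,3-bisphosphate. NAD is converted to NADH during this process. As further shown in FIG. 9 and FIGS. 10A-C, the glycolytic pathway feeds into the biosynthetic pathway leading to L-threonine production in bacteria. However, as discussed above, a key factor in the biotechnological production of L-threonine in E. coli is a sufficient supply of NADPH. Thus, increasing NADPH production in E. coli should increase the production of L-threonine. One way to achieve this would be to alter the coenzyme specificity of gapA such that the modified enzyme uses NADP as a cofactor, so that larger amounts of NADPH are generated in the cell. The goal of this experiment was therefore to improve threonine productivity in E. coli by broadening the coenzyme specificity of gapA to include NADP.

Previous studies have shown that the D35G, L36T, T37K and P192S mutations in C. glutamicum gapA alter the coenzyme specificity of the enzyme from NAD to NADP (Bommareddy et al. (2014), Metab. Eng., 25:30-37). The amino acid sequence of the gapA enzyme from C. glutamicum is as follows: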

MTIRVGINGFGRIGRNFFRAILERSDDLEVVAVNDLTDNKTLSTLLKFDSIMGRLGQEVEYDDDSITVGGKRIAVYAERDPKNLDWAAHNVDIVIESTGFFTDANAAKAHIEAGAKKVIISAPASNEDATFVYGVNHESYDPENHNVISGASCTTNCLAPMAKVLNDKFGIENGLMTTVHAYTGDQRLHDAPHRDLRRARAAAVNIVPTSTGAAKAVALVLPELKGKLDGYALRVPVITGSATDLTFNTKSEVTVESINAAIKEAAVGEFGETLAYSEEPLVSTDIVHDSHGSIFDAGLTKVSGNTVKVVSWYDNEWGYTCQLLRLTELVASKL(SEQ ID NO:58)。

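Here, several strains of E. coli were generated, each expressing a variant of the heterologous (C. glutamicum) gapA enzyme carrying one of the above mutations: gapAv5 (SEQ ID NO: 69), gapAv7 (SEQ ID NO: 71) or gapAv8 (SEQ ID NO: 73), as shown in Table 5 below.

Table 5: Mutant and native gapA variants tested in this study

The strains were tested for their ability to produce L-threonine in comparison with a reference strain (W3110thrABCΔtdh) carrying native E. coli gapA (SEQ ID NO: 67). Expression of each of the three variants gapAv5 (SEQ ID NO: 69), gapAv7 (SEQ ID NO: 71) and gapAv8 (SEQ ID NO: 73) was independently found to significantly increase threonine titer (FIG. 11A). Construction of the E. coli gapA mutant strains is described below.

The gapA variants (gapAv5 (SEQ ID NO: 69), gapAv7 (SEQ ID NO: 71) or gapAv8 (SEQ ID NO: 73)) were amplified by PCR from the Corynebacterium cloning vector using commercially sourced oligonucleotides. Native E. coli gapA was amplified from W3110 genomic DNA. The PCR fragments were assembled into an E. coli cloning vector, a modified pUC19 vector (encoding the polynucleotide sequences provided as SEQ ID NO: 70, 72 and 74), and initially transformed into NEB 10-beta E. coli cells using standard heat shock transformation techniques in order to identify correctly assembled clones and to amplify vector DNA for transformation into the E. coli W3110 threonine base strains THR01 and THR02.

Each newly created strain and its parent strain was tested for threonine production in small-scale cultures (e.g., 96-well plates) as described above.

Introduction of GAPDH with certain mutations conferring altered coenzyme specificity toward NADP significantly improved threonine titer (FIG. 11A). Strains 7000342726 (gapAv5), 7000342720 (gapAv7) and 7000342727 (gapAv8) all performed better than their parent strain (7000341282) and better than the parent expressing a second copy of E. coli gapA (7000342723).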

Table 6: Threonine productivity of gapA variants

Strain ID     Variant    Titer    STDEV
7000342726    gapAv5     19.04    8.33
7000342720    gapAv7     15.47    9.45
7000342727    gapAv8     8.73     4.18
7000342723    Ec_gapA    0.79     1.37
7000341282    thrABC     0.79     1.37
7000284155    W3110      0        0

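Example 4: Reprogramming the DAP pathway for lysine synthesis using variant enzymes with cofactor specificity for NADH

The biosynthetic pathway that leads to L-lysine production in bacteria is known as the diaminopimelate (DAP) pathway (FIG. 1). The initial steps leading into the DAP pathway include the conversion of oxaloacetate to aspartate, which uses glutamate regenerated from 2-ketoglutarate by glutamate dehydrogenase (gdh). Aspartate is then converted to aspartyl phosphate, which is reduced to aspartate semialdehyde (ASA) by the enzyme aspartate semialdehyde dehydrogenase (asd). These steps are common to the biosynthesis of lysine, threonine, isoleucine, and methionine. The first committed step toward lysine biosynthesis is the conversion of ASA to dihydrodipicolinate (DHDP), catalyzed by dihydrodipicolinate synthase. DHDP is then reduced to tetrahydrodipicolinate (THDPA) by dihydrodipicolinate reductase (dapB). Several bacteria, including C. glutamicum, have the enzyme meso-diaminopimelate dehydrogenase (ddh), which catalyzes the direct conversion of THDPA to meso-diaminopimelate (mDAP), which is then converted to L-lysine by diaminopimelate decarboxylase.

As shown in FIG. 1, each of the native C. glutamicum enzymes gdh, asd, dapB, and ddh requires NADPH as a coenzyme for its respective reaction. However, NADPH is one of the limiting factors in the industrial-scale production of L-lysine from glucose in C. glutamicum (Becker et al. (2005), Appl. Environ. Microbiol., 71(12):8587-8596). Increasing NADPH availability in C. glutamicum should therefore increase L-lysine production. One way to achieve this is to reduce NADPH utilization by using naturally occurring homologs of the C. glutamicum enzymes gdh, asd, dapB, and ddh that use NADH as a cofactor more efficiently than NADPH. The goal of this experiment was therefore to broaden the coenzyme dependence of gdh, asd, dapB, and ddh to include NADH as well as NADPH.

The C. glutamicum enzymes gdh and dapB have known homologs in Clostridium symbiosum (Lilley K.S. et al. (1991), Biochim Biophys Acta, 1080(3):191-197) and E. coli (Reddy S.G. et al. (1995), Biochemistry, 34(11):3492-3501), respectively, that use NADH as a cofactor more efficiently than NADPH. No such homologs were known for the C. glutamicum enzymes asd and ddh. A genome-wide homology search was therefore performed in bacteria to find amino acid sequence variants of the C. glutamicum enzymes asd and ddh. For each of asd and ddh, the homology search yielded 9 variants. The sources of the variants and their sequences are summarized in Table 7. The DNA sequences of gdh, asd, dapB, and ddh were codon-optimized for C. glutamicum.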

表7:途径同源物的来源和序列

Figure BDA0002279931240000801

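The known homologs of C. glutamicum gdh and dapB and the 9 variants each of C. glutamicum asd and ddh were codon-optimized for expression in C. glutamicum. As shown in FIG. 4, one copy of each of the two versions of gdh and dapB and one copy of each of the ten versions of asd and ddh were cloned in various combinations into a plasmid containing a kanamycin resistance marker gene. The enzyme combinations tested in this example are summarized in Table 8.

Each asd-gdh-dapB-ddh combination was cloned into a plasmid (SEQ ID NO:51). FIG. 4 shows the cassette arrangement of one exemplary asd-gdh-dapB-ddh test combination. The regulatory sequences are SEQ ID NO:52-57. The final cassette of each test combination can be represented from the 5' end to the 3' end as follows:

Figure BDA0002279931240000811

Note that the reverse-complement orientation of the dapB and ddh alleles is a consequence of the expression cassette arrangement and is not intended to silence those alleles.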

表8:途径同源物的组合

Figure BDA0002279931240000812

Figure BDA0002279931240000821

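Each plasmid was first transformed into E. coli by standard heat-shock transformation in order to identify correctly assembled clones and to amplify vector DNA for Corynebacterium transformation.

Validated clones were transformed into C. glutamicum host cells by electroporation. For each transformation, the number of colony-forming units (CFU) per microgram of DNA was determined as a function of insert size. Corynebacterium genome integration was also analyzed as a function of homology arm length, and the results showed that shorter arms had lower efficiency.

Corynebacterium cultures identified as having successfully integrated the insert cassette were cultured on medium containing kanamycin for counter-selection, in order to loop out the kanamycin resistance selection gene.

To further validate the loop-out events, colonies exhibiting kanamycin resistance were cultured and analyzed by sequencing.

All four enzymes were expressed simultaneously in C. glutamicum.

Recombinant strains containing a heterologous version of each enzyme were made from 3 different parent strains, all of which are genetically distinct lysine production strains. Each newly generated strain and its parent strain were tested for lysine production in small-scale cultures (e.g., 96-well plates) designed to assess product titer performance. The small-scale cultures used media from industrial-scale cultures. Product titer was measured optically at carbon exhaustion (i.e., representing single-batch yield) using a standard colorimetric assay. Briefly, a concentrated assay mixture was prepared and added to the fermentation samples so that the final reagent concentrations were 160 mM sodium phosphate buffer, 0.2 mM Amplex Red, 0.2 U/mL horseradish peroxidase, and 0.005 U/mL lysine oxidase. The reaction was allowed to proceed to endpoint, and optical density was measured at 560 nm with a Tecan M1000 plate spectrophotometer. The experimental results are presented in Table 9 and summarized in FIG. 5A and FIG. 5B.

Two recombinant C. glutamicum strains, 7000186960 and 7000186992, showed significantly improved L-lysine productivity relative to their respective parents, Parent_3 and Parent_4 (FIG. 5A). Each strain contained the native C. glutamicum ddh enzyme together with the same 3 heterologous enzymes for gdh, asd, and dapB (the known NADH-using versions of Clostridium symbiosum gdh and E. coli dapB, and a variant of asd from Lactobacillus agilis). Data on the effects of the different enzyme combinations are presented in Table 9, with the combinations that improved significantly over the parent highlighted in bold.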

表9:同源物组合的数据

Figure BDA0002279931240000831

Figure BDA0002279931240000841

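Two versions of a 4-gene cassette were introduced into C. glutamicum Parent_6, and lysine production was monitored. The 4 genes were selected because their products use the alternative cofactor NADH rather than NADPH. 4-gene cassette v1 (strain 263254) contains the genes for aspartate-semialdehyde dehydrogenase (asd) from C. glutamicum (SEQ ID NO:39), glutamate dehydrogenase (gdh) from Clostridium symbiosum (SEQ ID NO:43), 4-hydroxy-tetrahydrodipicolinate reductase (dapB) from E. coli (SEQ ID NO:47), and meso-diaminopimelate D-dehydrogenase (ddh) from C. glutamicum (SEQ ID NO:3). Accordingly, 4-gene cassette v1 (strain 263254) encodes aspartate-semialdehyde dehydrogenase (asd) from C. glutamicum (SEQ ID NO:40), glutamate dehydrogenase (gdh) from Clostridium symbiosum (SEQ ID NO:44), 4-hydroxy-tetrahydrodipicolinate reductase (dapB) from E. coli (SEQ ID NO:48), and meso-diaminopimelate D-dehydrogenase (ddh) from C. glutamicum (SEQ ID NO:4).

4-gene cassette v2 (strain 263264) contains the genes for aspartate-semialdehyde dehydrogenase (asd) from Lactobacillus agilis (SEQ ID NO:29), glutamate dehydrogenase (gdh) from Clostridium symbiosum (SEQ ID NO:43), 4-hydroxy-tetrahydrodipicolinate reductase (dapB) from E. coli (SEQ ID NO:47), and meso-diaminopimelate D-dehydrogenase (ddh) from C. glutamicum (SEQ ID NO:3). Accordingly, 4-gene cassette v2 (strain 263264) encodes aspartate-semialdehyde dehydrogenase (asd) from Lactobacillus agilis (SEQ ID NO:30), glutamate dehydrogenase (gdh) from Clostridium symbiosum (SEQ ID NO:44), 4-hydroxy-tetrahydrodipicolinate reductase (dapB) from E. coli (SEQ ID NO:48), and meso-diaminopimelate D-dehydrogenase (ddh) from C. glutamicum (SEQ ID NO:4).

The 4-gene cassettes significantly improved lysine production in plate model 9. The data are summarized in Table 10.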

Table 10: Improved lysine production

Cassette       Strain    Titer, mM (95% CI)    Improvement over parent, %
Parent_6       -         6.45 +/- 0.9          n/a
Cassette v1    263254    12.41 +/- 0.9         92.4
Cassette v2    263264    9.33 +/- 1.1          44.7
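The improvement column of Table 10 follows from the mean titers alone. The short Python sketch below recomputes it as (strain - parent) / parent x 100; the function name `percent_improvement` is an illustrative choice, and the confidence intervals are not propagated here.

```python
def percent_improvement(parent_titer, strain_titer):
    """Relative titer gain over the parent strain, in percent."""
    return (strain_titer - parent_titer) / parent_titer * 100.0


PARENT_6_TITER_MM = 6.45  # mean titer of Parent_6 from Table 10

# Mean titers (mM) of the two cassette strains from Table 10.
cassette_titers_mm = {"v1 (263254)": 12.41, "v2 (263264)": 9.33}

improvements = {
    name: round(percent_improvement(PARENT_6_TITER_MM, titer), 1)
    for name, titer in cassette_titers_mm.items()
}

for name, gain in improvements.items():
    # Table 10 reports 92.4 for v1 and 44.7 for v2.
    print(f"cassette {name}: {gain}% over Parent_6")
```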

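Example 5: Reprogramming the threonine biosynthetic pathway using variant enzymes with cofactor specificity for NADH

The base strains described in Example 2 were used in the following example experiments.

The biosynthetic pathway that leads to L-threonine production in bacteria is known as the thrABC pathway (FIG. 9). As with lysine (and methionine, isoleucine, and glycine), the initial steps leading into the threonine synthesis pathway include the conversion of oxaloacetate to aspartate, which uses glutamate regenerated from 2-ketoglutarate by glutamate dehydrogenase (gdh). Aspartate is then converted to aspartyl phosphate, which is reduced to aspartate semialdehyde (ASA) by the enzyme aspartate semialdehyde dehydrogenase (asd). These steps are common to the biosynthesis of lysine, threonine, isoleucine, and methionine. Beyond the conversion of aspartyl phosphate to ASA by asd, threonine formation requires three additional steps: conversion of ASA to homoserine by the bifunctional aspartokinase/homoserine dehydrogenase (thrA); conversion of homoserine to L-homoserine phosphate by homoserine kinase (thrB); and finally conversion of L-homoserine phosphate to threonine by threonine synthase (thrC). These last three steps operate independently of NADP/NADH, and in the threonine base strains any potential bottleneck in this part of the pathway is de-risked by overexpressing the thrABC operon.

As shown in FIG. 9, each of the native E. coli enzymes gdh and asd requires NADPH as a coenzyme for its respective reaction. However, NADPH is one of the limiting factors in the industrial-scale production of L-threonine from glucose in E. coli (Becker et al. (2005), Appl. Environ. Microbiol., 71(12):8587-8596). Increasing NADPH availability in E. coli should therefore increase L-threonine production. One way to achieve this is to reduce NADPH utilization by using naturally occurring homologs of the E. coli enzymes gdh and asd that use NADH as a cofactor more efficiently than NADPH. The goal of this experiment was therefore to broaden the coenzyme dependence of gdh and asd to include NADH as well as NADPH.

The E. coli enzyme gdh has a known homolog in Clostridium symbiosum (Lilley et al. (1991), Biochim Biophys Acta, 1080(3):191-197) that uses NADH as a cofactor more efficiently than NADPH. No such homolog was known for E. coli asd. To investigate whether additional gdh homologs with a stronger NADH preference and novel asd homologs with an NADH preference could be identified, a genome-wide homology search was run against an in-house metagenomic library developed from environmental samples. The search consisted of a BlastP analysis of the library using the protein sequences of Lactobacillus agilis asd (asd_lag; SEQ ID NO:30) and Clostridiales gdh (gdh_csy; SEQ ID NO:44). The homology search retrieved hundreds of sequences, which were reduced by further filtering and selection criteria to a library of twenty-four sequences per enzyme. For each enzyme, about twelve sequences were chosen from the filtered subset of hits with <70% identity to the query sequence, and about another twelve were chosen from the subset of sequences with >70% identity to the query sequence.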
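The two-bin selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the text does not disclose the "further filtering and selection criteria", so random sampling within each bin stands in for them, and the hit identifiers and identity values below are made-up placeholders rather than real search results.

```python
import random


def select_library(hits, threshold=70.0, per_bin=12, seed=0):
    """Pick up to `per_bin` hits below and above an identity threshold.

    `hits` is a list of (sequence_id, percent_identity) tuples.
    Random sampling is a placeholder for the undisclosed selection criteria.
    """
    rng = random.Random(seed)
    low_identity = [hit for hit in hits if hit[1] < threshold]
    high_identity = [hit for hit in hits if hit[1] > threshold]
    chosen_low = rng.sample(low_identity, min(per_bin, len(low_identity)))
    chosen_high = rng.sample(high_identity, min(per_bin, len(high_identity)))
    return chosen_low + chosen_high


# Hypothetical hits with identities from 40% to 89% (placeholders only).
example_hits = [(f"hit_{i:03d}", 40.0 + i) for i in range(50)]

library = select_library(example_hits)
print(len(library), "sequences selected")
```

Sampling both bins keeps close homologs, which are more likely to be functional, alongside more divergent ones, which are more likely to differ in cofactor preference.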

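Table 11. Summary of the parts and polynucleotide sequences used to construct the multi-copy threonine operon expression vector used to generate the threonine base strains

Part type        Part name       SEQ ID NO
Promoter         pMB085          75
Insert (gene)    thrLABC         76
Insert (gene)    thrABC          77
Backbone         pUC19 vector    78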

表12:途径同源物的来源和序列

Figure BDA0002279931240000861

Figure BDA0002279931240000871

Figure BDA0002279931240000881

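The open reading frames (ORFs) of the known homologs of E. coli gdh (Clostridiales gdh; SEQ ID NO:134) and asd (Lactobacillus agilis asd; SEQ ID NO:80), together with the 24 variants of each enzyme, were amplified by PCR using commercially sourced oligonucleotides and cloned into a p15A-based multi-copy plasmid sequence (SEQ ID NO:239) containing regulatory sequences, namely promoter pMB038 (SEQ ID NO:237) and a transcription terminator (SEQ ID NO:238), as shown in FIG. 12. One copy each of the 26 versions of asd and the 26 versions of gdh was cloned as a bicistronic cassette, in various combinations, into the p15A-based multi-copy plasmid backbone (SEQ ID NO:239).

Each plasmid was first transformed into E. coli by standard heat-shock transformation in order to identify correctly assembled clones and to amplify vector DNA for transformation of the threonine base strains (THR01-02).

Validated clones were transformed into E. coli base strain cells by electroporation. Each newly generated strain and its parent strain were tested for threonine production in small-scale culture as described above. The experimental results are presented in Table 13. Alleles asd_13 (SEQ ID NO:108) and asd_18 (SEQ ID NO:118) performed better, but did not differ significantly from the control. Alleles gdh_1 (SEQ ID NO:136), gdh_8 (SEQ ID NO:150), gdh_14 (SEQ ID NO:162), gdh_16 (SEQ ID NO:166), gdh_18 (SEQ ID NO:170), gdh_20 (SEQ ID NO:174), and gdh_22 (SEQ ID NO:178) each increased threonine relative to W3110 and the control strain (FIG. 13). Not all strains were successfully constructed and tested. Replicate samples that showed poor or no growth and statistical outliers are not shown in FIG. 13 but are represented in Table 13.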

表13.过表达asd和gdh变体的菌株的效价的概述

Figure BDA0002279931240000882

Figure BDA0002279931240000891

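Example 6: Improving threonine titer using variant threonine aldolases with different substrate preferences and enzyme kinetics

The base strains described in Example 2 were used in the following example experiments.

This example demonstrates a method of increasing L-threonine production in a bacterial host cell using heterologous threonine aldolase genes. In E. coli, threonine aldolase (ltaE) counteracts threonine accumulation by converting L-threonine to acetaldehyde and glycine. However, substrate specificities and enzyme kinetics vary within the broader taxonomic family of threonine aldolases (TAs). This example illustrates a strategy that exploits the different substrate preferences found among TAs to improve threonine yield, either by adding a heterologous TA with a different substrate preference or different enzyme kinetics or by replacing the native ltaE gene with it. It should be noted, however, that this example, like the embodiments above, is illustrative and should not be construed as limiting the scope of the disclosure in any way.

Aldolases catalyze the reversible aldol addition of a donor component (the nucleophile) to an acceptor component. E. coli threonine aldolase (ltaE) catalyzes the cleavage of L-allo-threonine and L-threonine to glycine and acetaldehyde (FIG. 10A). In E. coli, ltaE counteracts threonine accumulation by converting L-threonine to glycine. However, substrate specificities vary within the broader taxonomic family of threonine aldolase (TA) genes. TAs with substrate preferences (e.g., serine, alanine) and kinetics that favor L-threonine formation have been described (Fesko et al., 2015).

To investigate whether homologs of E. coli ltaE with a substrate preference or enzyme kinetics that favor threonine production could be identified, a genome-wide homology search was run against an in-house metagenomic library developed from environmental samples. The search consisted of a BlastP analysis of the library using the protein sequence of the threonine aldolase from Cronobacter (Enterobacter) sakazakii (Csa_ltaE; SEQ ID NO:183), an enzyme reported to prefer glycine (Fesko, 2015). The homology search retrieved hundreds of sequences, which were reduced by further filtering and selection criteria to a library of twenty-four sequences. About twelve sequences were chosen from the filtered subset of hits with <70% identity to the query sequence, and about another twelve were chosen from the subset of sequences with >70% identity to the query sequence.

The open reading frame (ORF) of the Cronobacter (Enterobacter) sakazakii threonine aldolase was codon-optimized for E. coli (SEQ ID NO:183) and synthesized as a gBlock gene fragment (IDT). The 24 ltaE variants were amplified by PCR using commercially sourced oligonucleotides and cloned into a p15A-based multi-copy plasmid sequence (SEQ ID NO:239) containing promoter pMB038 (SEQ ID NO:237) and the native E. coli thrL transcription terminator (SEQ ID NO:238), as shown in FIG. 12.

Each plasmid was first transformed into chemically competent NEB 10-β E. coli cells by standard heat-shock transformation in order to identify correctly assembled clones and to amplify vector DNA for transformation into the E. coli threonine base strains.

Validated clones were transformed into E. coli base strain cells by electroporation. Each newly generated strain and its parent strain were tested for threonine production in small-scale culture as described above. The experimental results are presented in Table 14. Alleles ltaE_6 (SEQ ID NO:196), ltaE_11 (SEQ ID NO:206), ltaE_18 (SEQ ID NO:220), ltaE_20 (SEQ ID NO:224), and ltaE_24 (SEQ ID NO:232) each increased threonine titer relative to the thrABC + empty p15A vector control (control plasmid) and the W3110 strain (FIG. 14).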

表14.过表达ltaE变体的菌株的效价的概述

Figure BDA0002279931240000901

Figure BDA0002279931240000911

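Example 7: Expressing combinations of modified or variant gapA, gdh, asd, and ltaE enzymes in E. coli to increase L-threonine production

The base strains described in Example 2 were used in the following example experiments.

One or more of the above strategies can be combined to further increase NADPH production in E. coli and thereby increase L-threonine yield.

Various combinations of gapA, gdh, asd, and ltaE were introduced into the E. coli thrABC Δtdh background described in Example 2. In some cases, these combinations were cloned, as described above using commercially sourced oligonucleotides, into the same modified pUC19 vector that contains pMB085-thrABC and were transformed on that vector, added as a polycistron downstream of the thrABC operon and driven by the pMB085 promoter. When multiple genes were added in tandem, the following ribosome binding site (RBS) linkers were included: RBS1 (agctggtggaatat (SEQ ID NO:306); after thrC), RBS2 (aggaggttgt (SEQ ID NO:307); between genes 1 and 2), and RBS3 (tgacacctattg (SEQ ID NO:308); between genes 2 and 3). These linker sequences were included in the oligonucleotide tails and introduced during PCR amplification of the genes. When combinations of gapA, gdh, asd, and ltaE were expressed as a polycistronic operon with thrABC, titers up to and exceeding 15 mg/L threonine were observed for certain combinations (FIGS. 11A-C and Table 15).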
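The linker layout described above (RBS1 after thrC, RBS2 between genes 1 and 2, RBS3 between genes 2 and 3) can be pictured with a small Python sketch. It only concatenates strings to show the order of parts; in the source, the linkers were introduced through oligonucleotide tails during PCR. The function name `build_polycistron` and the bracketed placeholder tokens are hypothetical and stand in for the real coding sequences.

```python
# RBS linkers as given in the text (SEQ ID NO:306-308).
RBS1 = "agctggtggaatat"  # placed after thrC
RBS2 = "aggaggttgt"      # placed between added genes 1 and 2
RBS3 = "tgacacctattg"    # placed between added genes 2 and 3


def build_polycistron(upstream_operon, added_genes):
    """Concatenate an upstream operon and 1-3 added genes with RBS linkers.

    Hypothetical helper: each added gene is preceded by the next linker
    in the fixed order RBS1, RBS2, RBS3.
    """
    linkers = [RBS1, RBS2, RBS3]
    if not 1 <= len(added_genes) <= len(linkers):
        raise ValueError("expected between one and three added genes")
    parts = [upstream_operon]
    for linker, gene in zip(linkers, added_genes):
        parts.extend([linker, gene])
    return "".join(parts)


# Placeholder tokens stand in for the real sequences.
operon = build_polycistron("[thrABC]", ["[gapA]", "[gdh]", "[asd]"])
print(operon)
```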

表15.在pUC19质粒上共表达thrABC以及gapA、asd、gdh和ltaE的组合的菌株的效价的概述

Figure BDA0002279931240000912

Figure BDA0002279931240000921

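In addition to the above combinations of genes expressed as polycistrons on the pUC19 plasmid, three of the above strains (7000342721, 7000342726, and 7000342720; Csy_gdh (SEQ ID NO:44), gapAv5 (SEQ ID NO:69), and gapAv7 (SEQ ID NO:71), respectively) were transformed with p15A plasmids (SEQ ID NO:239) expressing individual library variants of asd, gdh, and ltaE (described and tested above) or with an empty p15A vector control (e.g., Csy_gdh+p15A(-)). These strains and their performance (threonine titer) are summarized in Table 16. Except for W3110, all of these strains are in the pMB085-thrABC, tdh-deletion background. For these experiments, the most relevant controls are the parent strains (Csy_gdh, gapAv5, and gapAv7) transformed with the empty p15A control plasmid (7000349886, 7000349887, and 7000349885; Csy_gdh+p15A(-), gapAv5+p15A(-), and gapAv7+p15A(-), respectively). Certain combinations of asd, gdh, or ltaE variants with Csy_gdh, gapAv5, or gapAv7 improved threonine titer. For many strains expressing asd, gdh, or ltaE library variants, at least one biological replicate outperformed the relevant control strain (FIG. 15). The individual biological replicates with improved threonine titer illustrate the improvement that these combinations can produce. The high variability (large standard deviations caused by the many replicates that failed to produce threonine) may result from plasmid instability or a high mutation rate when a strain maintains two plasmids, and could be mitigated by chromosomal integration of these genes. Maintenance of the additional p15A plasmid and growth in chloramphenicol also led to lower titers in strains maintaining two plasmids than in the parent (e.g., the p15A(-) plasmid strain relative to the parent).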

表16.表达Csy_gdh、gapAv5或gapAv7与asd、gdh或ltaE文库变体的组合的菌株的效价的概述

Figure BDA0002279931240000922

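Example 8: Expressing a transhydrogenase that produces NADPH from NADH

A key factor in the biotechnological production of L-lysine in C. glutamicum is a sufficient supply of NADPH. As shown in FIG. 1, a membrane-integral nicotinamide nucleotide transhydrogenase can drive the reduction of NADP+ by oxidizing NADH, thereby producing NADPH from NADH. Expressing a transhydrogenase is therefore an effective strategy for increasing cellular NADPH production, and thus L-lysine production, in C. glutamicum.

Example 9: Expressing pyruvate carboxylase

Pyruvate carboxylase is an important anaplerotic enzyme that replenishes the oxaloacetate consumed for biosynthesis during growth or for lysine and glutamate production in industrial fermentation.

Pyruvate carboxylase genes have been cloned and sequenced from the following organisms: Rhizobium etli (Dunn, M.F. et al., J. Bacteriol. 178:5960-5970 (1996)), Bacillus stearothermophilus (Kondo, H. et al., Gene 191:47-50 (1997)), Bacillus subtilis (Genbank accession no. Z97025), Mycobacterium tuberculosis (Genbank accession no. Z83018), and Methanobacterium thermoautotrophicum (Mukhopadhyay, B., J. Biol. Chem. 273:5155-5166 (1998)). Pyruvate carboxylase activity has previously been measured in Brevibacterium lactofermentum (Tosaka, O. et al., Agric. Biol. Chem. 43:1513-1519 (1979)) and C. glutamicum (Peters-Wendisch, P.G. et al., Microbiology 143:1095-1103 (1997)).

Studies have shown that the yield and productivity of the aspartate family of amino acids depend critically on the carbon flux through the anaplerotic pathways (Vallino, J.J. and Stephanopoulos, G., Biotechnol. Bioeng. 41:633-646 (1993)). Based on metabolite balancing, it can be shown that the rate of lysine production is less than or equal to the rate of oxaloacetate synthesis via the anaplerotic pathways.

The pyruvate carboxylase gene of C. glutamicum can be replaced with a mutant or variant such that pyruvate carboxylase expression is preferably 2- to 20-fold higher than its expression in the C. glutamicum base strain.

E. coli is believed to lack an endogenous pyruvate carboxylase gene, so a heterologous pyruvate carboxylase can be provided. A heterologous pyruvate carboxylase gene from C. glutamicum or another microorganism can be introduced into any E. coli strain, for example the base strains described in Example 2. In some applications, the expression level of the endogenous or heterologous pyruvate carboxylase needs to be precisely regulated or fine-tuned, because either insufficient or excessive pyruvate carboxylase activity, resulting from the expression level or activity level of a mutant or variant pyc gene, may lead to less desirable outcomes. In such cases, a promoter ladder can be used to regulate or fine-tune expression. By testing promoter elements of varying strength in combination with multiple pyc variants or mutants, the combination of promoter and pyc gene that gives optimal gene activity can be determined, thereby increasing production of a desired compound such as L-threonine.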

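Example 10: Expressing combinations of modified gapA, transhydrogenase, and modified gdh, asd, dapB, and ddh enzymes in C. glutamicum or E. coli to increase L-lysine or L-threonine production

One or more of the above strategies can be combined to further increase NADPH production in C. glutamicum or E. coli and thereby increase the yield of L-lysine or L-threonine.

Example 11: Identifying novel glyceraldehyde 3-phosphate dehydrogenase (GAPDH) alleles

An NNK library of the gapA gene was generated using gapAv9 (D35G, L36T, T37K, P192S) (SEQ ID NO:303) as the starting sequence. Each mutagenized gene was introduced individually, as a second copy of gapA, at a neutral integration locus (between cg1504 and cg1505) in C. glutamicum carrying the endogenous gapA allele, under the control of the native gapA promoter. More than 1200 gapA integrants were screened in two different plate assays to identify alleles that improve lysine titer. Certain integrants (black circles) showed increased lysine production compared with the parent strain (black diamonds) (FIG. 14).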
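The text does not define "NNK"; under the standard IUPAC degenerate-base convention (an assumption stated here, not a detail from the source), N is any of A/C/G/T and K is G or T. The Python sketch below enumerates the NNK codons and translates them with the standard genetic code to show why this scheme is commonly used for saturation mutagenesis: it covers every amino acid while admitting only a single stop codon.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC: N = A/C/G/T at positions 1 and 2, K = G/T at position 3.
nnk_codons = ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]

encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids_covered = sorted(encoded - {"*"})
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

print(len(nnk_codons), "codons;", len(amino_acids_covered), "amino acids;",
      "stop codons:", stop_codons)
```

Restricting the third position to G or T halves the codon count relative to NNN and removes two of the three stop codons, which reduces the number of clones that must be screened per mutagenized position.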

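Several truncated gapA sequences led to increased lysine production. The native gapA sequence is underlined. The remaining amino acids are artifacts of frameshift mutations.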

表17:gapA截短增加赖氨酸表达

Figure BDA0002279931240000951

表18.提高赖氨酸产生的gapA基因中的新突变的清单

Figure BDA0002279931240000952

具有SEQ ID NO识别符的本公开序列

Figure BDA0002279931240000953

Figure BDA0002279931240000961

Figure BDA0002279931240000971

Figure BDA0002279931240000981

Figure BDA0002279931240000991

Figure BDA0002279931240001001

本公开的编号实施例

尽管随附条款,但本公开阐述以下编号实施例。

提高使用NADPH产生的化合物的产生

1.一种提高宿主细胞产生使用NADPH产生的化合物的能力的方法,所述方法包含改变细胞的可利用的NADPH。

2.条款1的方法,其中所述可利用的NADPH通过在所述细胞中表达经修饰的甘油醛-3-磷酸脱氢酶(GAPDH)来改变,其中所述经修饰的GAPDH经过修饰,使得其辅酶特异性加宽。

3.条款2的方法,其中所述经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性。

4.条款3的方法,其中所述天然存在的GAPDH是gapA。

5.条款4的方法,其中所述gapA具有SEQ ID NO:58的氨基酸序列。

6.条款2至5中任一条款的方法,其中所述经修饰的GAPDH包含与SEQ ID NO:58的氨基酸序列共享至少70%序列同一性的氨基酸序列。

7.条款2至5中任一条款的方法,其中所述经修饰的GAPDH包含与选自由SEQ IDNO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

8.条款2至7中任一条款的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸37相对应的位置处包含氨基酸置换。

9.条款2至8中任一条款的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸36和37相对应的位置处包含氨基酸置换。

10.条款8或9的方法,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸37相对应的位置处的残基是赖氨酸。

11.条款9的方法,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸36相对应的位置处的残基是苏氨酸,并且所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸37相对应的位置处的残基是赖氨酸。

12.条款2至11中任一条款的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸192相对应的位置处包含氨基酸置换。

13.条款12的方法,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸192相对应的位置处的残基是丝氨酸。

14.条款2至13中任一条款的方法,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸224相对应的位置处的残基是丝氨酸。

15.条款2至14中任一条款的方法,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸110相对应的位置处的残基是天冬氨酸。

16.条款2至15中任一条款的方法,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸140相对应的位置处的残基是甘氨酸。

17.条款2至5中任一条款的方法,其中所述经修饰的GAPDH包含与选自由SEQ IDNO:69、71、73、303、294、296、233、234、235、236、298和300组成的群组的氨基酸序列相同的氨基酸序列。

18.条款1至17中任一条款的方法,其中所述化合物选自表2。

19.条款18的方法,其中所述化合物是赖氨酸。

20.条款18的方法,其中所述化合物是苏氨酸。

21.条款1至20中任一条款的方法,其中所述宿主细胞是原核细胞。

22.条款21的方法,其中所述宿主细胞来自选自由以下组成的群组的属:土壤杆菌属(Agrobacterium)、脂环杆菌属(Alicyclobacillus)、念珠藻属(Anabaena)、倒囊藻属(Anacystis)、不动杆菌属(Acinetobacter)、酸热菌属(Acidothermus)、节杆菌属(Arthrobacter)、固氮菌属(Azobacter)、芽孢杆菌属(Bacillus)、双歧杆菌属(Bifidobacterium)、短杆菌属(Brevibacterium)、丁酸弧菌属(Butyrivibrio)、布赫纳氏菌属(Buchnera)、平原菟丝子(Campestris)、弯曲杆菌属(Camplyobacter)、梭菌属(Clostridium)、棒状杆菌属、红色硫黃细菌属(Chromatium)、粪球菌属(Coprococcus)、埃希氏杆菌属(Escherichia)、肠球菌属(Enterococcus)、肠杆菌属(Enterobacter)、欧文菌属(Erwinia)、梭杆菌属(Fusobacterium)、粪栖杆菌属(Faecalibacterium)、弗朗西斯氏菌属(Francisella)、黄杆菌属(Flavobacterium)、土芽孢杆菌属(Geobacillus)、嗜血杆菌属(Haemophilus)、螺旋杆菌属(Helicobacter)、克雷伯氏菌属(Klebsiella)、乳杆菌属(Lactobacillus)、乳球菌属(Lactococcus)、泥杆菌属(Ilyobacter)、微球菌属(Micrococcus)、微杆菌属(Microbacterium)、中间根瘤菌属(Mesorhizobium)、甲基杆菌属(Methylobacterium)、甲基杆菌属、分枝杆菌属(Mycobacterium)、奈瑟菌属(Neisseria)、泛菌属(Pantoea)、假单胞菌属(Pseudomonas)、原绿球藻属(Prochlorococcus)、红细菌属(Rhodobacter)、红假单胞菌属(Rhodopseudomonas)、红假单胞菌属、罗斯氏菌属(Roseburia)、红螺菌属(Rhodospirillum)、红球菌属(Rhodococcus)、栅列藻属(Scenedesmus)、链霉菌属(Streptomyces)、链球菌属(Streptococcus)、聚球藻属(Synecoccus)、糖单孢菌属(Saccharomonospora)、葡萄球菌属(Staphylococcus)、沙雷氏菌属(Serratia)、沙门氏菌属(Salmonella)、志贺杆菌属(Shigella)、嗜热厌氧杆菌属(Thermoanaerobacterium)、养障体(Tropheryma)、土拉热(Tularensis)、蒂梅丘拉(Temecula)、嗜热聚球藻属(Thermosynechococcus)、热球菌属(Thermococcus)、脲原体属(Ureaplasma)、黄单胞菌属(Xanthomonas)、木杆菌属(Xylella)、耶尔森氏菌属(Yersinia)和发酵单胞菌属(Zymomonas)。

23.条款22的方法,其中所述宿主细胞是谷氨酸棒状杆菌。

24.条款22的方法,其中所述宿主细胞是大肠杆菌。

25.条款1至20中任一条款的方法,其中所述宿主细胞是真核细胞。

26.条款25的方法,其中所述宿主细胞来自选自由以下组成的群组的属:棉霉属(Achlya)、枝顶孢属(Acremonium)、曲霉属(Aspergillus)、短梗霉属(Aureobasidium)、烟管霉属(Bjerkandera)、拟蜡菌属(Ceriporiopsis)、头孢霉属(Cephalosporium)、金孢霉属(Chrysosporium)、旋孢腔菌属(Cochliobolus)、棒囊壳属(Corynascus)、隐丛赤壳属(Cryphonectria)、隐球菌属(Cryptococcus)、鬼伞属(Coprinus)、革盖菌属(Coriolus)、色二孢属(Diplodia)、内斯菌属(Endothis)、镰孢菌属(Fusarium)、赤霉属(Gibberella)、胶霉属(Gliocladium)、腐殖菌属(Humicola)、肉座菌属(Hypocrea)、毁丝霉属(Myceliophthora)、白霉菌属(Mucor)、脉孢菌属(Neurospora)、青霉属(Penicillium)、柄孢壳属(Podospora)、射脉菌属(Phlebia)、瘤胃壶菌属(Piromyces)、梨胞霉属(Pyricularia)、根毛霉属(Rhizomucor)、根霉菌属(Rhizopus)、裂殖菌属(Schizophyllum)、革节孢属(Scytalidium)、孢子丝菌属(Sporotrichum)、踝节菌属(Talaromyces)、嗜热子囊菌属(Thermoascus)、梭孢壳霉属(Thielavia)、栓菌属(Tramates)、弯颈霉菌属(Tolypocladium)、木霉属(Trichoderma)、轮枝孢属(Verticillium)和小包脚菇属(Volvariella)。

包含经修饰的GAPDH的宿主细胞

27.一种宿主细胞,其包含相对于天然存在的GAPDH具有加宽的辅酶特异性的经修饰的GAPDH,其中所述宿主细胞相对于对应的缺乏所述经修饰的GAPDH的宿主细胞具有提高的使用NADPH产生的化合物的产生。

28.条款27的宿主细胞,其中所述可利用的NADPH通过在所述细胞中表达经修饰的甘油醛-3-磷酸脱氢酶(GAPDH)来改变,其中所述经修饰的GAPDH经过修饰,使得其辅酶特异性加宽。

29.条款27或28的宿主细胞,其中所述经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性。

30.条款29的宿主细胞,其中所述天然存在的GAPDH是gapA。

31.条款30的宿主细胞,其中所述gapA具有SEQ ID NO:58的氨基酸序列。

32.条款27至31中任一条款的宿主细胞,其中所述经修饰的GAPDH包含与SEQ IDNO:58的氨基酸序列共享至少70%序列同一性的氨基酸序列。

33.条款27至31中任一条款的宿主细胞,其中所述经修饰的GAPDH包含与选自由SEQ ID NO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

34.条款27至33中任一条款的宿主细胞,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸37相对应的位置处包含氨基酸置换。

35.条款27至34中任一条款的宿主细胞,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸36和37相对应的位置处包含氨基酸置换。

36.条款34或35的宿主细胞,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸37相对应的位置处的残基是赖氨酸。

37.条款35的宿主细胞,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸36相对应的位置处的残基是苏氨酸,并且所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸37相对应的位置处的残基是赖氨酸。

38.条款27至37中任一条款的宿主细胞,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸192相对应的位置处包含氨基酸置换。

39.条款38的宿主细胞,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸192相对应的位置处的残基是丝氨酸。

40.条款27至39中任一条款的宿主细胞,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸224相对应的位置处的残基是丝氨酸。

41.条款27至40中任一条款的宿主细胞,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸110相对应的位置处的残基是天冬氨酸。

42.条款27至41中任一条款的宿主细胞,其中所述经修饰的GAPDH的与SEQ ID NO:58的氨基酸140相对应的位置处的残基是甘氨酸。

43.条款27至31中任一条款的宿主细胞,其中所述经修饰的GAPDH包含与选自由SEQ ID NO:69、71、73、303、294、296、233、234、235、236、298和300组成的群组的氨基酸序列相同的氨基酸序列。

44.条款27至43中任一条款的宿主细胞,其中所述化合物选自表2。

45.条款44的宿主细胞,其中所述化合物是赖氨酸。

46.条款44的宿主细胞,其中所述化合物是苏氨酸。

47.条款27至46中任一条款的宿主细胞,其中所述宿主细胞是原核细胞。

48.条款47的宿主细胞,其中所述宿主细胞来自选自由以下组成的群组的属:土壤杆菌属、脂环杆菌属、念珠藻属、倒囊藻属、不动杆菌属、酸热菌属、节杆菌属、固氮菌属、芽孢杆菌属、双歧杆菌属、短杆菌属、丁酸弧菌属、布赫纳氏菌属、平原菟丝子、弯曲杆菌属、梭菌属、棒状杆菌属、红色硫黃细菌属、粪球菌属、埃希氏杆菌属、肠球菌属、肠杆菌属、欧文菌属、梭杆菌属、粪栖杆菌属、弗朗西斯氏菌属、黄杆菌属、土芽孢杆菌属、嗜血杆菌属、螺旋杆菌属、克雷伯氏菌属、乳杆菌属、乳球菌属、泥杆菌属、微球菌属、微杆菌属、中间根瘤菌属、甲基杆菌属、甲基杆菌属、分枝杆菌属、奈瑟菌属、泛菌属、假单胞菌属、原绿球藻属、红细菌属、红假单胞菌属、红假单胞菌属、罗斯氏菌属、红螺菌属、红球菌属、栅列藻属、链霉菌属、链球菌属、聚球藻属、糖单孢菌属、葡萄球菌属、沙雷氏菌属、沙门氏菌属、志贺杆菌属、嗜热厌氧杆菌属、养障体、土拉热、蒂梅丘拉、嗜热聚球藻属、热球菌属、脲原体属、黄单胞菌属、木杆菌属、耶尔森氏菌属和发酵单胞菌属。

49.条款48的宿主细胞,其中所述宿主细胞是谷氨酸棒状杆菌。

50.条款48的宿主细胞,其中所述宿主细胞是大肠杆菌。

51.条款27至46中任一条款的宿主细胞,其中所述宿主细胞是真核细胞。

52.条款51的宿主细胞,其中所述宿主细胞来自选自由以下组成的群组的属:棉霉属、枝顶孢属、曲霉属、短梗霉属、烟管霉属、拟蜡菌属、头孢霉属、金孢霉属、旋孢腔菌属、棒囊壳属、隐丛赤壳属、隐球菌属、鬼伞属、革盖菌属、色二孢属、内斯菌属、镰孢菌属、赤霉属、胶霉属、腐殖菌属、肉座菌属、毁丝霉属、白霉菌属、脉孢菌属、青霉属、柄孢壳属、射脉菌属、瘤胃壶菌属、梨胞霉属、根毛霉属、根霉菌属、裂殖菌属、革节孢属、孢子丝菌属、踝节菌属、嗜热子囊菌属、梭孢壳霉属、栓菌属、弯颈霉菌属、木霉属、轮枝孢属和小包脚菇属。

在棒状杆菌属中产生L-赖氨酸的方法

53.一种产生L-赖氨酸的方法,其包含培养棒状杆菌属菌株并从所培养的棒状杆菌属菌株或培养液回收L-赖氨酸,其中所述棒状杆菌属菌株表达使用NADP作为辅酶的经修饰的GAPDH,并且其中所述棒状杆菌属菌株的L-赖氨酸生产率得到提高。

加宽GAPDH的辅酶特异性的方法

54.一种加宽GAPDH的辅酶特异性的方法,其包含:对所述GAPDH进行修饰,使得经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性。

55.条款54的方法,其中相对于NAD,所述经修饰的GAPDH具有增加的针对辅酶NADP的特异性。

56.条款54或55的方法,其中相比于NAD,所述经修饰的GAPDH更有效地使用NADP。

提高产生使用NADPH产生的化合物的效率的方法

57.一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性。

58.条款57的方法,其中所述化合物选自表2。

59.条款57或58的方法,其中相比于NADPH,所述变异酶更有效地使用NADH。

60.条款57至59中任一条款的方法,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与SEQ ID NO:42或44的氨基酸序列共享至少70%序列同一性的氨基酸序列。

61.条款57至60中任一条款的方法,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与SEQ ID NO:30或40的氨基酸序列共享至少70%序列同一性的氨基酸序列。

62.条款57至61中任一条款的方法,其中所述方法包含表达dapB的变异酶,其中所述变异酶包含与SEQ ID NO:46或48的氨基酸序列共享至少70%序列同一性的氨基酸序列。

63.条款57至62中任一条款的方法,其中所述方法包含表达ddh的变异酶,其中所述变异酶包含与SEQ ID NO:4的氨基酸序列共享至少70%序列同一性的氨基酸序列。

64.条款57至63中任一条款的方法,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

65.条款57至63中任一条款的方法,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

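66. The method of any one of clauses 57 to 63, wherein the method comprises expressing a variant enzyme of gdh, a variant enzyme of asd, a variant enzyme of dapB, and a variant enzyme of ddh.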

67.条款57至66中任一条款的方法,其中所述化合物选自表2。

68. The method of clause 67, wherein the compound is lysine.

69. The method of clause 67, wherein the compound is threonine.

70.条款57至69中任一条款的方法,其中所述宿主细胞是原核细胞。

71.条款70的方法,其中所述宿主细胞来自选自由以下组成的群组的属:土壤杆菌属、脂环杆菌属、念珠藻属、倒囊藻属、不动杆菌属、酸热菌属、节杆菌属、固氮菌属、芽孢杆菌属、双歧杆菌属、短杆菌属、丁酸弧菌属、布赫纳氏菌属、平原菟丝子、弯曲杆菌属、梭菌属、棒状杆菌属、红色硫黃细菌属、粪球菌属、埃希氏杆菌属、肠球菌属、肠杆菌属、欧文菌属、梭杆菌属、粪栖杆菌属、弗朗西斯氏菌属、黄杆菌属、土芽孢杆菌属、嗜血杆菌属、螺旋杆菌属、克雷伯氏菌属、乳杆菌属、乳球菌属、泥杆菌属、微球菌属、微杆菌属、中间根瘤菌属、甲基杆菌属、甲基杆菌属、分枝杆菌属、奈瑟菌属、泛菌属、假单胞菌属、原绿球藻属、红细菌属、红假单胞菌属、红假单胞菌属、罗斯氏菌属、红螺菌属、红球菌属、栅列藻属、链霉菌属、链球菌属、聚球藻属、糖单孢菌属、葡萄球菌属、沙雷氏菌属、沙门氏菌属、志贺杆菌属、嗜热厌氧杆菌属、养障体、土拉热、蒂梅丘拉、嗜热聚球藻属、热球菌属、脲原体属、黄单胞菌属、木杆菌属、耶尔森氏菌属和发酵单胞菌属。

72.条款71的方法,其中所述宿主细胞是谷氨酸棒状杆菌。

73.条款71的方法,其中所述宿主细胞是大肠杆菌。

74.条款57至69中任一条款的方法,其中所述宿主细胞是真核细胞。

75.条款74的方法,其中所述宿主细胞来自选自由以下组成的群组的属:棉霉属、枝顶孢属、曲霉属、短梗霉属、烟管霉属、拟蜡菌属、头孢霉属、金孢霉属、旋孢腔菌属、棒囊壳属、隐丛赤壳属、隐球菌属、鬼伞属、革盖菌属、色二孢属、内斯菌属、镰孢菌属、赤霉属、胶霉属、腐殖菌属、肉座菌属、毁丝霉属、白霉菌属、脉孢菌属、青霉属、柄孢壳属、射脉菌属、瘤胃壶菌属、梨胞霉属、根毛霉属、根霉菌属、裂殖菌属、革节孢属、孢子丝菌属、踝节菌属、嗜热子囊菌属、梭孢壳霉属、栓菌属、弯颈霉菌属、木霉属、轮枝孢属和小包脚菇属。

包含gdh、asd、dapB或ddh的变体的宿主细胞

76.一种宿主细胞,其包含:一或多种酶gdh、asd、dapB和ddh的变体,其中所述变体展现针对辅酶NADH和NADPH的双特异性。

使用新颖烟酰胺核苷酸转氢酶的方法

77.一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含在所述宿主细胞中表达新颖烟酰胺核苷酸转氢酶。

通过策略提高L-赖氨酸产生效率的方法

78.一种提高宿主细胞产生L-赖氨酸的效率的方法,其包含以下中的两个或更多个:

(1)对内源性GAPDH进行修饰,使得经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性;(2)在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性;以及(3)在所述宿主细胞中表达新颖烟酰胺核苷酸转氢酶。

使用gdh和/或asd的方法

79.一种提高宿主细胞产生使用NADPH产生的化合物的效率的方法,其包含:在所述宿主细胞中表达酶谷氨酸脱氢酶(gdh)和天冬氨酸半醛脱氢酶(asd)中的一或两种酶的变异酶,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性。

80.条款79的方法,其中相比于NADPH,所述变异酶更有效地使用NADH。

81.条款79或80的方法,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

82.条款81的方法,其中所述gdh的变异酶包含选自由SEQ ID NO:144、150、162、166、170、174和178组成的群组的氨基酸序列。

83.条款79至82中任一条款的方法,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

84.条款83的方法,其中所述asd的变异酶包含选自由SEQ ID NO:108和118组成的群组的氨基酸序列。

提高用苏氨酸醛缩酶产生L-苏氨酸的效率的方法

85.一种提高宿主细胞产生L-苏氨酸的效率的方法,其包含:在所述宿主细胞中表达苏氨酸醛缩酶的变异酶,其中所述变异酶展现与大肠杆菌苏氨酸醛缩酶(ltaE)不同的底物偏好或酶动力学。

86.条款85的方法,其中所述变异酶促进苏氨酸产生超过甘氨酸产生。

87.条款85或86的方法,其中所述方法包含表达苏氨酸醛缩酶的变异酶,其中所述变异酶包含与选自由SEQ ID NO:184、186、188、190、192、194、196、198、200、202、204、206、208、210、212、214、216、218、220、222、224、226、228、230和232组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

88.条款87的方法,其中所述变异酶包含选自由SEQ ID NO:196、206、220、224和232组成的群组的氨基酸序列。

89.条款85至88中任一条款的方法,其中所述宿主细胞是原核细胞。

90.条款89的方法,其中所述宿主细胞来自选自由以下组成的群组的属:土壤杆菌属、脂环杆菌属、念珠藻属、倒囊藻属、不动杆菌属、酸热菌属、节杆菌属、固氮菌属、芽孢杆菌属、双歧杆菌属、短杆菌属、丁酸弧菌属、布赫纳氏菌属、平原菟丝子、弯曲杆菌属、梭菌属、棒状杆菌属、红色硫黃细菌属、粪球菌属、埃希氏杆菌属、肠球菌属、肠杆菌属、欧文菌属、梭杆菌属、粪栖杆菌属、弗朗西斯氏菌属、黄杆菌属、土芽孢杆菌属、嗜血杆菌属、螺旋杆菌属、克雷伯氏菌属、乳杆菌属、乳球菌属、泥杆菌属、微球菌属、微杆菌属、中间根瘤菌属、甲基杆菌属、甲基杆菌属、分枝杆菌属、奈瑟菌属、泛菌属、假单胞菌属、原绿球藻属、红细菌属、红假单胞菌属、红假单胞菌属、罗斯氏菌属、红螺菌属、红球菌属、栅列藻属、链霉菌属、链球菌属、聚球藻属、糖单孢菌属、葡萄球菌属、沙雷氏菌属、沙门氏菌属、志贺杆菌属、嗜热厌氧杆菌属、养障体、土拉热、蒂梅丘拉、嗜热聚球藻属、热球菌属、脲原体属、黄单胞菌属、木杆菌属、耶尔森氏菌属和发酵单胞菌属。

91.条款90的方法,其中所述宿主细胞是谷氨酸棒状杆菌。

92.条款90的方法,其中所述宿主细胞是大肠杆菌。

93.条款85至88中任一条款的方法,其中所述宿主细胞是真核细胞。

94.条款93的方法,其中所述宿主细胞来自选自由以下组成的群组的属:棉霉属、枝顶孢属、曲霉属、短梗霉属、烟管霉属、拟蜡菌属、头孢霉属、金孢霉属、旋孢腔菌属、棒囊壳属、隐丛赤壳属、隐球菌属、鬼伞属、革盖菌属、色二孢属、内斯菌属、镰孢菌属、赤霉属、胶霉属、腐殖菌属、肉座菌属、毁丝霉属、白霉菌属、脉孢菌属、青霉属、柄孢壳属、射脉菌属、瘤胃壶菌属、梨胞霉属、根毛霉属、根霉菌属、裂殖菌属、革节孢属、孢子丝菌属、踝节菌属、嗜热子囊菌属、梭孢壳霉属、栓菌属、弯颈霉菌属、木霉属、轮枝孢属和小包脚菇属。

通过变异酶提高L-苏氨酸产生效率的方法

95.一种增加宿主细胞的L-苏氨酸产生的方法,其包含:在所述宿主细胞中表达酶甘油醛3-磷酸脱氢酶(gapA)、谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、苏氨酸醛缩酶(ltaE)和丙酮酸羧化酶(pyc)中的一或多种酶的变异酶。

96.条款95的方法,其中gdh的变异酶或asd的变异酶展现针对辅酶NADH和NADPH的双特异性。

97.条款95的方法,其中相比于NADPH,gapA的变异酶、gdh的变异酶或asd的变异酶更有效地使用NADH。

98.条款95至97中任一条款的方法,其中苏氨酸醛缩酶的变异酶促进苏氨酸产生超过甘氨酸产生。

99.条款95至98中任一条款的方法,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

100.条款99的方法,其中所述gdh的变异酶包含选自由SEQ ID NO:144、150、162、166、170、174和178组成的群组的氨基酸序列。

101.条款95至100中任一条款的方法,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

102.条款101的方法,其中所述asd的变异酶包含选自由SEQ ID NO:108和118组成的群组的氨基酸序列。

103.条款95至102中任一条款的方法,其中所述方法包含表达苏氨酸醛缩酶的变异酶,其中所述变异酶包含与选自由SEQ ID NO:184、186、188、190、192、194、196、198、200、202、204、206、208、210、212、214、216、218、220、222、224、226、228、230和232组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

104.条款103的方法,其中所述苏氨酸醛缩酶的变异酶包含选自由SEQ ID NO:196、206、220、224和232组成的群组的氨基酸序列。

105.条款95至104中任一条款的方法,其中所述方法包含表达gapA的变异酶,其中所述gapA的变异酶包含选自由SEQ ID NO:69、71、73、303、294、296、233、234、235、236、298和300组成的群组的氨基酸序列。

106.条款95至105中任一条款的方法,其中所述方法包含表达gapA的变异酶,其中gapA的变异酶包含与SEQ ID NO:58的氨基酸序列共享至少70%序列同一性的氨基酸序列。

107.条款95至105中任一条款的方法,其中所述方法包含表达gapA的变异酶,其中gapA的变异酶包含与选自由SEQ ID NO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列共享至少70%序列同一性的氨基酸序列。

108.条款106或107的方法,其中gapA的变异酶在与SEQ ID NO:58的氨基酸37相对应的位置处包含氨基酸置换。

109.条款106或107的方法,其中gapA的变异酶在与SEQ ID NO:58的氨基酸36和37相对应的位置处包含氨基酸置换。

110.条款108或109的方法,其中gapA的变异酶的与SEQ ID NO:58的氨基酸37相对应的位置处的残基是赖氨酸。

111.条款109的方法,其中gapA的变异酶的与SEQ ID NO:58的氨基酸36相对应的位置处的残基是苏氨酸,并且gapA的变异酶的与SEQ ID NO:58的氨基酸37相对应的位置处的残基是赖氨酸。

112.条款106至111中任一条款的方法,其中gapA的变异酶在与SEQ ID NO:58的氨基酸192相对应的位置处包含氨基酸置换。

113.条款112的方法,其中gapA的变异酶的与SEQ ID NO:58的氨基酸192相对应的位置处的残基是丝氨酸。

114.条款106至113中任一条款的方法,其中gapA的变异酶的与SEQ ID NO:58的氨基酸224相对应的位置处的残基是丝氨酸。

115.条款106至114中任一条款的方法,其中gapA的变异酶的与SEQ ID NO:58的氨基酸110相对应的位置处的残基是天冬氨酸。

116.条款106至115中任一条款的方法,其中gapA的变异酶的与SEQ ID NO:58的氨基酸140相对应的位置处的残基是甘氨酸。

117.条款95至105中任一条款的方法,其中所述方法包含表达gapA的变异酶,其中gapA的变异酶包含与选自由SEQ ID NO:69、71、73、303、294、296、233、234、235、236、298和300组成的群组的氨基酸序列相同的氨基酸序列。

苏氨酸基础菌株

118.一种宿主细胞,其包含多拷贝复制质粒,所述多拷贝复制质粒包含各自可操作地连接到一或多个合成启动子的thrA基因、thrB基因和thrC基因。

119.条款118的宿主细胞,其中所述宿主细胞是tdh缺失(Δtdh)细胞。

120.条款118或119的宿主细胞,其中所述多拷贝复制质粒包含与SEQ ID NO:77的thrABC操纵子序列至少70%相同的序列。

通过包括苏氨酸醛缩酶和丙酮酸羧化酶的策略提高化合物产生效率的方法

121.一种提高宿主细胞产生化合物的效率的方法,其包含以下中的两个或更多个:

(1)通过加宽内源性糖酵解酶甘油醛-3-磷酸脱氢酶(gapA)的辅酶特异性,使所述酶具有针对NADP和NAD的双特异性,将产生NADPH的糖酵解途径工程化;(2)在宿主细胞中表达由NADH产生NADPH的转氢酶;(3)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh、asd、dapB和ddh酶的同源物,将用于赖氨酸合成的DAP-途径重编程;(4)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh和asd酶的同源物,将用于苏氨酸合成的thrABC-途径重编程;(5)通过表达减少或逆转苏氨酸降解成甘氨酸的内源性L-苏氨酸醛缩酶(ItA)的同源物,将苏氨酸合成重编程;以及(6)表达异源丙酮酸羧化酶(pyc)或其同源物以增加草酰乙酸的合成,或增加内源性pyc的表达。

122.条款121的方法,其中所述通过加宽内源性糖酵解酶甘油醛-3-磷酸脱氢酶(gapA)的辅酶特异性,使所述酶具有针对NADP和NAD的双特异性,将产生NADPH的糖酵解途径工程化包含表达gapA的变异酶,所述变异酶包含选自由SEQ ID NO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列。

123.条款121或122的方法,其中所述通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh、asd、dapB和ddh酶的同源物,将用于赖氨酸合成的DAP-途径重编程包含以下中的一或多个:

i)表达gdh的变异酶,所述变异酶包含选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列;

ii)表达asd的变异酶,所述变异酶包含选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列;

iii)表达dapB的变异酶,所述变异酶包含选自由SEQ ID NO:46和48组成的群组的氨基酸序列;以及

iv)表达ddh的变异酶,所述变异酶包含选自由SEQ ID NO:2、4、6、8、10、12、14、16、18和20组成的群组的氨基酸序列。

124.条款121至123中任一条款的方法,其中所述通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh和asd酶的同源物,将用于苏氨酸合成的thrABC-途径重编程包含以下中的一或多个:

i)表达gdh的变异酶,所述变异酶包含选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列;以及

ii)表达asd的变异酶,所述变异酶包含选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列。

125.条款121至124中任一条款的方法,其中所述通过表达减少或逆转苏氨酸降解成甘氨酸的内源性L-苏氨酸醛缩酶(ItA)的同源物,将苏氨酸合成重编程包含表达ltA的变异酶,所述变异酶包含选自由SEQ ID NO:184、186、188、190、192、194、196、198、200、202、204、206、208、210、212、214、216、218、220、222、224、226、228、230和232组成的群组的氨基酸序列。

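126. The method of any one of clauses 121 to 125, wherein said expressing a heterologous pyruvate carboxylase (pyc) or a homolog thereof to increase the synthesis of oxaloacetate, or increasing the expression of endogenous pyc, comprises expressing a variant enzyme of pyc comprising an amino acid sequence selected from the group consisting of SEQ ID NO:241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, and 289.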

新gapA变体-多核苷酸

127.一种人工多核苷酸,其编码截短的甘油醛-3-磷酸脱氢酶(gapA)基因,其中所述多核苷酸包含与选自由SEQ ID NO:290、291、292和293组成的群组的多核苷酸序列至少85%、90%、95%或99%相同的序列。

128.条款127的人工多核苷酸,其中所述多核苷酸包含选自由SEQ ID NO:290、291、292和293组成的群组的多核苷酸序列。

129.一种载体,其包含可操作地连接到启动子的条款127或128的人工多核苷酸。

新gapA变体-蛋白质

130.一种甘油醛-3-磷酸脱氢酶(gapA)的重组蛋白片段,其中所述重组蛋白片段包含与选自由SEQ ID NO:233、234、235、236和298组成的群组的氨基酸序列至少70%、80%、90%或95%相同的序列。

131.条款130的重组蛋白片段,其中所述重组蛋白片段包含选自由SEQ ID NO:233、234、235、236和298组成的群组的氨基酸序列。

132.条款130或131的重组蛋白片段,其中所述重组蛋白片段缺乏gapA活性。

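133. The recombinant protein fragment of any one of clauses 130 to 132, wherein the recombinant protein fragment enhances the productivity of a host cell for a compound selected from Table 2 when the host cell comprises another protein having gapA activity.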

其它实施例

134.一种提高微生物细胞产生使用NADPH产生的化合物的能力的方法,所述方法包含改变所述细胞的可利用的NADPH。

135.技术方案134的方法,其中所述可利用的NADPH通过在所述细胞中表达经修饰的甘油醛-3-磷酸脱氢酶(GAPDH)来改变,其中所述经修饰的GAPDH经过修饰,使得其辅酶特异性加宽。

136.技术方案134的方法,其中通过在所述微生物细胞中表达酶谷氨酸脱氢酶(gdh)、天冬氨酸半醛脱氢酶(asd)、二氢吡啶甲酸还原酶(dapB)和内消旋-二氨基庚二酸脱氢酶(ddh)中的一或多种酶的变异酶来改变所述细胞的可利用的NADPH,其中所述变异酶展现针对辅酶NADH和NADPH的双特异性。

137.技术方案135的方法,其中所述经修饰的GAPDH相对于对应的天然存在的GAPDH具有增加的针对辅酶NADP的特异性。

138.技术方案134至137中任一技术方案的方法,其中所述微生物细胞是细菌细胞。

139.技术方案138的方法,其中所述细菌细胞来自选自由棒状杆菌属、埃希氏杆菌属、芽孢杆菌属或土芽孢杆菌属组成的群组的细菌。

140.技术方案138的方法,其中所述细菌是谷氨酸棒状杆菌或大肠杆菌。

141.技术方案134至137中任一技术方案的方法,其中所述微生物细胞是酵母细胞。

142.技术方案141的方法,其中所述酵母细胞是来自酵母菌属的细胞。

143.技术方案137的方法,其中所述天然存在的GAPDH是gapA。

144.技术方案143的方法,其中所述gapA具有SEQ ID NO:58的氨基酸序列。

145.技术方案134的方法,其中所述经修饰的GAPDH包含与SEQ ID NO:58的氨基酸序列至少70%相同的氨基酸序列。

146.技术方案134的方法,其中所述经修饰的GAPDH包含与选自由SEQ ID NO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列至少70%相同的氨基酸序列。

147.技术方案145或146的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸37相对应的位置处包含氨基酸置换。

148.技术方案147的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸36和37相对应的位置处包含氨基酸置换。

149.技术方案147的方法,其中在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸已经被赖氨酸置换。

150.技术方案148的方法,其中在与SEQ ID NO:58的氨基酸36相对应的位置处的亮氨酸已经被苏氨酸置换,并且在与SEQ ID NO:58的氨基酸37相对应的位置处的苏氨酸已经被赖氨酸置换。

151.技术方案135的方法,其中所述经修饰的GAPDH在与SEQ ID NO:58的氨基酸192相对应的位置处包含氨基酸置换。

152. The method of claim 135, wherein the proline at the position corresponding to amino acid 192 of SEQ ID NO:58 has been replaced by serine.

153.技术方案135的方法,其中在与SEQ ID NO:58的氨基酸224相对应的位置处的亮氨酸已经被丝氨酸置换。

154.技术方案135的方法,其中在与SEQ ID NO:58的氨基酸110相对应的位置处的组氨酸已经被天冬氨酸置换。

155.技术方案135的方法,其中在与SEQ ID NO:58的氨基酸140相对应的位置处的酪氨酸已经被甘氨酸置换。

156.技术方案146的方法,其中所述经修饰的GAPDH选自由SEQ ID NO:69、71、73、303、294、296、233、234、235、236、298和300组成的群组。

157.技术方案134至137中任一技术方案的方法,其中所述化合物选自表2。

158.技术方案157的方法,其中所述化合物是L-赖氨酸或L-苏氨酸。

159.一种微生物细胞,其包含相对于天然存在的GAPDH,具有加宽的辅酶特异性的经修饰的GAPDH,其中相对于缺乏所述经修饰的GAPDH的对应微生物细胞,所述微生物细胞提高使用NADPH产生的化合物的产生。

160.技术方案159的微生物细胞,其中所述经修饰的GAPDH相对于所述天然存在的GAPDH具有增加的针对NADP的特异性。

161.技术方案160的微生物细胞,其中所述经修饰的GAPDH包含与SEQ ID NO:58至少70%相同的氨基酸序列。

162.技术方案160的微生物细胞,其中所述经修饰的GAPDH包含与选自由SEQ IDNO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列至少70%相同的氨基酸序列。

163.技术方案160的微生物细胞,其中所述经修饰的GAPDH包含与SEQ ID NO:58至少70%相同的氨基酸序列并且其中所述经修饰的GAPDH包含在SEQ ID NO:58的位置36、37或两个位置处的氨基酸的取代。

164.技术方案160的微生物细胞,其中所述经修饰的GAPDH选自由SEQ ID NO:69、71、73、303、294、296、233、234、235、236、298和300组成的群组。

165.技术方案159的微生物细胞,其中所述化合物选自表2。

166.技术方案165的微生物细胞,其中所述化合物是L-赖氨酸或L-苏氨酸。

167.技术方案159的微生物细胞,其中所述微生物细胞来自细菌。

168.技术方案167的微生物细胞,其中所述细菌是棒状杆菌属、埃希氏杆菌属、芽孢杆菌属或土芽孢杆菌属。

169.技术方案168的微生物细胞,其中所述细菌是谷氨酸棒状杆菌或大肠杆菌。

170.技术方案165的微生物细胞,其中所述微生物细胞是酵母细胞。

171.一种加宽GAPDH的辅酶特异性的方法,其包含:对所述GAPDH进行修饰,使得经修饰的GAPDH具有针对辅酶NADP和NAD的双特异性。

172.技术方案171的方法,其中相对于NAD,所述经修饰的GAPDH具有增加的针对辅酶NADP的特异性。

173.技术方案172的方法,其中相比于NAD,所述经修饰的GAPDH更有效地使用NADP。

174.技术方案136的方法,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与SEQ ID NO:42或44的氨基酸序列至少70%相同的氨基酸序列。

175.技术方案136的方法,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与SEQ ID NO:30或40的氨基酸序列至少70%相同的氨基酸序列。

176.技术方案136的方法,其中所述方法包含表达dapB的变异酶,其中所述变异酶包含与SEQ ID NO:46或48的氨基酸序列至少70%相同的氨基酸序列。

177.技术方案136的方法,其中所述方法包含表达ddh的变异酶,其中所述ddh酶包含SEQ ID NO:4的氨基酸序列。

178.技术方案136的方法,其中所述方法包含表达gdh的变异酶,其中所述变异酶包含与选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列至少70%相同的氨基酸序列。

179.技术方案136的方法,其中所述方法包含表达asd的变异酶,其中所述变异酶包含与选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列至少70%相同的氨基酸序列。

180.技术方案136的方法,其中所有四种酶的变体同时在所述微生物细胞中表达。

181.一种微生物细胞,其包含:一或多种酶gdh、asd、dapB和ddh的变体,其中所述变体展现针对辅酶NADH和NADPH的双特异性。

182.技术方案178的方法,其中所述gdh的变异酶包含选自由SEQ ID NO:144、150、162、166、170、174、178组成的群组的氨基酸序列。

183.技术方案179的方法,其中所述asd的变异酶包含选自由SEQ ID NO:108和118组成的群组的氨基酸序列。

184.技术方案134至137中任一技术方案的方法,其进一步包含:在所述微生物细胞中表达苏氨酸醛缩酶的变异酶,其中苏氨酸醛缩酶的变异酶展现与大肠杆菌苏氨酸醛缩酶(ltaE)不同的底物偏好或酶动力学。

185.技术方案184的方法,其中变异苏氨酸醛缩酶促进苏氨酸产生超过甘氨酸产生。

186.技术方案184的方法,其中所述变异苏氨酸醛缩酶包含与选自由SEQ ID NO:184、186、188、190、192、194、196、198、200、202、204、206、208、210、212、214、216、218、220、222、224、226、228、230和232组成的群组的氨基酸序列至少70%相同的氨基酸序列。

187.技术方案186的方法,其中所述变异苏氨酸醛缩酶包含选自由SEQ IDNO:196、206、220、224和232组成的群组的氨基酸序列。

188.技术方案184的方法,其中所述化合物是L-苏氨酸。

189.技术方案140的方法,其中所述细菌是大肠杆菌并且所述方法进一步包含在所述大肠杆菌细胞中表达pyc。

190.技术方案189的方法,其中所述方法包含表达pyc的变异酶,其中所述pyc的变异酶包含与选自由SEQ ID NO:241、243、245、247、249、251、253、255、257、259、261、263、265、267、269、271、273、275、277、279、281、283、285、287和289组成的群组的氨基酸序列至少70%相同的氨基酸序列。

191.一种微生物细胞,其包含多拷贝复制质粒,所述多拷贝复制质粒包含各自可操作地连接到一或多个合成启动子的thrA基因、thrB基因和thrC基因。

192.技术方案191的微生物细胞,其中所述微生物细胞是tdh缺失(Δtdh)细胞。

193.技术方案191的微生物细胞,其中所述多拷贝复制质粒包含与SEQ ID NO:77的thrABC操纵子序列至少70%相同的序列。

194.一种提高微生物细胞产生化合物的效率的方法,其包含以下中的两个或更多个:

(1)通过加宽内源性糖酵解酶甘油醛-3-磷酸脱氢酶(gapA)的辅酶特异性,使所述酶具有针对NADP和NAD的双特异性,将产生NADPH的糖酵解途径工程化;(2)在细菌中表达由NADH产生NADPH的转氢酶;(3)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh、asd、dapB和ddh酶的同源物,将用于赖氨酸合成的DAP-途径重编程;(4)通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh和asd酶的同源物,将用于苏氨酸合成的thrABC-途径重编程;(5)通过表达减少或逆转苏氨酸降解成甘氨酸的内源性L-苏氨酸醛缩酶(ItA)的同源物,将苏氨酸合成重编程;以及(6)表达异源丙酮酸羧化酶(pyc)或其同源物以增加草酰乙酸的合成,或增加内源性pyc的表达。

195.技术方案194的方法,其中所述化合物选自表2。

196.技术方案195的方法,其中所述化合物是L-苏氨酸。

197.技术方案194至196中任一技术方案的方法,其中所述通过加宽内源性糖酵解酶甘油醛-3-磷酸脱氢酶(gapA)的辅酶特异性,使所述酶具有针对NADP和NAD的双特异性,将产生NADPH的糖酵解途径工程化包含表达gapA的变异酶,所述变异酶包含选自由SEQ IDNO:294、296、233、234、235、236、298和300组成的群组的氨基酸序列。

198.技术方案194至196中任一技术方案的方法,其中所述通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh、asd、dapB和ddh酶的同源物,将用于赖氨酸合成的DAP-途径重编程包含以下中的一或多个:

i)表达gdh的变异酶,所述变异酶包含选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列;

ii)表达asd的变异酶,所述变异酶包含选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列;

iii)表达dapB的变异酶,所述变异酶包含选自由SEQ ID NO:46和48组成的群组的氨基酸序列;以及

iv)表达ddh的变异酶,所述变异酶包含选自由SEQ ID NO:2、4、6、8、10、12、14、16、18和20组成的群组的氨基酸序列。

199.技术方案194至196中任一技术方案的方法,其中所述通过表达相比于NADPH更有效地使用NADH作为辅因子的内源性gdh和asd酶的同源物,将用于苏氨酸合成的thrABC-途径重编程包含以下中的一或多个:

i)表达gdh的变异酶,所述变异酶包含选自由SEQ ID NO:132、134、136、138、140、142、144、146、148、150、152、154、156、158、160、162、164、166、168、170、172、174、176、178、180和182组成的群组的氨基酸序列;以及

ii)表达asd的变异酶,所述变异酶包含选自由SEQ ID NO:80、82、84、86、88、90、92、94、96、98、100、102、104、106、108、110、112、114、116、118、120、122、124、126、128和130组成的群组的氨基酸序列。

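200. The method of any one of embodiments 194 to 196, wherein said reprogramming threonine synthesis by expressing a homolog of the endogenous L-threonine aldolase (ltA) that decreases or reverses the degradation of threonine to glycine comprises expressing a variant enzyme of ltA, the variant enzyme comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, and 232.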

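201. The method of any one of embodiments 194 to 196, wherein said expressing a heterologous pyruvate carboxylase (pyc) or a homolog thereof to increase the synthesis of oxaloacetate, or increasing the expression of the endogenous pyc, comprises expressing a variant enzyme of pyc, the variant enzyme comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, and 289.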

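202. An artificial polynucleotide encoding a truncated glyceraldehyde-3-phosphate dehydrogenase (gapA) gene, wherein the polynucleotide comprises a sequence at least 70% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 290, 291, 292, and 293.

203. The artificial polynucleotide of embodiment 202, wherein the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO: 290, 291, 292, and 293.

204. A vector comprising the artificial polynucleotide of embodiment 202 or 203 operably linked to a promoter.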

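205. A recombinant protein fragment of glyceraldehyde-3-phosphate dehydrogenase (gapA), wherein the recombinant protein fragment comprises a sequence at least 70% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 233, 234, 235, 236, and 298.

206. The recombinant protein fragment of embodiment 205, wherein the recombinant protein fragment comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 233, 234, 235, 236, and 298.

207. The recombinant protein fragment of embodiment 205 or 206, wherein the recombinant protein fragment lacks gapA activity.

208. The recombinant protein fragment of embodiment 207, wherein the recombinant protein fragment enhances the productivity of a microbial cell for a compound selected from Table 2 when the microbial cell comprises another protein having gapA activity.

209. The microbial cell of any one of embodiments 159 to 169, wherein the microbial cell further comprises a variant threonine aldolase, a pyc protein, or both.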
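Several of the embodiments above recite sequences "at least 70% identical" to a listed SEQ ID NO. As a minimal sketch of how such a threshold could be checked, the following assumes that the two sequences have already been aligned to equal length (with `-` marking gaps) and counts identity over all aligned columns. This is only one of several conventions in use; the function names, the toy peptides, and the choice of denominator are illustrative assumptions, not something specified in this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings are pre-aligned, equal in length, and use '-'
    for gaps. Columns in which both strings carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float = 70.0) -> bool:
    """True if the alignment reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= minimum


# Hypothetical toy peptides, not taken from the sequence listing.
reference = "ACDEFGHIKL"
candidate = "ACDQFG-IKL"  # one substitution and one gap: 8 of 10 columns match
print(percent_identity(reference, candidate))  # 80.0
print(meets_threshold(reference, candidate))   # True
```

In practice the alignment itself would come from a dedicated alignment tool, and the reported identity depends on that tool's scoring parameters and on whether gapped columns are counted in the denominator.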

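Incorporation by Reference

All references, articles, publications, patents, patent publications, and patent applications cited herein are incorporated by reference in their entireties for all purposes.

However, mention of any reference, article, publication, patent, patent publication, or patent application cited herein is not, and should not be taken as, an acknowledgment or any form of suggestion that it constitutes valid prior art or forms part of the common general knowledge in any country in the world. In particular, the following applications are incorporated herein by reference in their entireties: U.S. Application No. 15/396,230, filed December 30, 2016; International Application No. PCT/US2016/065465, filed December 7, 2016; U.S. Application No. 15/140,296, filed April 27, 2016; U.S. Provisional Application No. 62/368,786, filed July 29, 2016; and U.S. Provisional Application No. 62/264,232, filed December 7, 2015.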

Sequence Listing

<110> Zymergen Inc.

Manchester, Shawn

<120> Genome engineering of NADPH-increasing biosynthetic pathways

<130> ZYMR-011/01WO 327574-2057

<150> US 62/508,589

<151> 2017-05-19

<160> 308

<170> PatentIn version 3.5

<210> 1

<211> 969

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Actinomyces oris

<400> 1

atgattcgcg ttgcgatcaa tggatatggc aacctgggac ggggtgtcga acaagcgatt 60

acgaagaacg cggacatgga agtcgcggtc gtgtttacgc gccgcgaccc agctacggtg 120

actacccagg gcgcccccgt cgcccatgtt gatgacatgg ccgcttgggc cgataaagtg 180

gatgtctgtc ttaactgcgg cggatcagcg accgacttga ttgaacaaac gcccgctgcg 240

gcagctcttt tcaacaccgt agattcgttc gatacgcatg cccggattcc tgagcatttc 300

gccgcggtgg acgccgcagc gaaggcatca ggccatgtgg cgttgatttc agcgggctgg 360

gacccaggac ttttttccat gctccgggtc ctcggcgaag cagtcctccc agacggtgct 420

accacgacct tctggggccc cggagtttcg cagggtcatt cagacgctct gcgtcgcatc 480

gacggtgtgg tagatgcgaa acaatacact cggccagtcg aggcaacggt ggctgccgtc 540

aaggcaggag atgatgttga gctcactacg cgctcaatgc acactcgtga ctgctatgta 600

gttgcggagg aaggcgcaga tcttgcccgg atcgagcggg agatcgttga gatgcccaat 660

tacttcgctg attacgatac taccgttact ttcattactg ccgaggaact tgccgcggag 720

cacgcgggta ttccgcatgg aggatcggta attcggcgtg gccataccag cgaaggagtg 780

gccgaaaccg tgtcgtttga gctgcaattg ggctctaacc ccgaatttac gggatcagtc 840

ctggttgcta cggcgcgtgc tgtcgcacgg cttgctgccc ggggcgaaac tggtgcccgg 900

acggtttttg acgttactct tgccgacttg tctccgacta gccccgagga gctccgtgct 960

cactacctg 969

<210> 2

<211> 323

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Actinomyces oris

<400> 2

Met Ile Arg Val Ala Ile Asn Gly Tyr Gly Asn Leu Gly Arg Gly Val

1 5 10 15

Glu Gln Ala Ile Thr Lys Asn Ala Asp Met Glu Val Ala Val Val Phe

20 25 30

Thr Arg Arg Asp Pro Ala Thr Val Thr Thr Gln Gly Ala Pro Val Ala

35 40 45

His Val Asp Asp Met Ala Ala Trp Ala Asp Lys Val Asp Val Cys Leu

50 55 60

Asn Cys Gly Gly Ser Ala Thr Asp Leu Ile Glu Gln Thr Pro Ala Ala

65 70 75 80

Ala Ala Leu Phe Asn Thr Val Asp Ser Phe Asp Thr His Ala Arg Ile

85 90 95

Pro Glu His Phe Ala Ala Val Asp Ala Ala Ala Lys Ala Ser Gly His

100 105 110

Val Ala Leu Ile Ser Ala Gly Trp Asp Pro Gly Leu Phe Ser Met Leu

115 120 125

Arg Val Leu Gly Glu Ala Val Leu Pro Asp Gly Ala Thr Thr Thr Phe

130 135 140

Trp Gly Pro Gly Val Ser Gln Gly His Ser Asp Ala Leu Arg Arg Ile

145 150 155 160

Asp Gly Val Val Asp Ala Lys Gln Tyr Thr Arg Pro Val Glu Ala Thr

165 170 175

Val Ala Ala Val Lys Ala Gly Asp Asp Val Glu Leu Thr Thr Arg Ser

180 185 190

Met His Thr Arg Asp Cys Tyr Val Val Ala Glu Glu Gly Ala Asp Leu

195 200 205

Ala Arg Ile Glu Arg Glu Ile Val Glu Met Pro Asn Tyr Phe Ala Asp

210 215 220

Tyr Asp Thr Thr Val Thr Phe Ile Thr Ala Glu Glu Leu Ala Ala Glu

225 230 235 240

His Ala Gly Ile Pro His Gly Gly Ser Val Ile Arg Arg Gly His Thr

245 250 255

Ser Glu Gly Val Ala Glu Thr Val Ser Phe Glu Leu Gln Leu Gly Ser

260 265 270

Asn Pro Glu Phe Thr Gly Ser Val Leu Val Ala Thr Ala Arg Ala Val

275 280 285

Ala Arg Leu Ala Ala Arg Gly Glu Thr Gly Ala Arg Thr Val Phe Asp

290 295 300

Val Thr Leu Ala Asp Leu Ser Pro Thr Ser Pro Glu Glu Leu Arg Ala

305 310 315 320

His Tyr Leu
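The entries in this listing come in pairs: a DNA entry followed by the protein it encodes, so the two <211> lengths should differ by a factor of three (969 bases and 323 residues for SEQ ID NOs: 1 and 2). Each nucleotide line holds blocks of ten bases followed by a running base count. As a minimal sketch, assuming the standard genetic code and that every nucleotide line ends with its cumulative count, the following strips that layout from the first line of SEQ ID NO: 1 and translates it for comparison with the first 20 residues of SEQ ID NO: 2. The helper names `parse_sequence_lines` and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_sequence_lines(lines, start_count=0):
    """Join nucleotide lines and verify the running count at the end of each.

    Assumption: each non-empty line is whitespace-separated blocks of bases
    followed by the cumulative number of bases up to the end of that line.
    """
    bases = []
    total = start_count
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank separator lines
        *blocks, declared = fields
        chunk = "".join(blocks)
        total += len(chunk)
        if total != int(declared):
            raise ValueError(f"expected {declared} bases, counted {total}")
        bases.append(chunk)
    return "".join(bases)


def translate(dna: str) -> str:
    """Translate a DNA string into one-letter amino acid codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First nucleotide line of SEQ ID NO: 1, copied from the listing above.
first_line_seq1 = [
    "atgattcgcg ttgcgatcaa tggatatggc aacctgggac ggggtgtcga acaagcgatt 60",
]
dna = parse_sequence_lines(first_line_seq1)
print(translate(dna))  # MIRVAINGYGNLGRGVEQAI
print(969 // 3)        # 323, the <211> length of SEQ ID NO: 2
```

The printed peptide corresponds to Met Ile Arg Val Ala Ile Asn Gly Tyr Gly Asn Leu Gly Arg Gly Val Glu Gln Ala Ile, the first 20 residues shown for SEQ ID NO: 2.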

<210> 3

<211> 960

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Corynebacterium glutamicum

<400> 3

atgacgaaca tccgtgtagc aatcgtcgga tacggtaatc tgggacgcag cgtagaaaaa 60

ctcatcgcca agcaaccaga catggatctt gttggaattt tctcgcgccg ggcgactctc 120

gatacgaaga cccccgtctt cgatgtggcg gacgttgata aacatgccga tgatgtcgat 180

gtactctttt tgtgcatggg atctgcaacg gatatcccgg agcaagcccc caagttcgct 240

caatttgcct gtacggtgga cacgtacgat aatcatcgtg atatcccccg gcatcgccaa 300

gttatgaatg aagctgcaac cgcagcaggc aatgtagcgt tggtttctac gggctgggac 360

ccaggcatgt tttcgattaa tcgtgtttat gccgctgctg ttttggccga gcaccagcaa 420

cacacgtttt ggggtccagg acttagccag ggccatagcg atgcccttcg ccgcatcccg 480

ggtgttcaaa aggctgttca gtacacgttg ccttctgaag atgcgcttga aaaagcacgg 540

cgcggcgagg ccggagattt gaccggcaag caaacgcata aacgccagtg cttcgttgtg 600

gccgacgcgg ccgaccatga gcgcatcgag aacgatattc ggactatgcc cgattacttc 660

gtaggctatg aggtggaagt caatttcatc gatgaagcaa ccttcgactc tgaacatacg 720

ggtatgcccc acggcggtca cgtgatcacg actggcgaca ctggcggttt taaccacacc 780

gttgagtata ttctcaagct ggaccgtaat cccgacttca ctgcgtcctc tcaaatcgcg 840

ttcggccgtg cagcgcaccg catgaaacaa caaggccaat caggtgcctt taccgttctg 900

gaagttgccc catatttgtt gagcccggaa aacttggacg acttgattgc ccgggatgtg 960

<210> 4

<211> 320

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Corynebacterium glutamicum

<400> 4

Met Thr Asn Ile Arg Val Ala Ile Val Gly Tyr Gly Asn Leu Gly Arg

1 5 10 15

Ser Val Glu Lys Leu Ile Ala Lys Gln Pro Asp Met Asp Leu Val Gly

20 25 30

Ile Phe Ser Arg Arg Ala Thr Leu Asp Thr Lys Thr Pro Val Phe Asp

35 40 45

Val Ala Asp Val Asp Lys His Ala Asp Asp Val Asp Val Leu Phe Leu

50 55 60

Cys Met Gly Ser Ala Thr Asp Ile Pro Glu Gln Ala Pro Lys Phe Ala

65 70 75 80

Gln Phe Ala Cys Thr Val Asp Thr Tyr Asp Asn His Arg Asp Ile Pro

85 90 95

Arg His Arg Gln Val Met Asn Glu Ala Ala Thr Ala Ala Gly Asn Val

100 105 110

Ala Leu Val Ser Thr Gly Trp Asp Pro Gly Met Phe Ser Ile Asn Arg

115 120 125

Val Tyr Ala Ala Ala Val Leu Ala Glu His Gln Gln His Thr Phe Trp

130 135 140

Gly Pro Gly Leu Ser Gln Gly His Ser Asp Ala Leu Arg Arg Ile Pro

145 150 155 160

Gly Val Gln Lys Ala Val Gln Tyr Thr Leu Pro Ser Glu Asp Ala Leu

165 170 175

Glu Lys Ala Arg Arg Gly Glu Ala Gly Asp Leu Thr Gly Lys Gln Thr

180 185 190

His Lys Arg Gln Cys Phe Val Val Ala Asp Ala Ala Asp His Glu Arg

195 200 205

Ile Glu Asn Asp Ile Arg Thr Met Pro Asp Tyr Phe Val Gly Tyr Glu

210 215 220

Val Glu Val Asn Phe Ile Asp Glu Ala Thr Phe Asp Ser Glu His Thr

225 230 235 240

Gly Met Pro His Gly Gly His Val Ile Thr Thr Gly Asp Thr Gly Gly

245 250 255

Phe Asn His Thr Val Glu Tyr Ile Leu Lys Leu Asp Arg Asn Pro Asp

260 265 270

Phe Thr Ala Ser Ser Gln Ile Ala Phe Gly Arg Ala Ala His Arg Met

275 280 285

Lys Gln Gln Gly Gln Ser Gly Ala Phe Thr Val Leu Glu Val Ala Pro

290 295 300

Tyr Leu Leu Ser Pro Glu Asn Leu Asp Asp Leu Ile Ala Arg Asp Val

305 310 315 320

<210> 5

<211> 993

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from a hyperthermophilic archaeon

<400> 5

atgaaaaaaa tcaacgtcgg aattattggt tacggtaacg tcggacgggg tgtgaagcaa 60

gctctcgaga aaaacgcaga catgaaactg gtcgctatcc tgactcgtcg cccagagcgg 120

gtacggaagg aaatcaaaga cgtgcatgtt ttccggactg atgaatcgtt gccgaaatcc 180

tttgaaatcg acgtggcggt cttgtgtggt ggatccaaga aagacatgcc aatccaaggt 240

ccaaaatttg cggccaaata caacaccgtt gatagcttcg acacccatgc cgacatccct 300

agctatttca agaaaatgga ttcaatcgct aaaaaacatg gtaatgtgtc tatcatctca 360

gcgggatggg atcccggtat tttcagcctg gagcgtgtcc ttggcggcgc ttttctgccg 420

gaatctaagc ggtatacgtt ttggggcaag ggtgtgtctc tcggtcactc tgatgctgct 480

cgccgcgtga aaggtgtctc tgatgctatt caatacacca ttccgattga gaaggctatt 540

caacgcatcc gtgcgggaga tgcgccagac tttagcaaaa cggaaatgca caagcgcgtt 600

gtttacgttg tccctgaaga gggtgccgac cttaagaaaa tccggaagga aattaccgag 660

atgccaaagt attttgaagg atatgatacg gaggtcattt ttatcactga gaaagaaatg 720

aaaaaacact ccacgtttcc ccacggcggc tttgtcttca ctagcggtgt aacgggagat 780

tctaaccgtc aaatcctcga atataaatgc cagctcgaga acaatagcga gttcactgcg 840

tctgtccttg tagcgtgcgc acgcgctgcg tatcgtctga atgagaaagg ctaccgtggt 900

gcttttacct ttttggactt tcccttgtcg tttcttatcg agtcggagtt tagcgcgtgc 960

ttcgaaagcc gcgcccggcg caatccctct cct 993

<210> 6

<211> 331

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from a hyperthermophilic archaeon

<400> 6

Met Lys Lys Ile Asn Val Gly Ile Ile Gly Tyr Gly Asn Val Gly Arg

1 5 10 15

Gly Val Lys Gln Ala Leu Glu Lys Asn Ala Asp Met Lys Leu Val Ala

20 25 30

Ile Leu Thr Arg Arg Pro Glu Arg Val Arg Lys Glu Ile Lys Asp Val

35 40 45

His Val Phe Arg Thr Asp Glu Ser Leu Pro Lys Ser Phe Glu Ile Asp

50 55 60

Val Ala Val Leu Cys Gly Gly Ser Lys Lys Asp Met Pro Ile Gln Gly

65 70 75 80

Pro Lys Phe Ala Ala Lys Tyr Asn Thr Val Asp Ser Phe Asp Thr His

85 90 95

Ala Asp Ile Pro Ser Tyr Phe Lys Lys Met Asp Ser Ile Ala Lys Lys

100 105 110

His Gly Asn Val Ser Ile Ile Ser Ala Gly Trp Asp Pro Gly Ile Phe

115 120 125

Ser Leu Glu Arg Val Leu Gly Gly Ala Phe Leu Pro Glu Ser Lys Arg

130 135 140

Tyr Thr Phe Trp Gly Lys Gly Val Ser Leu Gly His Ser Asp Ala Ala

145 150 155 160

Arg Arg Val Lys Gly Val Ser Asp Ala Ile Gln Tyr Thr Ile Pro Ile

165 170 175

Glu Lys Ala Ile Gln Arg Ile Arg Ala Gly Asp Ala Pro Asp Phe Ser

180 185 190

Lys Thr Glu Met His Lys Arg Val Val Tyr Val Val Pro Glu Glu Gly

195 200 205

Ala Asp Leu Lys Lys Ile Arg Lys Glu Ile Thr Glu Met Pro Lys Tyr

210 215 220

Phe Glu Gly Tyr Asp Thr Glu Val Ile Phe Ile Thr Glu Lys Glu Met

225 230 235 240

Lys Lys His Ser Thr Phe Pro His Gly Gly Phe Val Phe Thr Ser Gly

245 250 255

Val Thr Gly Asp Ser Asn Arg Gln Ile Leu Glu Tyr Lys Cys Gln Leu

260 265 270

Glu Asn Asn Ser Glu Phe Thr Ala Ser Val Leu Val Ala Cys Ala Arg

275 280 285

Ala Ala Tyr Arg Leu Asn Glu Lys Gly Tyr Arg Gly Ala Phe Thr Phe

290 295 300

Leu Asp Phe Pro Leu Ser Phe Leu Ile Glu Ser Glu Phe Ser Ala Cys

305 310 315 320

Phe Glu Ser Arg Ala Arg Arg Asn Pro Ser Pro

325 330

<210> 7

<211> 978

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Coprobacillus sp.

<400> 7

atgatcaaaa tcggcatcgt gggctacgga aacctgggac gtggtgtgga atgcgcggtc 60

catcattcgc aggatatgga attggcggga gttttcacgc ggcgcaatcc ggagacggtt 120

aaaactcaca ccgacgttcc tgtgtatgat atggagaaac tgtacgacat gcagggcgac 180

attgatgtcc tcgtgctgtg cggaggctcc gctaatgatc tgccgaagca gacggttgag 240

ttggcacagt atttcaatgt tgtagactct ttcgatactc atgccaaaat ccccgagcat 300

ttctctaatg ttaatcaaag cagcgagaaa ggtaaacata tttcgattat ttcagtaggc 360

tgggatcctg gattgttctc cctcaatcgg ctgtatggac aagcaattct gccgaatgga 420

aacgactaca ctttttgggg taaaggagtt tcacagggac attctgatgc gatccggcgt 480

atcgcaggcg ttaaagacgc gcgccagtac acgatccccg tggacgccgc gttggagagc 540

gttcgcaatg gagaaaatcc gaccctcacc actcgggaga agcacactcg ggagtgtttc 600

gttgtcgctg aagacggcgc tgaccttaaa gtgatcgaag aaacgatcaa gaccatgcca 660

aactattttg ctgactatga cacgactgtt catttcatct cggaggaaga gctgatgcgg 720

gatcatcaag gaattccgca tggtggcgtc gtacttcgca gcggaaccac gggctttgac 780

tatgagaaca agcacgtaat cgaatacaaa ctcactctgg attcgaaccc cgagttcacc 840

tcctctgttc tcgttgcata tgctcgggcc gcttatcgta tgcaccaaga gggccaatgc 900

ggttgtaaaa ctgtatttga tattgccccg gcataccttc acgttgaatc cggagaggaa 960

ttgcgtaaga aactcttg 978

<210> 8

<211> 326

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Coprobacillus sp.

<400> 8

Met Ile Lys Ile Gly Ile Val Gly Tyr Gly Asn Leu Gly Arg Gly Val

1 5 10 15

Glu Cys Ala Val His His Ser Gln Asp Met Glu Leu Ala Gly Val Phe

20 25 30

Thr Arg Arg Asn Pro Glu Thr Val Lys Thr His Thr Asp Val Pro Val

35 40 45

Tyr Asp Met Glu Lys Leu Tyr Asp Met Gln Gly Asp Ile Asp Val Leu

50 55 60

Val Leu Cys Gly Gly Ser Ala Asn Asp Leu Pro Lys Gln Thr Val Glu

65 70 75 80

Leu Ala Gln Tyr Phe Asn Val Val Asp Ser Phe Asp Thr His Ala Lys

85 90 95

Ile Pro Glu His Phe Ser Asn Val Asn Gln Ser Ser Glu Lys Gly Lys

100 105 110

His Ile Ser Ile Ile Ser Val Gly Trp Asp Pro Gly Leu Phe Ser Leu

115 120 125

Asn Arg Leu Tyr Gly Gln Ala Ile Leu Pro Asn Gly Asn Asp Tyr Thr

130 135 140

Phe Trp Gly Lys Gly Val Ser Gln Gly His Ser Asp Ala Ile Arg Arg

145 150 155 160

Ile Ala Gly Val Lys Asp Ala Arg Gln Tyr Thr Ile Pro Val Asp Ala

165 170 175

Ala Leu Glu Ser Val Arg Asn Gly Glu Asn Pro Thr Leu Thr Thr Arg

180 185 190

Glu Lys His Thr Arg Glu Cys Phe Val Val Ala Glu Asp Gly Ala Asp

195 200 205

Leu Lys Val Ile Glu Glu Thr Ile Lys Thr Met Pro Asn Tyr Phe Ala

210 215 220

Asp Tyr Asp Thr Thr Val His Phe Ile Ser Glu Glu Glu Leu Met Arg

225 230 235 240

Asp His Gln Gly Ile Pro His Gly Gly Val Val Leu Arg Ser Gly Thr

245 250 255

Thr Gly Phe Asp Tyr Glu Asn Lys His Val Ile Glu Tyr Lys Leu Thr

260 265 270

Leu Asp Ser Asn Pro Glu Phe Thr Ser Ser Val Leu Val Ala Tyr Ala

275 280 285

Arg Ala Ala Tyr Arg Met His Gln Glu Gly Gln Cys Gly Cys Lys Thr

290 295 300

Val Phe Asp Ile Ala Pro Ala Tyr Leu His Val Glu Ser Gly Glu Glu

305 310 315 320

Leu Arg Lys Lys Leu Leu

325

<210> 9

<211> 981

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Methanosaeta harundinacea

<400> 9

atggaaaagc tccgcattgg catcgtagga tacggaaatg ttggccgggc tgtggagctg 60

tctcttcgcc aaaacccgga tatgatggcc gcagtcgtct tgacccgccg tgacccccgt 120

ggtattcgga cgctcacgcc cggtttgatg gcctcttcga ttgaggaggc tgaacggtac 180

gcttcagagg tcgacgtggc cgtactctgt ggcggtagcg ctacggacct tccagttcaa 240

ggaccggcta tggcgtcaat tttcaacact gtagattcct acgacaatca tccgcgcatc 300

ccagaatact ttgcagcagt agattctgcg gcacgccgtg gccgtcggac cgcaatcgtg 360

agcaccggtt gggatcccgg tctgttctcg ttgatccgcc tgcttgagga ggccgttttg 420

cccgaaggca ctgattatac gttttggggc cctggagtgt cccaaggaca ttctgacgct 480

gtacggcggg tcgaaggagt gcgtgatgcg cgccaatata ctatccctat cgaggacacc 540

gtggctcgcg tgcgttccgg cgaggcaccc tccctcagca cccgggaacg ccatcttcgt 600

cgttgctacg tggtggccga agagggagcc gaccccggtg agatccgtga gaaaattcgg 660

tcaatgccta attattttgc agattatgat accaaggtct cgtttatttc gcaggaagag 720

atggaacgca gccacaaccg gatgccacat ggtggtttcg ttatgcgtgc gggaaagacc 780

gccgacggaa cgggtcacgt ccttgagttc cgtcttaaat tggactctaa ccccgctttc 840

accgcatccg tgttgttggc ttatgcacgt gcagcatatc ggctgcacca agaaggcgca 900

attggcgcac ggaccgtatt tgatgtaccg ccagcgcatc tgtctcctaa aacgccagag 960

gagattcgtc gttccatgct t 981

<210> 10

<211> 327

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Methanosaeta harundinacea

<400> 10

Met Glu Lys Leu Arg Ile Gly Ile Val Gly Tyr Gly Asn Val Gly Arg

1 5 10 15

Ala Val Glu Leu Ser Leu Arg Gln Asn Pro Asp Met Met Ala Ala Val

20 25 30

Val Leu Thr Arg Arg Asp Pro Arg Gly Ile Arg Thr Leu Thr Pro Gly

35 40 45

Leu Met Ala Ser Ser Ile Glu Glu Ala Glu Arg Tyr Ala Ser Glu Val

50 55 60

Asp Val Ala Val Leu Cys Gly Gly Ser Ala Thr Asp Leu Pro Val Gln

65 70 75 80

Gly Pro Ala Met Ala Ser Ile Phe Asn Thr Val Asp Ser Tyr Asp Asn

85 90 95

His Pro Arg Ile Pro Glu Tyr Phe Ala Ala Val Asp Ser Ala Ala Arg

100 105 110

Arg Gly Arg Arg Thr Ala Ile Val Ser Thr Gly Trp Asp Pro Gly Leu

115 120 125

Phe Ser Leu Ile Arg Leu Leu Glu Glu Ala Val Leu Pro Glu Gly Thr

130 135 140

Asp Tyr Thr Phe Trp Gly Pro Gly Val Ser Gln Gly His Ser Asp Ala

145 150 155 160

Val Arg Arg Val Glu Gly Val Arg Asp Ala Arg Gln Tyr Thr Ile Pro

165 170 175

Ile Glu Asp Thr Val Ala Arg Val Arg Ser Gly Glu Ala Pro Ser Leu

180 185 190

Ser Thr Arg Glu Arg His Leu Arg Arg Cys Tyr Val Val Ala Glu Glu

195 200 205

Gly Ala Asp Pro Gly Glu Ile Arg Glu Lys Ile Arg Ser Met Pro Asn

210 215 220

Tyr Phe Ala Asp Tyr Asp Thr Lys Val Ser Phe Ile Ser Gln Glu Glu

225 230 235 240

Met Glu Arg Ser His Asn Arg Met Pro His Gly Gly Phe Val Met Arg

245 250 255

Ala Gly Lys Thr Ala Asp Gly Thr Gly His Val Leu Glu Phe Arg Leu

260 265 270

Lys Leu Asp Ser Asn Pro Ala Phe Thr Ala Ser Val Leu Leu Ala Tyr

275 280 285

Ala Arg Ala Ala Tyr Arg Leu His Gln Glu Gly Ala Ile Gly Ala Arg

290 295 300

Thr Val Phe Asp Val Pro Pro Ala His Leu Ser Pro Lys Thr Pro Glu

305 310 315 320

Glu Ile Arg Arg Ser Met Leu

325

<210> 11

<211> 972

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Megasphaera micronuciformis

<400> 11

atggacaaaa ttcgcattgg tatcgtggga tacggcaacc tgggtcgggg agcggaggct 60

tcggtcaagc tccagccgga tatggagctg atcggtgttt tctctcggcg gaagggaatt 120

aagactgtgt cgggagtgcc tgcatatact atggacgaga tgctcaactt taagggtaaa 180

atcgatgtta tgattttgtg tggaggatcg gcaacggacc tgatcgaaca gacccctgcg 240

gtggcagccc actttacctg tattgactcc tttgatactc accctcggat taccgaacac 300

tttaataacg tagataaagc ggctaaagca gcaggtaccg ccgccctgat ttcatgtggt 360

tgggacccag gaatgttttc tcttcaacgt gttttcgcgg aagcaatttt gccccaaggc 420

aagtcttata cgttctgggg ccggggagtg tctcagggcc attcggacgc cattcggcgg 480

atcgatggag tcgtcgacgc gcggcagtat actgtaccaa aagataaata cctgaatgcc 540

atccgtaatg gtgaaatgcc cgaggtcact ggacaggagg cgcatctgcg tgactgctac 600

gttgtcgctg cggagggcgc agataaagct cggatcgaga acgaaattaa gaccatgaaa 660

aactattttg tgggatacga aaccgtagta cacttcattt cacaggagga actggaccgg 720

gatcacaagg gcattccgca cggtggtttc gtacttcgca gcggcgagtc gacccccggt 780

accaaacatg tggtggaata tcgcctccag ttggattcca acccggagtt tactggttct 840

gtgcttacgg cgtatgctcg cggccttaac cgcttggcta agcataaagc caccggagct 900

ttcacggtgt tcgatattcc tcccgcgtgg attagcgtac attctgacga ggagctgcgg 960

gcacactcac tg 972

<210> 12

<211> 324

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Megasphaera micronuciformis

<400> 12

Met Asp Lys Ile Arg Ile Gly Ile Val Gly Tyr Gly Asn Leu Gly Arg

1 5 10 15

Gly Ala Glu Ala Ser Val Lys Leu Gln Pro Asp Met Glu Leu Ile Gly

20 25 30

Val Phe Ser Arg Arg Lys Gly Ile Lys Thr Val Ser Gly Val Pro Ala

35 40 45

Tyr Thr Met Asp Glu Met Leu Asn Phe Lys Gly Lys Ile Asp Val Met

50 55 60

Ile Leu Cys Gly Gly Ser Ala Thr Asp Leu Ile Glu Gln Thr Pro Ala

65 70 75 80

Val Ala Ala His Phe Thr Cys Ile Asp Ser Phe Asp Thr His Pro Arg

85 90 95

Ile Thr Glu His Phe Asn Asn Val Asp Lys Ala Ala Lys Ala Ala Gly

100 105 110

Thr Ala Ala Leu Ile Ser Cys Gly Trp Asp Pro Gly Met Phe Ser Leu

115 120 125

Gln Arg Val Phe Ala Glu Ala Ile Leu Pro Gln Gly Lys Ser Tyr Thr

130 135 140

Phe Trp Gly Arg Gly Val Ser Gln Gly His Ser Asp Ala Ile Arg Arg

145 150 155 160

Ile Asp Gly Val Val Asp Ala Arg Gln Tyr Thr Val Pro Lys Asp Lys

165 170 175

Tyr Leu Asn Ala Ile Arg Asn Gly Glu Met Pro Glu Val Thr Gly Gln

180 185 190

Glu Ala His Leu Arg Asp Cys Tyr Val Val Ala Ala Glu Gly Ala Asp

195 200 205

Lys Ala Arg Ile Glu Asn Glu Ile Lys Thr Met Lys Asn Tyr Phe Val

210 215 220

Gly Tyr Glu Thr Val Val His Phe Ile Ser Gln Glu Glu Leu Asp Arg

225 230 235 240

Asp His Lys Gly Ile Pro His Gly Gly Phe Val Leu Arg Ser Gly Glu

245 250 255

Ser Thr Pro Gly Thr Lys His Val Val Glu Tyr Arg Leu Gln Leu Asp

260 265 270

Ser Asn Pro Glu Phe Thr Gly Ser Val Leu Thr Ala Tyr Ala Arg Gly

275 280 285

Leu Asn Arg Leu Ala Lys His Lys Ala Thr Gly Ala Phe Thr Val Phe

290 295 300

Asp Ile Pro Pro Ala Trp Ile Ser Val His Ser Asp Glu Glu Leu Arg

305 310 315 320

Ala His Ser Leu

<210> 13

<211> 1002

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Achromobacter denitrificans

<400> 13

atgggtcttg ataacaatgc acgcacggcc atccgtatcg gtatcgttgg atacggcaac 60

cttggacgtg gtgtggaagc ggcagtcgcc cgcaattcgg atatggcagt tgccggaatt 120

tacacgcggc gtgaccctgc ccaaattgaa cccatgggcg cgggagtgcc agtgcacgcc 180

atggactcgc tccctggtca taaaggttcg attgatgttc tggtactttg cggaggctca 240

aaagatgatc tgccgcgcca atcccccgag ttggccgctc actttagcct ggttgattcc 300

tttgacaccc atgctcggat cccagagcac ttcgctgcgg ttgacgcggc ggcgcaagca 360

ggacgtacga cggcactgat ttctgcaggt tgggacccgg gaatgttttc catcaatcgg 420

gtaatgggcg aggccctctt gccggatggc gccacctata cgttctgggg caagggactc 480

tcccagggcc actctgatgc ggtgcgtcgg gttccgggcg tagctggcgg tgtgcagtat 540

actatccccg tggacgaagc ggtagctcag gtacggtccg gtttgcgtcc tgccctcacc 600

acgcgggaaa aacaccggcg cgaatgcttc gttgtactcg aagcgggagc agacgcctcg 660

gccgtgcgta agacgattgt tacgatgccc cattattttg atgagtatga caccactgta 720

cactttatcg gcgccgagga attggctcgg gaacacggcg ccatgccgca cggcggattt 780

gtcatccgct caggtaatac ctctcaggaa aacaaacagg taatcgagta tcgtctccaa 840

ctcgactcta accctgaatt taccagctct gtcctcgtcg catatgcacg tgccgtacat 900

cgtatgcaac aggccggtca gtggggctgc aagacggtat ttgatgttgc gccaggcctg 960

ctgtctccgc gctcggcggc cgaactccgc gctcaacttc tt 1002

<210> 14

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Achromobacter denitrificans

<400> 14

Met Gly Leu Asp Asn Asn Ala Arg Thr Ala Ile Arg Ile Gly Ile Val

1 5 10 15

Gly Tyr Gly Asn Leu Gly Arg Gly Val Glu Ala Ala Val Ala Arg Asn

20 25 30

Ser Asp Met Ala Val Ala Gly Ile Tyr Thr Arg Arg Asp Pro Ala Gln

35 40 45

Ile Glu Pro Met Gly Ala Gly Val Pro Val His Ala Met Asp Ser Leu

50 55 60

Pro Gly His Lys Gly Ser Ile Asp Val Leu Val Leu Cys Gly Gly Ser

65 70 75 80

Lys Asp Asp Leu Pro Arg Gln Ser Pro Glu Leu Ala Ala His Phe Ser

85 90 95

Leu Val Asp Ser Phe Asp Thr His Ala Arg Ile Pro Glu His Phe Ala

100 105 110

Ala Val Asp Ala Ala Ala Gln Ala Gly Arg Thr Thr Ala Leu Ile Ser

115 120 125

Ala Gly Trp Asp Pro Gly Met Phe Ser Ile Asn Arg Val Met Gly Glu

130 135 140

Ala Leu Leu Pro Asp Gly Ala Thr Tyr Thr Phe Trp Gly Lys Gly Leu

145 150 155 160

Ser Gln Gly His Ser Asp Ala Val Arg Arg Val Pro Gly Val Ala Gly

165 170 175

Gly Val Gln Tyr Thr Ile Pro Val Asp Glu Ala Val Ala Gln Val Arg

180 185 190

Ser Gly Leu Arg Pro Ala Leu Thr Thr Arg Glu Lys His Arg Arg Glu

195 200 205

Cys Phe Val Val Leu Glu Ala Gly Ala Asp Ala Ser Ala Val Arg Lys

210 215 220

Thr Ile Val Thr Met Pro His Tyr Phe Asp Glu Tyr Asp Thr Thr Val

225 230 235 240

His Phe Ile Gly Ala Glu Glu Leu Ala Arg Glu His Gly Ala Met Pro

245 250 255

His Gly Gly Phe Val Ile Arg Ser Gly Asn Thr Ser Gln Glu Asn Lys

260 265 270

Gln Val Ile Glu Tyr Arg Leu Gln Leu Asp Ser Asn Pro Glu Phe Thr

275 280 285

Ser Ser Val Leu Val Ala Tyr Ala Arg Ala Val His Arg Met Gln Gln

290 295 300

Ala Gly Gln Trp Gly Cys Lys Thr Val Phe Asp Val Ala Pro Gly Leu

305 310 315 320

Leu Ser Pro Arg Ser Ala Ala Glu Leu Arg Ala Gln Leu Leu

325 330

<210> 15

<211> 957

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Micrococcus luteus

<400> 15

atgaccattc gcgcgggaat tgtaggatat ggaaacctgg gtcgctctgt agaaaaactt 60

gttaaactgc agccggacat ggaacttgtt ggcatttttt cccggcggac tggactcgac 120

acggataccc cagtacttcc tgcggaacgt gcggccgagc acgcgggtga gattgatgtg 180

ctttttctgt gccttggaag cgcgactgat attccagagc aagcggccgg ttacgcacgc 240

cacttcacga ccgttgatac gtatgataac catcaactga tcccacggca tcggtctgaa 300

atggatgctg cggcccggga gggcggccac gtagcgatga tctcaactgg atgggaccca 360

ggactttttt ctgtcaatcg ggtccttgga gccgcccttt ttccgcagcc ccagcaaaat 420

actttttggg gcaagggcct ctcacaaggt cactcggatg cagtgcggcg ggtgccgggt 480

gtacggcgtg gcgttcagta cactattccg tcagaggaag cgattgcaga ggcccgggct 540

ggtcgcggtg cagagattac tggtgcgtcg gctcatgttc gggagtgtta cgtcgttgca 600

gacgaggcag atcatgctgc tatcactgag gcgatcacca ccatgccgga ttactttgcc 660

ccctatgaga cgaccgtaca ctttatttcg gaggaagaat ttgagcggga tcatcagggt 720

atgccacacg gaggccacgt tgtcacgtct ggtgacttgg gaggctctcg ctctgcggta 780

gaatttgtcc tcgaactcga atctaatcct gactttaccg cagcagccca ggtagcctat 840

ggccgggccg ccgctcgcct taaggcccag ggtgagactg gcgctcgtac ggtacttgag 900

gtcgctccct atcttctgtc accgacgggt ttggatgagc tgattcgccg cgacgtg 957

<210> 16

<211> 319

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Micrococcus luteus

<400> 16

Met Thr Ile Arg Ala Gly Ile Val Gly Tyr Gly Asn Leu Gly Arg Ser

1 5 10 15

Val Glu Lys Leu Val Lys Leu Gln Pro Asp Met Glu Leu Val Gly Ile

20 25 30

Phe Ser Arg Arg Thr Gly Leu Asp Thr Asp Thr Pro Val Leu Pro Ala

35 40 45

Glu Arg Ala Ala Glu His Ala Gly Glu Ile Asp Val Leu Phe Leu Cys

50 55 60

Leu Gly Ser Ala Thr Asp Ile Pro Glu Gln Ala Ala Gly Tyr Ala Arg

65 70 75 80

His Phe Thr Thr Val Asp Thr Tyr Asp Asn His Gln Leu Ile Pro Arg

85 90 95

His Arg Ser Glu Met Asp Ala Ala Ala Arg Glu Gly Gly His Val Ala

100 105 110

Met Ile Ser Thr Gly Trp Asp Pro Gly Leu Phe Ser Val Asn Arg Val

115 120 125

Leu Gly Ala Ala Leu Phe Pro Gln Pro Gln Gln Asn Thr Phe Trp Gly

130 135 140

Lys Gly Leu Ser Gln Gly His Ser Asp Ala Val Arg Arg Val Pro Gly

145 150 155 160

Val Arg Arg Gly Val Gln Tyr Thr Ile Pro Ser Glu Glu Ala Ile Ala

165 170 175

Glu Ala Arg Ala Gly Arg Gly Ala Glu Ile Thr Gly Ala Ser Ala His

180 185 190

Val Arg Glu Cys Tyr Val Val Ala Asp Glu Ala Asp His Ala Ala Ile

195 200 205

Thr Glu Ala Ile Thr Thr Met Pro Asp Tyr Phe Ala Pro Tyr Glu Thr

210 215 220

Thr Val His Phe Ile Ser Glu Glu Glu Phe Glu Arg Asp His Gln Gly

225 230 235 240

Met Pro His Gly Gly His Val Val Thr Ser Gly Asp Leu Gly Gly Ser

245 250 255

Arg Ser Ala Val Glu Phe Val Leu Glu Leu Glu Ser Asn Pro Asp Phe

260 265 270

Thr Ala Ala Ala Gln Val Ala Tyr Gly Arg Ala Ala Ala Arg Leu Lys

275 280 285

Ala Gln Gly Glu Thr Gly Ala Arg Thr Val Leu Glu Val Ala Pro Tyr

290 295 300

Leu Leu Ser Pro Thr Gly Leu Asp Glu Leu Ile Arg Arg Asp Val

305 310 315

<210> 17

<211> 981

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Brachybacterium faecium

<400> 17

atgaccgttc atcgtattgg catcgtagga tatggaaacc tcggacgtgg agtagagatc 60

gcgaccagct tgcaggaaga catgcaactc gttggtgtct tcacgcgccg cgacccttca 120

acggtaagca ccgttcatgc tcagacgcca gtacgctcaa tcgacgccct tgaggagatg 180

caagacgaaa ttgatgtgct cgttctttgt ggtggatcac gtaccgacct tcctgaacag 240

acgccccagt tggctgaacg gtttactgtg gttgattcgt ttgacaccca cgcgcggatt 300

cctgagcatt tcgccaaagt tgatgcagcg gcgcgcgctg ctggaaccac cgccctgatt 360

tccactggct gggatccagg cttgttttcg atcaatcgtg tatatggcga agcaatcctt 420

gcgactggaa ctacctacac cttttggggt cggggacttt cccagggcca ctccgatgct 480

gtacggcggg tcgatggcgt agctgctgcc gtacagtaca ctgtaccgag ccaagaagcg 540

attgctcggg tgcgggccgg cgaacagccc acgctgtcga cgcgggaaaa acacacccgg 600

gaatgtttcg tcgttttgga ggatggcgcg gatgctgaga ctgtccgcga ggagatcgta 660

accatgcccc actattttga accttatgac actaccgtaa ccttcctgtc tgcagaggaa 720

ctggcgcgcg atcaccaggg catgccgcac ggcggttttg tgattcggtc aggagagtca 780

agcccaggca ctacccagac tattgaatac cggcttcagg aagactctaa cccggaattt 840

actgcgtcgg tccttgtcgc atatactcgt gctgccgccc ggctcgcagc cgccggcgaa 900

catggtgcta agactccttt cgacgttgcc ccgggccttc tgtccccgaa gtcgcccgaa 960

cagctgcgcg ccgagctcct g 981

<210> 18

<211> 327

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Brachybacterium faecium

<400> 18

Met Thr Val His Arg Ile Gly Ile Val Gly Tyr Gly Asn Leu Gly Arg

1 5 10 15

Gly Val Glu Ile Ala Thr Ser Leu Gln Glu Asp Met Gln Leu Val Gly

20 25 30

Val Phe Thr Arg Arg Asp Pro Ser Thr Val Ser Thr Val His Ala Gln

35 40 45

Thr Pro Val Arg Ser Ile Asp Ala Leu Glu Glu Met Gln Asp Glu Ile

50 55 60

Asp Val Leu Val Leu Cys Gly Gly Ser Arg Thr Asp Leu Pro Glu Gln

65 70 75 80

Thr Pro Gln Leu Ala Glu Arg Phe Thr Val Val Asp Ser Phe Asp Thr

85 90 95

His Ala Arg Ile Pro Glu His Phe Ala Lys Val Asp Ala Ala Ala Arg

100 105 110

Ala Ala Gly Thr Thr Ala Leu Ile Ser Thr Gly Trp Asp Pro Gly Leu

115 120 125

Phe Ser Ile Asn Arg Val Tyr Gly Glu Ala Ile Leu Ala Thr Gly Thr

130 135 140

Thr Tyr Thr Phe Trp Gly Arg Gly Leu Ser Gln Gly His Ser Asp Ala

145 150 155 160

Val Arg Arg Val Asp Gly Val Ala Ala Ala Val Gln Tyr Thr Val Pro

165 170 175

Ser Gln Glu Ala Ile Ala Arg Val Arg Ala Gly Glu Gln Pro Thr Leu

180 185 190

Ser Thr Arg Glu Lys His Thr Arg Glu Cys Phe Val Val Leu Glu Asp

195 200 205

Gly Ala Asp Ala Glu Thr Val Arg Glu Glu Ile Val Thr Met Pro His

210 215 220

Tyr Phe Glu Pro Tyr Asp Thr Thr Val Thr Phe Leu Ser Ala Glu Glu

225 230 235 240

Leu Ala Arg Asp His Gln Gly Met Pro His Gly Gly Phe Val Ile Arg

245 250 255

Ser Gly Glu Ser Ser Pro Gly Thr Thr Gln Thr Ile Glu Tyr Arg Leu

260 265 270

Gln Glu Asp Ser Asn Pro Glu Phe Thr Ala Ser Val Leu Val Ala Tyr

275 280 285

Thr Arg Ala Ala Ala Arg Leu Ala Ala Ala Gly Glu His Gly Ala Lys

290 295 300

Thr Pro Phe Asp Val Ala Pro Gly Leu Leu Ser Pro Lys Ser Pro Glu

305 310 315 320

Gln Leu Arg Ala Glu Leu Leu

325

<210> 19

<211> 975

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Carnobacterium sp.

<400> 19

atgactaaca agattcggat tggtcttgtg ggttacggta acatcggaaa gggcgtcgaa 60

ttggcgctgg aagagtttcc tgacatggaa ggcattgcgg tcttcactcg tcggaatccc 120

gaagatctcg attcaaagct caaagctatc tctttggacc acattcttga ttaccaggaa 180

gatctggacg ttttgatcct ttgcggcgga agcgccaccg atttgcctgg tcagggtcct 240

gctcttgcaa agcatttctc tacgattgac tcctacgata atcacaatca aattcctgaa 300

tatttcgaaa ctatggacca atctgcaaag gcaggcaaga acatttcaat tatctcggtc 360

ggctgggatc cgggactgtt ctcactgaat cgggccgttt tcgagtccat ccttccggcg 420

ggagagactt acactttttg gggcaaagga ctgtcccagg gccactccga cgccattcgt 480

cggattgatg gcgtcaagtt tggcgttcaa tacaccattc ccgtcgaaac cgcactggag 540

gaagtacggt ctggatcgaa tccgaccctt tccactcggg agaagcacaa acgtgtgtgc 600

tacgttgtag cggaagcggg ctccgaccag aatttgattg aggaaacgat taaaaccatg 660

ccggactact tcgagccgta cgacacgacc gtccatttca tcgacgagaa aacgttcaag 720

gaggagcatc agaaaatgcc acatggtggc ttcgtgatcc gtactgcaac ttcagctacg 780

ggcaacaagc agaaagctga gttccagctc gaattggagt ccaatgcaga attcacttct 840

tcaatcctcg ttgcgtacgc tcgtgccgcc tacaagttta agaaagatgg caagtctggc 900

gctctttcgg tgctggatgt ccctccggca tacctgtctc caaagtcggc agcgcagctc 960

cgcaaggagc tcctg 975

<210> 20

<211> 325

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized ddh from Carnobacterium sp.

<400> 20

Met Thr Asn Lys Ile Arg Ile Gly Leu Val Gly Tyr Gly Asn Ile Gly

1 5 10 15

Lys Gly Val Glu Leu Ala Leu Glu Glu Phe Pro Asp Met Glu Gly Ile

20 25 30

Ala Val Phe Thr Arg Arg Asn Pro Glu Asp Leu Asp Ser Lys Leu Lys

35 40 45

Ala Ile Ser Leu Asp His Ile Leu Asp Tyr Gln Glu Asp Leu Asp Val

50 55 60

Leu Ile Leu Cys Gly Gly Ser Ala Thr Asp Leu Pro Gly Gln Gly Pro

65 70 75 80

Ala Leu Ala Lys His Phe Ser Thr Ile Asp Ser Tyr Asp Asn His Asn

85 90 95

Gln Ile Pro Glu Tyr Phe Glu Thr Met Asp Gln Ser Ala Lys Ala Gly

100 105 110

Lys Asn Ile Ser Ile Ile Ser Val Gly Trp Asp Pro Gly Leu Phe Ser

115 120 125

Leu Asn Arg Ala Val Phe Glu Ser Ile Leu Pro Ala Gly Glu Thr Tyr

130 135 140

Thr Phe Trp Gly Lys Gly Leu Ser Gln Gly His Ser Asp Ala Ile Arg

145 150 155 160

Arg Ile Asp Gly Val Lys Phe Gly Val Gln Tyr Thr Ile Pro Val Glu

165 170 175

Thr Ala Leu Glu Glu Val Arg Ser Gly Ser Asn Pro Thr Leu Ser Thr

180 185 190

Arg Glu Lys His Lys Arg Val Cys Tyr Val Val Ala Glu Ala Gly Ser

195 200 205

Asp Gln Asn Leu Ile Glu Glu Thr Ile Lys Thr Met Pro Asp Tyr Phe

210 215 220

Glu Pro Tyr Asp Thr Thr Val His Phe Ile Asp Glu Lys Thr Phe Lys

225 230 235 240

Glu Glu His Gln Lys Met Pro His Gly Gly Phe Val Ile Arg Thr Ala

245 250 255

Thr Ser Ala Thr Gly Asn Lys Gln Lys Ala Glu Phe Gln Leu Glu Leu

260 265 270

Glu Ser Asn Ala Glu Phe Thr Ser Ser Ile Leu Val Ala Tyr Ala Arg

275 280 285

Ala Ala Tyr Lys Phe Lys Lys Asp Gly Lys Ser Gly Ala Leu Ser Val

290 295 300

Leu Asp Val Pro Pro Ala Tyr Leu Ser Pro Lys Ser Ala Ala Gln Leu

305 310 315 320

Arg Lys Glu Leu Leu

325

<210> 21

<211> 1062

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Methanococcus jannaschii

<400> 21

atgtccaagg gagaaaaaat gaagatcaag gttggcgtat tgggtgctac cggatcggtt 60

ggccaacgct ttgtgcagct gcttgcagac caccccatgt tcgaattgac tgctctggca 120

gcaagcgaac ggtccgcggg taaaaaatac aaagatgctt gttactggtt tcaagatcgg 180

gacattccag aaaatattaa ggatatggtt gtaattccga cggatccgaa gcacgaagaa 240

ttcgaagacg ttgatattgt ttttagcgcg ctgccctcgg atctggctaa aaaattcgaa 300

cccgaattcg cgaaagaagg aaagctgatc ttcagcaacg catcagccta tcgtatggag 360

gaagatgtgc cgcttgtaat tccagaggta aacgctgatc acctcgaatt gattgaaatt 420

cagcgcgaga agcggggttg ggacggagcc attatcacta acccaaactg ttcaaccatt 480

tgcgccgtaa tcacccttaa gccaattatg gacaaattcg gtcttgaagc ggtgtttatc 540

gctaccatgc aggctgtatc gggcgcagga tacaacggtg tcccgagcat ggctattctg 600

gataacttga ttccctttat taagaatgag gaggagaaga tgcagactga atcgcttaag 660

ttgctgggca cgcttaagga tggaaaagtg gaactcgcta acttcaaaat cagcgcatca 720

tgcaatcgtg tggctgtgat cgacggccac accgaatcga tcttcgtgaa gaccaaggag 780

ggtgcggaac ctgaggaaat taaagaagtg atggacaaat ttgatcctct taaagacctt 840

aaccttccga cgtatgccaa accaatcgta attcgcgaag agatcgatcg cccacagcca 900

cgtcttgacc gcaatgaggg taatggcatg tctattgtcg ttggtcgtat ccgtaaagat 960

ccgatttttg atgttaagta caccgccctg gaacataaca ctatccgtgg cgccgcgggc 1020

gcatcagtgt tgaatgcgga gtatttcgta aagaaataca tc 1062

<210> 22

<211> 354

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Methanococcus jannaschii

<400> 22

Met Ser Lys Gly Glu Lys Met Lys Ile Lys Val Gly Val Leu Gly Ala

1 5 10 15

Thr Gly Ser Val Gly Gln Arg Phe Val Gln Leu Leu Ala Asp His Pro

20 25 30

Met Phe Glu Leu Thr Ala Leu Ala Ala Ser Glu Arg Ser Ala Gly Lys

35 40 45

Lys Tyr Lys Asp Ala Cys Tyr Trp Phe Gln Asp Arg Asp Ile Pro Glu

50 55 60

Asn Ile Lys Asp Met Val Val Ile Pro Thr Asp Pro Lys His Glu Glu

65 70 75 80

Phe Glu Asp Val Asp Ile Val Phe Ser Ala Leu Pro Ser Asp Leu Ala

85 90 95

Lys Lys Phe Glu Pro Glu Phe Ala Lys Glu Gly Lys Leu Ile Phe Ser

100 105 110

Asn Ala Ser Ala Tyr Arg Met Glu Glu Asp Val Pro Leu Val Ile Pro

115 120 125

Glu Val Asn Ala Asp His Leu Glu Leu Ile Glu Ile Gln Arg Glu Lys

130 135 140

Arg Gly Trp Asp Gly Ala Ile Ile Thr Asn Pro Asn Cys Ser Thr Ile

145 150 155 160

Cys Ala Val Ile Thr Leu Lys Pro Ile Met Asp Lys Phe Gly Leu Glu

165 170 175

Ala Val Phe Ile Ala Thr Met Gln Ala Val Ser Gly Ala Gly Tyr Asn

180 185 190

Gly Val Pro Ser Met Ala Ile Leu Asp Asn Leu Ile Pro Phe Ile Lys

195 200 205

Asn Glu Glu Glu Lys Met Gln Thr Glu Ser Leu Lys Leu Leu Gly Thr

210 215 220

Leu Lys Asp Gly Lys Val Glu Leu Ala Asn Phe Lys Ile Ser Ala Ser

225 230 235 240

Cys Asn Arg Val Ala Val Ile Asp Gly His Thr Glu Ser Ile Phe Val

245 250 255

Lys Thr Lys Glu Gly Ala Glu Pro Glu Glu Ile Lys Glu Val Met Asp

260 265 270

Lys Phe Asp Pro Leu Lys Asp Leu Asn Leu Pro Thr Tyr Ala Lys Pro

275 280 285

Ile Val Ile Arg Glu Glu Ile Asp Arg Pro Gln Pro Arg Leu Asp Arg

290 295 300

Asn Glu Gly Asn Gly Met Ser Ile Val Val Gly Arg Ile Arg Lys Asp

305 310 315 320

Pro Ile Phe Asp Val Lys Tyr Thr Ala Leu Glu His Asn Thr Ile Arg

325 330 335

Gly Ala Ala Gly Ala Ser Val Leu Asn Ala Glu Tyr Phe Val Lys Lys

340 345 350

Tyr Ile

<210> 23

<211> 1047

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Solibacter usitatus

<400> 23

atgcagacgc ggatcgaggt aggaattctt ggagcgactg gtatggtcgg tcagcacttt 60

atcaaatttt tgcaaggcca cccttggttc gatctcaagt ggctgggtgc ttcagaccgc 120

tccgccggta aacagtacaa agacgcgatg acctggcatc ttgctggagg aaccccagat 180

tcagtcgctg gtctcaccgt cgaagaatgc aaacccggca atgccccccg tctgcttttc 240

agcgctatgg acgctggagt tgcgaccgat attgaacgtg cgtttgcgca ggcgggtcat 300

gtggttgtct cgaatagccg caaccaccgg atggagcaag acgttccttt gatggtgcct 360

gagattaacc cagatcatct gaagctggta ccgggacaac aacgcgcgcg gggatggaaa 420

ggacagattg tcacgaaccc gaattgctct acgatcggtc tggtgatggg tctcggtcca 480

atgaaacagt tcggcattac gaagatcctt gttaccacga tgcaggctat ttcaggcgca 540

ggatacccag gagtagcatc catggatatt atgggtaacg ttgtccccta catcggctct 600

gaagaggaaa agatggagat ggaaactcaa aaaattatgg gtgatttcgc gggcgatcgc 660

atcgtgccgc ttgcagcaaa ggtctcggcc cactgcaatc gggtaggcgt tgttgacggc 720

catacggaaa ctgtgtcagt cgaattctct atgaaaccaa cggaggcaga tttgcgccat 780

gcgatcgaat cctttactgc agtgccccag gaacgcaagc tcccgagcgc accaggacgt 840

ccggttatct atatgaagga agccaaccgg ccccaacctc gtaaggatgc tgaacgggag 900

cgtggcatgg cagcgtttgt tggtcgcctc cgggcatgcc cggtactgga ttataaattt 960

gtggtcctgt cccacaatac gattcgcggc gcagcaggcg cagcagtctt gaatgccgaa 1020

ctcatgcact cagagggaat gttggat 1047

<210> 24

<211> 349

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Solibacter usitatus

<400> 24

Met Gln Thr Arg Ile Glu Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln His Phe Ile Lys Phe Leu Gln Gly His Pro Trp Phe Asp Leu

20 25 30

Lys Trp Leu Gly Ala Ser Asp Arg Ser Ala Gly Lys Gln Tyr Lys Asp

35 40 45

Ala Met Thr Trp His Leu Ala Gly Gly Thr Pro Asp Ser Val Ala Gly

50 55 60

Leu Thr Val Glu Glu Cys Lys Pro Gly Asn Ala Pro Arg Leu Leu Phe

65 70 75 80

Ser Ala Met Asp Ala Gly Val Ala Thr Asp Ile Glu Arg Ala Phe Ala

85 90 95

Gln Ala Gly His Val Val Val Ser Asn Ser Arg Asn His Arg Met Glu

100 105 110

Gln Asp Val Pro Leu Met Val Pro Glu Ile Asn Pro Asp His Leu Lys

115 120 125

Leu Val Pro Gly Gln Gln Arg Ala Arg Gly Trp Lys Gly Gln Ile Val

130 135 140

Thr Asn Pro Asn Cys Ser Thr Ile Gly Leu Val Met Gly Leu Gly Pro

145 150 155 160

Met Lys Gln Phe Gly Ile Thr Lys Ile Leu Val Thr Thr Met Gln Ala

165 170 175

Ile Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Met Asp Ile Met Gly

180 185 190

Asn Val Val Pro Tyr Ile Gly Ser Glu Glu Glu Lys Met Glu Met Glu

195 200 205

Thr Gln Lys Ile Met Gly Asp Phe Ala Gly Asp Arg Ile Val Pro Leu

210 215 220

Ala Ala Lys Val Ser Ala His Cys Asn Arg Val Gly Val Val Asp Gly

225 230 235 240

His Thr Glu Thr Val Ser Val Glu Phe Ser Met Lys Pro Thr Glu Ala

245 250 255

Asp Leu Arg His Ala Ile Glu Ser Phe Thr Ala Val Pro Gln Glu Arg

260 265 270

Lys Leu Pro Ser Ala Pro Gly Arg Pro Val Ile Tyr Met Lys Glu Ala

275 280 285

Asn Arg Pro Gln Pro Arg Lys Asp Ala Glu Arg Glu Arg Gly Met Ala

290 295 300

Ala Phe Val Gly Arg Leu Arg Ala Cys Pro Val Leu Asp Tyr Lys Phe

305 310 315 320

Val Val Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala Val

325 330 335

Leu Asn Ala Glu Leu Met His Ser Glu Gly Met Leu Asp

340 345

<210> 25

<211> 1032

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Natronolimnobius innermongolicus (内部盐碱湖菌)

<400> 25

atggcagtgc gggtaggtgt attgggcgct acgggagcag tgggtcaacg gcttatccag 60

ctcctcgagc ctcaccctga attcgaaatt gctgctctca ccgcgtcgga gtcttccgct 120

ggtaaaactt atcgtcaggc ggcgaaatgg cgcgtagact ccccaatccc tgacgacgtc 180

gcagagatga ccgtaagcgc aacggatccc gatgaggttc cggatgacgt agatttgctg 240

ttcagcagct tgccgtcaag cgtcggcgaa caggtagagc ccgctttttg cgaagccgga 300

tacgtgatgt cgtccaattc ttctaatgct cgtatggcgg atgacgtccc acttgttatc 360

ccagaggtaa atgctgaaca tattgatctt cttgaggtcc aacgcgatga acgtggatgg 420

gatggcgcga tggtaaaaaa ccctaattgt tcaactatta cctttgtccc aactcttgcg 480

gcccttgagc agtttggcct ggaggaagtc cacgttgcaa cgctgcaagc ggtgtccggt 540

gcaggttatg atggagtctc ctccatggag atcattgaca atgcaattcc ttatattgga 600

tcggaagaag agaaactgga aacggaatct cgtaagctcc tgggagaatt tgacggcgct 660

gaactgtcgc ataactcagt tgaagtcgca gcttcgtgca accgtatccc gaccattgac 720

ggacacttgg agaacgtgtg ggttgagacc gaagacgacc ttacgcccga agatgccgcg 780

gatgcaatgc gcgcgtatcc atcgttggag cttcgttcat ctccggacca gctgattcat 840

gtctttgatg aaccagaccg cccgcaaccg cggatggacc ggactttggg agacggaatg 900

gcaatcgcgg ctggtggttt gcgtgaatcg actttcgacc ttcaatacaa ttgcttggct 960

cataacacca tccggggtgc agcgggagcc tcggttctga acggagagct gttgttggac 1020

caaggttata tt 1032

<210> 26

<211> 344

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Natronolimnobius innermongolicus (内部盐碱湖菌)

<400> 26

Met Ala Val Arg Val Gly Val Leu Gly Ala Thr Gly Ala Val Gly Gln

1 5 10 15

Arg Leu Ile Gln Leu Leu Glu Pro His Pro Glu Phe Glu Ile Ala Ala

20 25 30

Leu Thr Ala Ser Glu Ser Ser Ala Gly Lys Thr Tyr Arg Gln Ala Ala

35 40 45

Lys Trp Arg Val Asp Ser Pro Ile Pro Asp Asp Val Ala Glu Met Thr

50 55 60

Val Ser Ala Thr Asp Pro Asp Glu Val Pro Asp Asp Val Asp Leu Leu

65 70 75 80

Phe Ser Ser Leu Pro Ser Ser Val Gly Glu Gln Val Glu Pro Ala Phe

85 90 95

Cys Glu Ala Gly Tyr Val Met Ser Ser Asn Ser Ser Asn Ala Arg Met

100 105 110

Ala Asp Asp Val Pro Leu Val Ile Pro Glu Val Asn Ala Glu His Ile

115 120 125

Asp Leu Leu Glu Val Gln Arg Asp Glu Arg Gly Trp Asp Gly Ala Met

130 135 140

Val Lys Asn Pro Asn Cys Ser Thr Ile Thr Phe Val Pro Thr Leu Ala

145 150 155 160

Ala Leu Glu Gln Phe Gly Leu Glu Glu Val His Val Ala Thr Leu Gln

165 170 175

Ala Val Ser Gly Ala Gly Tyr Asp Gly Val Ser Ser Met Glu Ile Ile

180 185 190

Asp Asn Ala Ile Pro Tyr Ile Gly Ser Glu Glu Glu Lys Leu Glu Thr

195 200 205

Glu Ser Arg Lys Leu Leu Gly Glu Phe Asp Gly Ala Glu Leu Ser His

210 215 220

Asn Ser Val Glu Val Ala Ala Ser Cys Asn Arg Ile Pro Thr Ile Asp

225 230 235 240

Gly His Leu Glu Asn Val Trp Val Glu Thr Glu Asp Asp Leu Thr Pro

245 250 255

Glu Asp Ala Ala Asp Ala Met Arg Ala Tyr Pro Ser Leu Glu Leu Arg

260 265 270

Ser Ser Pro Asp Gln Leu Ile His Val Phe Asp Glu Pro Asp Arg Pro

275 280 285

Gln Pro Arg Met Asp Arg Thr Leu Gly Asp Gly Met Ala Ile Ala Ala

290 295 300

Gly Gly Leu Arg Glu Ser Thr Phe Asp Leu Gln Tyr Asn Cys Leu Ala

305 310 315 320

His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ser Val Leu Asn Gly Glu

325 330 335

Leu Leu Leu Asp Gln Gly Tyr Ile

340

<210> 27

<211> 1047

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Chloroflexus aurantiacus (嗜热光合绿曲菌)

<400> 27

atggccacta ttccagtcgc cgttctgggt gccacgggtg ccgtgggtca acggttcatt 60

cagttgcttg agggtcaccc gctttttcag gtagttgccc tgactggcag cgagcgttcc 120

gctggtaaaa aataccacga ggtgtgtcgt tgggttttgg atactcctat gcccgcagcg 180

gttgcaaacc tgacggtact ggatgcagac gcagacctcc ccgcacagct cgtgttctcc 240

gcgctcccgt ctaccgtcgc cggcccgatc gaacaacgtc ttgctgctgc tggtcatatc 300

gtgtgctcca acgcttcgaa ccatcgtatg gagccagatg tgccactcat tattcccgaa 360

gtcaacccgg accatcttgc cttgattccc gttcaacgcc gccgccgtgg ttggtccggt 420

gctattgtta ccaacccaaa ctgcacttcc acgccggcga cgatggtgtt gcgccctttg 480

ctcgatacct ttggagtccg gcgcatgctt ttggtgtcaa tgcaagccct ctctggagcc 540

ggctacccag gtgtgccctc atacgatgta gttgataacg tgatccccta catcggtgga 600

gaagaaccaa aactcgagat tgagccgcag aaaatgctgg gacgtctgga aggagaaacg 660

attgttccag caggcttcac gacttccgca cactgcaatc gggtccctgt gctcgaaggc 720

cacctggttt gtctctcgat cgagcttgaa cggaaagccg accctgccga gatcgcgacg 780

gtgctcagca atttccgtgc actccctcag gaattgcggc tgccgactgc gccagagcag 840

cctatcattg tacgtcacga acccgaccgt cctcaaccgc gccgcgaccg tgatgctgga 900

cggggaatgg ccaccgtagt aggtcgcatt cggccctgca gcctttttga cattaagttg 960

atcgcattgt cacataacac catccggggc gccgccggag cgagcatcct gaacgccgag 1020

cttatgcatg cccaaggttg gctggcg 1047

<210> 28

<211> 349

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Chloroflexus aurantiacus (嗜热光合绿曲菌)

<400> 28

Met Ala Thr Ile Pro Val Ala Val Leu Gly Ala Thr Gly Ala Val Gly

1 5 10 15

Gln Arg Phe Ile Gln Leu Leu Glu Gly His Pro Leu Phe Gln Val Val

20 25 30

Ala Leu Thr Gly Ser Glu Arg Ser Ala Gly Lys Lys Tyr His Glu Val

35 40 45

Cys Arg Trp Val Leu Asp Thr Pro Met Pro Ala Ala Val Ala Asn Leu

50 55 60

Thr Val Leu Asp Ala Asp Ala Asp Leu Pro Ala Gln Leu Val Phe Ser

65 70 75 80

Ala Leu Pro Ser Thr Val Ala Gly Pro Ile Glu Gln Arg Leu Ala Ala

85 90 95

Ala Gly His Ile Val Cys Ser Asn Ala Ser Asn His Arg Met Glu Pro

100 105 110

Asp Val Pro Leu Ile Ile Pro Glu Val Asn Pro Asp His Leu Ala Leu

115 120 125

Ile Pro Val Gln Arg Arg Arg Arg Gly Trp Ser Gly Ala Ile Val Thr

130 135 140

Asn Pro Asn Cys Thr Ser Thr Pro Ala Thr Met Val Leu Arg Pro Leu

145 150 155 160

Leu Asp Thr Phe Gly Val Arg Arg Met Leu Leu Val Ser Met Gln Ala

165 170 175

Leu Ser Gly Ala Gly Tyr Pro Gly Val Pro Ser Tyr Asp Val Val Asp

180 185 190

Asn Val Ile Pro Tyr Ile Gly Gly Glu Glu Pro Lys Leu Glu Ile Glu

195 200 205

Pro Gln Lys Met Leu Gly Arg Leu Glu Gly Glu Thr Ile Val Pro Ala

210 215 220

Gly Phe Thr Thr Ser Ala His Cys Asn Arg Val Pro Val Leu Glu Gly

225 230 235 240

His Leu Val Cys Leu Ser Ile Glu Leu Glu Arg Lys Ala Asp Pro Ala

245 250 255

Glu Ile Ala Thr Val Leu Ser Asn Phe Arg Ala Leu Pro Gln Glu Leu

260 265 270

Arg Leu Pro Thr Ala Pro Glu Gln Pro Ile Ile Val Arg His Glu Pro

275 280 285

Asp Arg Pro Gln Pro Arg Arg Asp Arg Asp Ala Gly Arg Gly Met Ala

290 295 300

Thr Val Val Gly Arg Ile Arg Pro Cys Ser Leu Phe Asp Ile Lys Leu

305 310 315 320

Ile Ala Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ser Ile

325 330 335

Leu Asn Ala Glu Leu Met His Ala Gln Gly Trp Leu Ala

340 345

<210> 29

<211> 1089

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Lactobacillus agilis

<400> 29

atggatgaaa aactccgtgc cggtgttctg ggcgccacgg gtatggtagg acagcggttc 60

gtagcgatgt tggagaatca cccgtggttc gaagtaacca ctcttgcagc ttcgccgcgc 120

tcagcaggta aaacgtacgc acaggctgtg gatggccggt ggaaaatgga aactcccatt 180

ccagaggccg tcaaggatct caagattctt gatgtatcgg aagttgagaa agtcgcagct 240

caagtcgatt ttgtgttttc cgcagtttct atgtccaaag acaagattaa agcgattgaa 300

gaagcctacg cgaaaaccga aactccggta gtatcgaaca attcggcgca ccgttggacc 360

ccagatgttc ctatggtcgt gcccgaaatt aacccggagc atttcaaggt aattgattac 420

cagcggaaac ggctcggcac gaagcgcggc ttcattgccg ttaagccgaa ctgttctatc 480

cagagctacg ccccggctct cagcgcatgg ttgaaattcg aaccgtacga ggtaatcgct 540

tcaacttatc aggctatctc gggagctggt aagaacttcg acgactggcc ggagatgaag 600

ggaaacatca tcccttttat ttctggcgag gaggaaaaat cagagaagga gcccctcaag 660

atctggggac aacttgacga agctaaggga gagatcgtcc cagccactag ccctgttatt 720

acgagccaat gtattcgggt cccgatcctt tacggacaca ccgcgaccgt ctttgttaaa 780

ttcaagcaga acccaacgaa agaggaactg gtagctgctt tggaatcata tcagggactg 840

cctcaatcct tgaatttgcc gtctacccct aagcaattta ttcagtatct cagcgaagac 900

gaccgtccgc aggttgcgaa ggacgttaac tttgagaatg gtatgggtat ctctattggc 960

cgccttcgta aagattcggt ttacgattgg aagttcgtag gactctcgca caacaccgcg 1020

cgtggcgccg caggaggcgg cgtcctttcg gccgaattgc tgacggctca gggctatatt 1080

accaaaaag 1089

<210> 30

<211> 363

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Lactobacillus agilis

<400> 30

Met Asp Glu Lys Leu Arg Ala Gly Val Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Val Ala Met Leu Glu Asn His Pro Trp Phe Glu Val

20 25 30

Thr Thr Leu Ala Ala Ser Pro Arg Ser Ala Gly Lys Thr Tyr Ala Gln

35 40 45

Ala Val Asp Gly Arg Trp Lys Met Glu Thr Pro Ile Pro Glu Ala Val

50 55 60

Lys Asp Leu Lys Ile Leu Asp Val Ser Glu Val Glu Lys Val Ala Ala

65 70 75 80

Gln Val Asp Phe Val Phe Ser Ala Val Ser Met Ser Lys Asp Lys Ile

85 90 95

Lys Ala Ile Glu Glu Ala Tyr Ala Lys Thr Glu Thr Pro Val Val Ser

100 105 110

Asn Asn Ser Ala His Arg Trp Thr Pro Asp Val Pro Met Val Val Pro

115 120 125

Glu Ile Asn Pro Glu His Phe Lys Val Ile Asp Tyr Gln Arg Lys Arg

130 135 140

Leu Gly Thr Lys Arg Gly Phe Ile Ala Val Lys Pro Asn Cys Ser Ile

145 150 155 160

Gln Ser Tyr Ala Pro Ala Leu Ser Ala Trp Leu Lys Phe Glu Pro Tyr

165 170 175

Glu Val Ile Ala Ser Thr Tyr Gln Ala Ile Ser Gly Ala Gly Lys Asn

180 185 190

Phe Asp Asp Trp Pro Glu Met Lys Gly Asn Ile Ile Pro Phe Ile Ser

195 200 205

Gly Glu Glu Glu Lys Ser Glu Lys Glu Pro Leu Lys Ile Trp Gly Gln

210 215 220

Leu Asp Glu Ala Lys Gly Glu Ile Val Pro Ala Thr Ser Pro Val Ile

225 230 235 240

Thr Ser Gln Cys Ile Arg Val Pro Ile Leu Tyr Gly His Thr Ala Thr

245 250 255

Val Phe Val Lys Phe Lys Gln Asn Pro Thr Lys Glu Glu Leu Val Ala

260 265 270

Ala Leu Glu Ser Tyr Gln Gly Leu Pro Gln Ser Leu Asn Leu Pro Ser

275 280 285

Thr Pro Lys Gln Phe Ile Gln Tyr Leu Ser Glu Asp Asp Arg Pro Gln

290 295 300

Val Ala Lys Asp Val Asn Phe Glu Asn Gly Met Gly Ile Ser Ile Gly

305 310 315 320

Arg Leu Arg Lys Asp Ser Val Tyr Asp Trp Lys Phe Val Gly Leu Ser

325 330 335

His Asn Thr Ala Arg Gly Ala Ala Gly Gly Gly Val Leu Ser Ala Glu

340 345 350

Leu Leu Thr Ala Gln Gly Tyr Ile Thr Lys Lys

355 360

<210> 31

<211> 1092

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Bifidobacterium pullorum

<400> 31

atgtccgaga aactgaaggt aggaattatt ggagcgaccg gcatggtggg tcagcggttc 60

gtgactctgt tggataatca cccatggttt gaggtcacca ccttggctgc ctcagcacac 120

tcggccggaa aaacctacga gcaggccgtt ggtggccggt ggaagatgga gacgcctatg 180

ccggcggcgg tgaaggacat gattgtccgg gatgccaagg atgtggagag cgtggctgca 240

gacgtggact tcgtgttctc tgcagtgaac atgccgaagg acgagatccg tgccttggag 300

gagcgctacg ccaagacgga gactcccgtt gtatcaaaca actcggccca ccgttggacg 360

ccagacgtac ccatggtagt ccccgagatt aatcccgaac attatgaagt aatcaagtac 420

cagcgggctc gtcttggtac tacgcgtggc ttcatcgccg tgaaaccaaa ctgctccatc 480

caggcatata cgccggcact cgccgcgtgg cgggagttcg aaccgcgtga agtcgtggta 540

tctacttacc aagcgatttc tggtgctggt aagactttcg cggactggcc agaaatggaa 600

ggcaatatca tccctttcat tagcggtgaa gaggagaagt ccgagcggga accactgcgg 660

gtatttggcc acgtcgatga gagcaagggt cagattgtcc cctttgatgg tccactccgt 720

atcacgtcgc agtgtatccg tgtacccgta ttgaatggtc acactgctac tgtttttatc 780

aacttcggca aaaaaccatc taaggatgaa ctcatcgacc gtcttgtgaa ctacacgtcg 840

gaggcgagcc gtcttggtct ccctcacgct cccaaacagt tcatccaata tctgactgag 900

gatgatcgtc cgcaggtacg tttggacgtt gattacgagg gcggtatggg agtttccatc 960

ggtcgcctgc gcgaggacac gctcttcgac ttcaaattcg tgggactcgc tcataacacg 1020

ctgcgtggag ccgcaggtgg agcacttgaa tccgcagaaa tgttgaaagc actcggatat 1080

atttcggcga aa 1092

<210> 32

<211> 364

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Bifidobacterium pullorum

<400> 32

Met Ser Glu Lys Leu Lys Val Gly Ile Ile Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Val Thr Leu Leu Asp Asn His Pro Trp Phe Glu Val

20 25 30

Thr Thr Leu Ala Ala Ser Ala His Ser Ala Gly Lys Thr Tyr Glu Gln

35 40 45

Ala Val Gly Gly Arg Trp Lys Met Glu Thr Pro Met Pro Ala Ala Val

50 55 60

Lys Asp Met Ile Val Arg Asp Ala Lys Asp Val Glu Ser Val Ala Ala

65 70 75 80

Asp Val Asp Phe Val Phe Ser Ala Val Asn Met Pro Lys Asp Glu Ile

85 90 95

Arg Ala Leu Glu Glu Arg Tyr Ala Lys Thr Glu Thr Pro Val Val Ser

100 105 110

Asn Asn Ser Ala His Arg Trp Thr Pro Asp Val Pro Met Val Val Pro

115 120 125

Glu Ile Asn Pro Glu His Tyr Glu Val Ile Lys Tyr Gln Arg Ala Arg

130 135 140

Leu Gly Thr Thr Arg Gly Phe Ile Ala Val Lys Pro Asn Cys Ser Ile

145 150 155 160

Gln Ala Tyr Thr Pro Ala Leu Ala Ala Trp Arg Glu Phe Glu Pro Arg

165 170 175

Glu Val Val Val Ser Thr Tyr Gln Ala Ile Ser Gly Ala Gly Lys Thr

180 185 190

Phe Ala Asp Trp Pro Glu Met Glu Gly Asn Ile Ile Pro Phe Ile Ser

195 200 205

Gly Glu Glu Glu Lys Ser Glu Arg Glu Pro Leu Arg Val Phe Gly His

210 215 220

Val Asp Glu Ser Lys Gly Gln Ile Val Pro Phe Asp Gly Pro Leu Arg

225 230 235 240

Ile Thr Ser Gln Cys Ile Arg Val Pro Val Leu Asn Gly His Thr Ala

245 250 255

Thr Val Phe Ile Asn Phe Gly Lys Lys Pro Ser Lys Asp Glu Leu Ile

260 265 270

Asp Arg Leu Val Asn Tyr Thr Ser Glu Ala Ser Arg Leu Gly Leu Pro

275 280 285

His Ala Pro Lys Gln Phe Ile Gln Tyr Leu Thr Glu Asp Asp Arg Pro

290 295 300

Gln Val Arg Leu Asp Val Asp Tyr Glu Gly Gly Met Gly Val Ser Ile

305 310 315 320

Gly Arg Leu Arg Glu Asp Thr Leu Phe Asp Phe Lys Phe Val Gly Leu

325 330 335

Ala His Asn Thr Leu Arg Gly Ala Ala Gly Gly Ala Leu Glu Ser Ala

340 345 350

Glu Met Leu Lys Ala Leu Gly Tyr Ile Ser Ala Lys

355 360

<210> 33

<211> 1059

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Bifidobacteriaceae bacterium (细菌双歧杆菌)

<400> 33

atgaagcaat tcaatgtggg aattttggga gcaacgggtg cagttggcca gaaattcatc 60

aatctcctcc agggtcatcc ttggttcacg attacggctc tcggagcatc cgaacgttcc 120

gcgggaaaat cctacgctga agcagttaat tggattgaag ccgttgagtt gcctgacgca 180

attgcctcta tgacggtcac tgattgctct cccgcaagca tgaaaggcgt tgatttcgtg 240

ttttctggtt tggacgcgtc tgtagcgacc gaacttgagg gcgatctcgc tcgggctggt 300

attcccgtga tctcaaatgc taagaactat cgcactcacc cgcatgtccc ccttctggta 360

ccagaggtga acgcgaccca caccgagatg attaaggcac aagattttga tccttccggc 420

cgtggcttta tcgtaacgaa tccaaattgt gtcgcggttc ctctcgtgat ggcgctcaag 480

cctctcatgg acgcgtacgg tatccaggca gtcgccctca cgactatgca atcggtgtct 540

ggtgctggtt accccggagt cgcctctttg gacatcctgg gaaatgtgat cccatttatt 600

tccggcgagg agccgaaaat cgccgcggag cctatgaaat tgttgggccg gctgggagga 660

gaccaaaccg tcaccgaggc ccgtttccct attgacgcta ccgcaactcg tgtgcctacc 720

atcgagggac atcttttgag cgtgaagatt aagttcgaac aaaagccagc gtctgctgac 780

gaaattaagg ctgtgctccg taactggaag cacgaggttt caggtttgga tcttccgtct 840

tctccgcgta ctgcgctcaa agtttacgat gacgatcggt ttccacaacc acgcaaaaac 900

gcttacaacg agaacggaat gcaagtcggc gtgggtcgtg tgcgtatgct cgagtttttt 960

gacgcgggtc ttgttgcatt gggccataat acgtgtcggg gtgcggctgg cgtagctatc 1020

ttgaacgctg agctgctggt aaaacagggt ttcatccaa 1059

<210> 34

<211> 353

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Bifidobacteriaceae bacterium (细菌双歧杆菌)

<400> 34

Met Lys Gln Phe Asn Val Gly Ile Leu Gly Ala Thr Gly Ala Val Gly

1 5 10 15

Gln Lys Phe Ile Asn Leu Leu Gln Gly His Pro Trp Phe Thr Ile Thr

20 25 30

Ala Leu Gly Ala Ser Glu Arg Ser Ala Gly Lys Ser Tyr Ala Glu Ala

35 40 45

Val Asn Trp Ile Glu Ala Val Glu Leu Pro Asp Ala Ile Ala Ser Met

50 55 60

Thr Val Thr Asp Cys Ser Pro Ala Ser Met Lys Gly Val Asp Phe Val

65 70 75 80

Phe Ser Gly Leu Asp Ala Ser Val Ala Thr Glu Leu Glu Gly Asp Leu

85 90 95

Ala Arg Ala Gly Ile Pro Val Ile Ser Asn Ala Lys Asn Tyr Arg Thr

100 105 110

His Pro His Val Pro Leu Leu Val Pro Glu Val Asn Ala Thr His Thr

115 120 125

Glu Met Ile Lys Ala Gln Asp Phe Asp Pro Ser Gly Arg Gly Phe Ile

130 135 140

Val Thr Asn Pro Asn Cys Val Ala Val Pro Leu Val Met Ala Leu Lys

145 150 155 160

Pro Leu Met Asp Ala Tyr Gly Ile Gln Ala Val Ala Leu Thr Thr Met

165 170 175

Gln Ser Val Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Asp Ile

180 185 190

Leu Gly Asn Val Ile Pro Phe Ile Ser Gly Glu Glu Pro Lys Ile Ala

195 200 205

Ala Glu Pro Met Lys Leu Leu Gly Arg Leu Gly Gly Asp Gln Thr Val

210 215 220

Thr Glu Ala Arg Phe Pro Ile Asp Ala Thr Ala Thr Arg Val Pro Thr

225 230 235 240

Ile Glu Gly His Leu Leu Ser Val Lys Ile Lys Phe Glu Gln Lys Pro

245 250 255

Ala Ser Ala Asp Glu Ile Lys Ala Val Leu Arg Asn Trp Lys His Glu

260 265 270

Val Ser Gly Leu Asp Leu Pro Ser Ser Pro Arg Thr Ala Leu Lys Val

275 280 285

Tyr Asp Asp Asp Arg Phe Pro Gln Pro Arg Lys Asn Ala Tyr Asn Glu

290 295 300

Asn Gly Met Gln Val Gly Val Gly Arg Val Arg Met Leu Glu Phe Phe

305 310 315 320

Asp Ala Gly Leu Val Ala Leu Gly His Asn Thr Cys Arg Gly Ala Ala

325 330 335

Gly Val Ala Ile Leu Asn Ala Glu Leu Leu Val Lys Gln Gly Phe Ile

340 345 350

Gln

<210> 35

<211> 1086

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Myxococcus hansupus

<400> 35

atggctcgtt tgcgtgctgc gttgatcggc gcgacgggac tcgccggtca acagttcatt 60

gcggccctca aagaccaccc ttttattgaa ttgactggac tggcagcgtc gccacggtcg 120

gcgggcaaaa cgtacgctga agccctcaag actgcatcag gtatgactgc atggttcgta 180

ccagaacccc tgccagccgg cattgctggt atgaaagttg ttgcgggaga cgccctcgag 240

gcgaaagatt atgaccttgt tttctctgct gtagaagcgg acgtagcccg cgagcttgag 300

ccaaaactgg cgaaagacat tccagtattt tcggctgcta gcgcgttccg ctacgaagat 360

gacgtaccac tgcttatccc ccccgttaac gccgcacacg cgcctctgat tcgtgaacag 420

cagcgtcgtc gtggttggaa aggttatgtt gttccaatcc caaattgcac gaccaccggc 480

cttgcggtta cgctcgcgcc tcttgtcgag cggtttggag tcaaggctgt cttgatgacc 540

tcacttcaag caatgagcgg agcgggacgg tctcctggcg tgatcggcat ggacattctt 600

gataacgtga ttccgtatat ccccaaagag gaacataaag tagaagtgga gactaagaaa 660

attcttggtg ctcttcgtcc tggtggcgaa ggccttacgc cccacgatat ccgcgtctca 720

tgcacctgca ctcgggtcgc ggtcatggaa ggccatactg aatcagtttt tgtttctctg 780

gaaaaaaaag ctactgttgc agaggttacc caagcgttgc gtgaatggca gggcgcggaa 840

cttgcacgga aattgccgtc cgccgcaccg cgttggattg aagtgcttga tgaccccttc 900

cgcccacaac cgcgtcttga ccgggacacg cacggtggaa tggctaccac ggtgggtcgg 960

attcgtgagg acggtgtttt ggagaacgga tttaagtacg ttttggtttc tcacaacact 1020

aaaatgggag ctgctcgcgg cgcgattttg gtagcagaac tgcttcgggc tcaaggcttg 1080

cttgga 1086

<210> 36

<211> 362

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Myxococcus hansupus

<400> 36

Met Ala Arg Leu Arg Ala Ala Leu Ile Gly Ala Thr Gly Leu Ala Gly

1 5 10 15

Gln Gln Phe Ile Ala Ala Leu Lys Asp His Pro Phe Ile Glu Leu Thr

20 25 30

Gly Leu Ala Ala Ser Pro Arg Ser Ala Gly Lys Thr Tyr Ala Glu Ala

35 40 45

Leu Lys Thr Ala Ser Gly Met Thr Ala Trp Phe Val Pro Glu Pro Leu

50 55 60

Pro Ala Gly Ile Ala Gly Met Lys Val Val Ala Gly Asp Ala Leu Glu

65 70 75 80

Ala Lys Asp Tyr Asp Leu Val Phe Ser Ala Val Glu Ala Asp Val Ala

85 90 95

Arg Glu Leu Glu Pro Lys Leu Ala Lys Asp Ile Pro Val Phe Ser Ala

100 105 110

Ala Ser Ala Phe Arg Tyr Glu Asp Asp Val Pro Leu Leu Ile Pro Pro

115 120 125

Val Asn Ala Ala His Ala Pro Leu Ile Arg Glu Gln Gln Arg Arg Arg

130 135 140

Gly Trp Lys Gly Tyr Val Val Pro Ile Pro Asn Cys Thr Thr Thr Gly

145 150 155 160

Leu Ala Val Thr Leu Ala Pro Leu Val Glu Arg Phe Gly Val Lys Ala

165 170 175

Val Leu Met Thr Ser Leu Gln Ala Met Ser Gly Ala Gly Arg Ser Pro

180 185 190

Gly Val Ile Gly Met Asp Ile Leu Asp Asn Val Ile Pro Tyr Ile Pro

195 200 205

Lys Glu Glu His Lys Val Glu Val Glu Thr Lys Lys Ile Leu Gly Ala

210 215 220

Leu Arg Pro Gly Gly Glu Gly Leu Thr Pro His Asp Ile Arg Val Ser

225 230 235 240

Cys Thr Cys Thr Arg Val Ala Val Met Glu Gly His Thr Glu Ser Val

245 250 255

Phe Val Ser Leu Glu Lys Lys Ala Thr Val Ala Glu Val Thr Gln Ala

260 265 270

Leu Arg Glu Trp Gln Gly Ala Glu Leu Ala Arg Lys Leu Pro Ser Ala

275 280 285

Ala Pro Arg Trp Ile Glu Val Leu Asp Asp Pro Phe Arg Pro Gln Pro

290 295 300

Arg Leu Asp Arg Asp Thr His Gly Gly Met Ala Thr Thr Val Gly Arg

305 310 315 320

Ile Arg Glu Asp Gly Val Leu Glu Asn Gly Phe Lys Tyr Val Leu Val

325 330 335

Ser His Asn Thr Lys Met Gly Ala Ala Arg Gly Ala Ile Leu Val Ala

340 345 350

Glu Leu Leu Arg Ala Gln Gly Leu Leu Gly

355 360

<210> 37

<211> 1083

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Paenibacillus azotofixans

<400> 37

atgacggaga aattgcgtgc tggcatcgtc ggcggaactg gaatggtcgg ccagcgcttt 60

attgcgcttc ttgagaatca cccttggttt caggtaaccg ctattgccgc tagcgccaac 120

tctgcgggta aaacgtatga ggaatccgta aaaggccggt ggaagctctc tacgccaatg 180

cctgaaagcg tcaagcacat tccagtgcag gacgcgtcac gtgtcgagga agtagccgca 240

ggcgtggatt tgatcttttg cgcggtcgat atgaaaaaga atgaaatcca ggcactcgag 300

gaagcctatg ccaaagcggg tgtccccgtc atcagcaaca actccgcaca tcggtggact 360

ccagacgttc cgatggtcgt tccagaaatc aacccagaac acctggaggt cattgcagct 420

cagcggaaac gcctgggaac cgaaactggc ttcattgcgg taaagcctaa ttgcagcatc 480

cagtcttatg ttccaatgct gaacgcactt cggggcttta agcctactca agttgtcgca 540

tccacttatc aggcgatttc tggtgccggt aaaacgttca cggattggcc cgaaatgctg 600

gacaacgtaa tcccttacat tggaggtgag gaggaaaaaa gcgaacaaga gccgcttcgc 660

atttggggta ctgtagagga tggccaaatt gttaaagcct ccgcacccca tattacgacg 720

caatgcatcc gggtaccagt gactgacggt cacctggcca ctgttttcgt tagcttcgag 780

aataaaccct caaaggaaga cattctcgaa tcctggaaaa attacaaggg tcggccgcaa 840

gagcttgaac ttccgtcagc acccaaacaa ttcatcactt acttcgaaga ggaaaatcgg 900

ccacagacca acctcgaccg cgacatcgaa aatggaatgg gcatttccgc tggccgcctc 960

cgggaggata gcctttatga ctttaaattc gttggactct cacataacac tctgcgcgga 1020

gctgctggtg gtgcggtact gatcgcagag ttgctcaagg cagagggcta cattactaag 1080

cgc 1083

<210> 38

<211> 361

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Paenibacillus azotofixans

<400> 38

Met Thr Glu Lys Leu Arg Ala Gly Ile Val Gly Gly Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Ala Leu Leu Glu Asn His Pro Trp Phe Gln Val

20 25 30

Thr Ala Ile Ala Ala Ser Ala Asn Ser Ala Gly Lys Thr Tyr Glu Glu

35 40 45

Ser Val Lys Gly Arg Trp Lys Leu Ser Thr Pro Met Pro Glu Ser Val

50 55 60

Lys His Ile Pro Val Gln Asp Ala Ser Arg Val Glu Glu Val Ala Ala

65 70 75 80

Gly Val Asp Leu Ile Phe Cys Ala Val Asp Met Lys Lys Asn Glu Ile

85 90 95

Gln Ala Leu Glu Glu Ala Tyr Ala Lys Ala Gly Val Pro Val Ile Ser

100 105 110

Asn Asn Ser Ala His Arg Trp Thr Pro Asp Val Pro Met Val Val Pro

115 120 125

Glu Ile Asn Pro Glu His Leu Glu Val Ile Ala Ala Gln Arg Lys Arg

130 135 140

Leu Gly Thr Glu Thr Gly Phe Ile Ala Val Lys Pro Asn Cys Ser Ile

145 150 155 160

Gln Ser Tyr Val Pro Met Leu Asn Ala Leu Arg Gly Phe Lys Pro Thr

165 170 175

Gln Val Val Ala Ser Thr Tyr Gln Ala Ile Ser Gly Ala Gly Lys Thr

180 185 190

Phe Thr Asp Trp Pro Glu Met Leu Asp Asn Val Ile Pro Tyr Ile Gly

195 200 205

Gly Glu Glu Glu Lys Ser Glu Gln Glu Pro Leu Arg Ile Trp Gly Thr

210 215 220

Val Glu Asp Gly Gln Ile Val Lys Ala Ser Ala Pro His Ile Thr Thr

225 230 235 240

Gln Cys Ile Arg Val Pro Val Thr Asp Gly His Leu Ala Thr Val Phe

245 250 255

Val Ser Phe Glu Asn Lys Pro Ser Lys Glu Asp Ile Leu Glu Ser Trp

260 265 270

Lys Asn Tyr Lys Gly Arg Pro Gln Glu Leu Glu Leu Pro Ser Ala Pro

275 280 285

Lys Gln Phe Ile Thr Tyr Phe Glu Glu Glu Asn Arg Pro Gln Thr Asn

290 295 300

Leu Asp Arg Asp Ile Glu Asn Gly Met Gly Ile Ser Ala Gly Arg Leu

305 310 315 320

Arg Glu Asp Ser Leu Tyr Asp Phe Lys Phe Val Gly Leu Ser His Asn

325 330 335

Thr Leu Arg Gly Ala Ala Gly Gly Ala Val Leu Ile Ala Glu Leu Leu

340 345 350

Lys Ala Glu Gly Tyr Ile Thr Lys Arg

355 360

<210> 39

<211> 1032

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Corynebacterium glutamicum

<400> 39

atgaccacta tcgcggtcgt tggagcaacg ggacaagtag gacaggtgat gcggacgctt 60

ctggaagaac gtaattttcc tgccgatacg gtccggttct ttgcgtcgcc gcggagcgcc 120

ggtcggaaga tcgagttccg gggtaccgaa attgaggtag aggacatcac ccaagcgacc 180

gaggagtctc tcaaagatat tgatgtagca cttttttctg caggcggtac cgcgtcgaag 240

caatatgctc ctctgttcgc ggctgcgggt gcgacggtgg tggacaattc ttcggcctgg 300

cggaaagatg atgaagtacc gttgattgtc tctgaagtaa atccttcgga caaagattct 360

ctcgtgaagg gtatcattgc gaaccctaac tgtaccacca tggctgcaat gcctgtactg 420

aaaccacttc atgatgccgc aggtcttgta aagcttcatg tctcctcgta tcaagcggta 480

tccggtagcg gtctcgcagg cgtcgaaacc ctcgcaaaac aggtcgctgc tgttggtgac 540

cataacgtcg agttcgtcca cgacggtcag gccgccgacg caggagatgt tggcccatac 600

gtcagcccca tcgcttataa tgttttgccc ttcgcaggta acctggttga cgacggaacc 660

tttgagaccg acgaggagca aaaactgcgc aatgaaagcc gtaagatcct cggactgccg 720

gacttgaaag tttccggtac gtgtgtacgt gttcccgtgt ttactggaca taccttgact 780

atccatgctg agttcgataa agcaattacc gtggaccaag ctcaggagat cctcggagca 840

gcgtcgggag taaagttggt agacgtaccg actcctctgg cggctgcggg tattgatgag 900

tcgcttgtag gacgcattcg ccaagactcg acggtggatg acaaccgcgg actcgttctc 960

gtggtatctg gcgacaatct tcggaagggc gcggctttga ataccatcca gatcgccgag 1020

cttctggtaa ag 1032

<210> 40

<211> 344

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized asd from Corynebacterium glutamicum

<400> 40

Met Thr Thr Ile Ala Val Val Gly Ala Thr Gly Gln Val Gly Gln Val

1 5 10 15

Met Arg Thr Leu Leu Glu Glu Arg Asn Phe Pro Ala Asp Thr Val Arg

20 25 30

Phe Phe Ala Ser Pro Arg Ser Ala Gly Arg Lys Ile Glu Phe Arg Gly

35 40 45

Thr Glu Ile Glu Val Glu Asp Ile Thr Gln Ala Thr Glu Glu Ser Leu

50 55 60

Lys Asp Ile Asp Val Ala Leu Phe Ser Ala Gly Gly Thr Ala Ser Lys

65 70 75 80

Gln Tyr Ala Pro Leu Phe Ala Ala Ala Gly Ala Thr Val Val Asp Asn

85 90 95

Ser Ser Ala Trp Arg Lys Asp Asp Glu Val Pro Leu Ile Val Ser Glu

100 105 110

Val Asn Pro Ser Asp Lys Asp Ser Leu Val Lys Gly Ile Ile Ala Asn

115 120 125

Pro Asn Cys Thr Thr Met Ala Ala Met Pro Val Leu Lys Pro Leu His

130 135 140

Asp Ala Ala Gly Leu Val Lys Leu His Val Ser Ser Tyr Gln Ala Val

145 150 155 160

Ser Gly Ser Gly Leu Ala Gly Val Glu Thr Leu Ala Lys Gln Val Ala

165 170 175

Ala Val Gly Asp His Asn Val Glu Phe Val His Asp Gly Gln Ala Ala

180 185 190

Asp Ala Gly Asp Val Gly Pro Tyr Val Ser Pro Ile Ala Tyr Asn Val

195 200 205

Leu Pro Phe Ala Gly Asn Leu Val Asp Asp Gly Thr Phe Glu Thr Asp

210 215 220

Glu Glu Gln Lys Leu Arg Asn Glu Ser Arg Lys Ile Leu Gly Leu Pro

225 230 235 240

Asp Leu Lys Val Ser Gly Thr Cys Val Arg Val Pro Val Phe Thr Gly

245 250 255

His Thr Leu Thr Ile His Ala Glu Phe Asp Lys Ala Ile Thr Val Asp

260 265 270

Gln Ala Gln Glu Ile Leu Gly Ala Ala Ser Gly Val Lys Leu Val Asp

275 280 285

Val Pro Thr Pro Leu Ala Ala Ala Gly Ile Asp Glu Ser Leu Val Gly

290 295 300

Arg Ile Arg Gln Asp Ser Thr Val Asp Asp Asn Arg Gly Leu Val Leu

305 310 315 320

Val Val Ser Gly Asp Asn Leu Arg Lys Gly Ala Ala Leu Asn Thr Ile

325 330 335

Gln Ile Ala Glu Leu Leu Val Lys

340

<210> 41

<211> 1341

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized gdh from Corynebacterium glutamicum

<400> 41

atgactgtag atgaacaggt ttctaactac tacgacatgc ttctcaaacg taatgctgga 60

gagcccgaat ttcatcaggc ggttgctgaa gtgcttgaat ccctcaagat tgttcttgaa 120

aaagatccgc actacgcgga ctatggcctc atccagcggc tgtgtgaacc tgaacgtcaa 180

ctgatcttcc gtgtgccgtg ggtagatgat cagggacaag tgcacgtcaa ccgcggtttt 240

cgtgtacagt ttaattcggc gctcggtccc tacaaaggcg gattgcgttt ccaccctagc 300

gtcaatcttg gcatcgtcaa gtttttgggt ttcgaacaaa tttttaagaa ttcccttacc 360

ggactgccta tcggaggcgg aaagggcggt tcggattttg accctaaagg caagagcgat 420

ctcgaaatca tgcggttttg tcagtctttt atgaccgaac tgcatcgtca catcggcgaa 480

tatcgcgatg tcccggcggg tgatatcggc gtgggtggtc gtgagatcgg atacctcttt 540

ggtcattatc gtcggatggc gaatcagcac gaatcgggag tccttaccgg caaaggtctg 600

acttggggcg gcagcctggt tcggaccgaa gccacgggat acggttgtgt ctatttcgta 660

tcggagatga tcaaagcaaa aggcgagtca atctcgggac agaagattat cgtatccgga 720

tcgggaaatg ttgctaccta tgccattgag aaagctcaag agctgggcgc gacggtgatc 780

ggcttctcgg attcctcagg ctgggtgcat actccgaatg gtgtggacgt ggctaaactt 840

cgcgaaatca aggaagtacg tcgcgcacgc gtaagcgttt atgccgatga agtggaggga 900

gcaacctacc ataccgatgg atccatctgg gatcttaagt gtgacatcgc acttccttgc 960

gctacgcaaa atgaactgaa cggagagaat gcgaaaacgc tggccgataa tggttgccgc 1020

ttcgtcgcgg agggcgctaa catgccgagc accccggagg ccgtcgaagt ttttcgggag 1080

cgcgacatcc ggttcggccc cggcaaagcg gctaatgctg gcggagtggc aacgtcagcg 1140

ttggagatgc agcagaacgc atcccgggac tcatggagct tcgaatacac cgacgaacgc 1200

ctccaggtca ttatgaagaa catttttaag acgtgtgcgg aaaccgcagc cgagtatggc 1260

cacgagaacg attacgtcgt cggagcaaac attgcaggat ttaagaaagt tgctgatgcg 1320

atgctcgccc aaggtgtgat c 1341

<210> 42

<211> 447

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized gdh from Corynebacterium glutamicum

<400> 42

Met Thr Val Asp Glu Gln Val Ser Asn Tyr Tyr Asp Met Leu Leu Lys

1 5 10 15

Arg Asn Ala Gly Glu Pro Glu Phe His Gln Ala Val Ala Glu Val Leu

20 25 30

Glu Ser Leu Lys Ile Val Leu Glu Lys Asp Pro His Tyr Ala Asp Tyr

35 40 45

Gly Leu Ile Gln Arg Leu Cys Glu Pro Glu Arg Gln Leu Ile Phe Arg

50 55 60

Val Pro Trp Val Asp Asp Gln Gly Gln Val His Val Asn Arg Gly Phe

65 70 75 80

Arg Val Gln Phe Asn Ser Ala Leu Gly Pro Tyr Lys Gly Gly Leu Arg

85 90 95

Phe His Pro Ser Val Asn Leu Gly Ile Val Lys Phe Leu Gly Phe Glu

100 105 110

Gln Ile Phe Lys Asn Ser Leu Thr Gly Leu Pro Ile Gly Gly Gly Lys

115 120 125

Gly Gly Ser Asp Phe Asp Pro Lys Gly Lys Ser Asp Leu Glu Ile Met

130 135 140

Arg Phe Cys Gln Ser Phe Met Thr Glu Leu His Arg His Ile Gly Glu

145 150 155 160

Tyr Arg Asp Val Pro Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Ile

165 170 175

Gly Tyr Leu Phe Gly His Tyr Arg Arg Met Ala Asn Gln His Glu Ser

180 185 190

Gly Val Leu Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Leu Val Arg

195 200 205

Thr Glu Ala Thr Gly Tyr Gly Cys Val Tyr Phe Val Ser Glu Met Ile

210 215 220

Lys Ala Lys Gly Glu Ser Ile Ser Gly Gln Lys Ile Ile Val Ser Gly

225 230 235 240

Ser Gly Asn Val Ala Thr Tyr Ala Ile Glu Lys Ala Gln Glu Leu Gly

245 250 255

Ala Thr Val Ile Gly Phe Ser Asp Ser Ser Gly Trp Val His Thr Pro

260 265 270

Asn Gly Val Asp Val Ala Lys Leu Arg Glu Ile Lys Glu Val Arg Arg

275 280 285

Ala Arg Val Ser Val Tyr Ala Asp Glu Val Glu Gly Ala Thr Tyr His

290 295 300

Thr Asp Gly Ser Ile Trp Asp Leu Lys Cys Asp Ile Ala Leu Pro Cys

305 310 315 320

Ala Thr Gln Asn Glu Leu Asn Gly Glu Asn Ala Lys Thr Leu Ala Asp

325 330 335

Asn Gly Cys Arg Phe Val Ala Glu Gly Ala Asn Met Pro Ser Thr Pro

340 345 350

Glu Ala Val Glu Val Phe Arg Glu Arg Asp Ile Arg Phe Gly Pro Gly

355 360 365

Lys Ala Ala Asn Ala Gly Gly Val Ala Thr Ser Ala Leu Glu Met Gln

370 375 380

Gln Asn Ala Ser Arg Asp Ser Trp Ser Phe Glu Tyr Thr Asp Glu Arg

385 390 395 400

Leu Gln Val Ile Met Lys Asn Ile Phe Lys Thr Cys Ala Glu Thr Ala

405 410 415

Ala Glu Tyr Gly His Glu Asn Asp Tyr Val Val Gly Ala Asn Ile Ala

420 425 430

Gly Phe Lys Lys Val Ala Asp Ala Met Leu Ala Gln Gly Val Ile

435 440 445

<210> 43

<211> 1350

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized gdh from Clostridium symbiosum

<400> 43

atgtccaagt acgttgaccg cgtcattgct gaagtcgaga aaaagtacgc cgacgaaccg 60

gaattcgttc aaaccgttga agaggtactc tcttcactcg gcccagtagt cgacgcacac 120

cccgagtatg aagaggttgc gctcttggag cgtatggtca ttccagaacg tgtcattgag 180

tttcgcgtcc cgtgggagga tgacaatggt aaagtacatg tgaatactgg ttaccgcgtc 240

caatttaatg gcgcgatcgg cccttataaa ggtggcttgc gcttcgcccc ttcggtcaac 300

ctttccatta tgaaatttct cggcttcgag caagcattca aagattccct gaccacgctt 360

cctatgggag gagcaaaagg cggttcagac ttcgacccaa acggaaaatc cgatcgcgaa 420

gtaatgcgct tctgccaggc gttcatgact gagttgtatc ggcatattgg tcccgatatc 480

gacgtgcctg ctggtgactt gggcgttggt gcgcgtgaaa ttggttacat gtacggacaa 540

taccggaaga tcgtcggcgg attctacaat ggcgtcctga ccggtaaagc ccggtcattc 600

ggtggaagct tggtccggcc cgaagcaact ggttacggat cggtgtatta tgtggaggct 660

gtgatgaaac atgaaaatga cacgcttgta ggtaaaactg ttgcactggc aggttttggt 720

aacgttgcat ggggtgcagc taagaagctc gcggagttgg gtgcgaaagc agtaactttg 780

tctggcccgg atggctatat ctacgacccc gagggtatca ctaccgagga aaagatcaat 840

tacatgcttg aaatgcgggc gtctggacgt aacaaggtac aggattacgc agacaagttt 900

ggagtgcaat tctttccggg tgaaaagcct tggggccaaa aagttgacat tattatgcct 960

tgtgcaactc agaatgatgt tgacctggaa caggctaaaa agatcgtggc gaacaacgtg 1020

aagtactaca tcgaagtagc caacatgcct actactaatg aagcattgcg gtttcttatg 1080

cagcaaccta acatggtagt cgcccccagc aaggctgtga acgcaggtgg agtactggta 1140

tcgggtttcg agatgtcaca aaattccgaa cgtctgtcat ggaccgccga agaagtcgat 1200

agcaaactgc atcaggtgat gactgacatt catgacggtt cagccgccgc agctgaacgc 1260

tacggacttg gttacaatct tgtcgcaggt gctaatatcg taggttttca gaagatcgcc 1320

gatgccatga tggctcaagg aatcgcttgg 1350

<210> 44

<211> 450

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized gdh from Clostridium symbiosum

<400> 44

Met Ser Lys Tyr Val Asp Arg Val Ile Ala Glu Val Glu Lys Lys Tyr

1 5 10 15

Ala Asp Glu Pro Glu Phe Val Gln Thr Val Glu Glu Val Leu Ser Ser

20 25 30

Leu Gly Pro Val Val Asp Ala His Pro Glu Tyr Glu Glu Val Ala Leu

35 40 45

Leu Glu Arg Met Val Ile Pro Glu Arg Val Ile Glu Phe Arg Val Pro

50 55 60

Trp Glu Asp Asp Asn Gly Lys Val His Val Asn Thr Gly Tyr Arg Val

65 70 75 80

Gln Phe Asn Gly Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe Ala

85 90 95

Pro Ser Val Asn Leu Ser Ile Met Lys Phe Leu Gly Phe Glu Gln Ala

100 105 110

Phe Lys Asp Ser Leu Thr Thr Leu Pro Met Gly Gly Ala Lys Gly Gly

115 120 125

Ser Asp Phe Asp Pro Asn Gly Lys Ser Asp Arg Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Thr Glu Leu Tyr Arg His Ile Gly Pro Asp Ile

145 150 155 160

Asp Val Pro Ala Gly Asp Leu Gly Val Gly Ala Arg Glu Ile Gly Tyr

165 170 175

Met Tyr Gly Gln Tyr Arg Lys Ile Val Gly Gly Phe Tyr Asn Gly Val

180 185 190

Leu Thr Gly Lys Ala Arg Ser Phe Gly Gly Ser Leu Val Arg Pro Glu

195 200 205

Ala Thr Gly Tyr Gly Ser Val Tyr Tyr Val Glu Ala Val Met Lys His

210 215 220

Glu Asn Asp Thr Leu Val Gly Lys Thr Val Ala Leu Ala Gly Phe Gly

225 230 235 240

Asn Val Ala Trp Gly Ala Ala Lys Lys Leu Ala Glu Leu Gly Ala Lys

245 250 255

Ala Val Thr Leu Ser Gly Pro Asp Gly Tyr Ile Tyr Asp Pro Glu Gly

260 265 270

Ile Thr Thr Glu Glu Lys Ile Asn Tyr Met Leu Glu Met Arg Ala Ser

275 280 285

Gly Arg Asn Lys Val Gln Asp Tyr Ala Asp Lys Phe Gly Val Gln Phe

290 295 300

Phe Pro Gly Glu Lys Pro Trp Gly Gln Lys Val Asp Ile Ile Met Pro

305 310 315 320

Cys Ala Thr Gln Asn Asp Val Asp Leu Glu Gln Ala Lys Lys Ile Val

325 330 335

Ala Asn Asn Val Lys Tyr Tyr Ile Glu Val Ala Asn Met Pro Thr Thr

340 345 350

Asn Glu Ala Leu Arg Phe Leu Met Gln Gln Pro Asn Met Val Val Ala

355 360 365

Pro Ser Lys Ala Val Asn Ala Gly Gly Val Leu Val Ser Gly Phe Glu

370 375 380

Met Ser Gln Asn Ser Glu Arg Leu Ser Trp Thr Ala Glu Glu Val Asp

385 390 395 400

Ser Lys Leu His Gln Val Met Thr Asp Ile His Asp Gly Ser Ala Ala

405 410 415

Ala Ala Glu Arg Tyr Gly Leu Gly Tyr Asn Leu Val Ala Gly Ala Asn

420 425 430

Ile Val Gly Phe Gln Lys Ile Ala Asp Ala Met Met Ala Gln Gly Ile

435 440 445

Ala Trp

450

<210> 45

<211> 744

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized dapB from Corynebacterium glutamicum

<400> 45

atgggcatta aagttggagt gctgggagct aagggccggg taggtcagac gatcgtggca 60

gcagtgaacg aatcagacga tctcgagttg gtagcagaaa tcggtgtgga tgacgatctg 120

tctctgctcg tagacaacgg cgcggaggtc gttgttgact tcactacgcc taatgcggtg 180

atgggaaact tggagttctg tatcaacaac ggaatctccg cagtagtagg aaccaccgga 240

tttgatgatg cacgcctcga acaagtacgg gactggctgg aaggtaagga caacgtcgga 300

gtcttgattg cccctaactt tgcgatttca gcagtgctta ccatggtgtt ctctaaacag 360

gcggcgcgct ttttcgaatc cgcagaagtt atcgaacttc accacccgaa taaacttgac 420

gccccttctg gcactgcgat tcatactgct caaggtattg cagcggctcg taaagaggca 480

ggtatggatg cacagcccga tgcaactgag caagcccttg agggtagccg tggcgcgtct 540

gttgatggta ttccggtaca tgccgttcgc atgtcaggca tggtcgcaca tgaacaagta 600

atctttggca cgcaaggcca aacgcttact attaaacaag atagctacga tcgtaactct 660

ttcgcgccgg gtgttttggt tggagtccgc aatatcgcac agcatcccgg attggtggtt 720

ggccttgagc attaccttgg attg 744

<210> 46

<211> 248

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized dapB from Corynebacterium glutamicum

<400> 46

Met Gly Ile Lys Val Gly Val Leu Gly Ala Lys Gly Arg Val Gly Gln

1 5 10 15

Thr Ile Val Ala Ala Val Asn Glu Ser Asp Asp Leu Glu Leu Val Ala

20 25 30

Glu Ile Gly Val Asp Asp Asp Leu Ser Leu Leu Val Asp Asn Gly Ala

35 40 45

Glu Val Val Val Asp Phe Thr Thr Pro Asn Ala Val Met Gly Asn Leu

50 55 60

Glu Phe Cys Ile Asn Asn Gly Ile Ser Ala Val Val Gly Thr Thr Gly

65 70 75 80

Phe Asp Asp Ala Arg Leu Glu Gln Val Arg Asp Trp Leu Glu Gly Lys

85 90 95

Asp Asn Val Gly Val Leu Ile Ala Pro Asn Phe Ala Ile Ser Ala Val

100 105 110

Leu Thr Met Val Phe Ser Lys Gln Ala Ala Arg Phe Phe Glu Ser Ala

115 120 125

Glu Val Ile Glu Leu His His Pro Asn Lys Leu Asp Ala Pro Ser Gly

130 135 140

Thr Ala Ile His Thr Ala Gln Gly Ile Ala Ala Ala Arg Lys Glu Ala

145 150 155 160

Gly Met Asp Ala Gln Pro Asp Ala Thr Glu Gln Ala Leu Glu Gly Ser

165 170 175

Arg Gly Ala Ser Val Asp Gly Ile Pro Val His Ala Val Arg Met Ser

180 185 190

Gly Met Val Ala His Glu Gln Val Ile Phe Gly Thr Gln Gly Gln Thr

195 200 205

Leu Thr Ile Lys Gln Asp Ser Tyr Asp Arg Asn Ser Phe Ala Pro Gly

210 215 220

Val Leu Val Gly Val Arg Asn Ile Ala Gln His Pro Gly Leu Val Val

225 230 235 240

Gly Leu Glu His Tyr Leu Gly Leu

245

<210> 47

<211> 819

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized dapB from Escherichia coli

<400> 47

atgcacgacg caaacattcg ggtcgccatt gcgggagctg gaggacgtat gggacgccag 60

ctcatccagg cggcgcttgc cctcgaaggc gtgcaattgg gagcagctct ggaacgcgag 120

ggctcttcac tcttgggctc tgatgccggc gagctggctg gtgccggcaa aacgggcgta 180

acggtccagt cttctctcga cgccgtaaag gatgattttg atgtgtttat tgactttacg 240

cgcccggagg gaactctgaa ccatctggca ttctgccggc agcatggtaa gggcatggtt 300

atcggaacca ccggatttga tgaggctgga aaacaggcga ttcgggatgc cgctgccgat 360

attgctatcg tattcgcagc aaacttcagc gtaggcgtta acgttatgct caaactgctg 420

gagaaggcag ctaaggtgat gggtgactat acggacattg agattattga agctcatcat 480

cgtcacaaag tagacgctcc ttcaggaacc gcgctggcaa tgggcgaagc aattgctcat 540

gcgttggaca aagacctcaa agactgcgcg gtgtattcac gggagggaca tactggtgaa 600

cgtgttcctg gtacgattgg ttttgccacc gtccgtgcag gcgacattgt gggagaacat 660

acggccatgt tcgcagacat cggtgaacgt cttgagatca cccacaaggc tagctcgcgg 720

atgacgttcg caaacggagc ggttcggtcc gccctgtggc tgtctggcaa agaatctgga 780

ctcttcgaca tgcgggacgt gttggacctt aacaatttg 819

<210> 48

<211> 273

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized dapB from Escherichia coli

<400> 48

Met His Asp Ala Asn Ile Arg Val Ala Ile Ala Gly Ala Gly Gly Arg

1 5 10 15

Met Gly Arg Gln Leu Ile Gln Ala Ala Leu Ala Leu Glu Gly Val Gln

20 25 30

Leu Gly Ala Ala Leu Glu Arg Glu Gly Ser Ser Leu Leu Gly Ser Asp

35 40 45

Ala Gly Glu Leu Ala Gly Ala Gly Lys Thr Gly Val Thr Val Gln Ser

50 55 60

Ser Leu Asp Ala Val Lys Asp Asp Phe Asp Val Phe Ile Asp Phe Thr

65 70 75 80

Arg Pro Glu Gly Thr Leu Asn His Leu Ala Phe Cys Arg Gln His Gly

85 90 95

Lys Gly Met Val Ile Gly Thr Thr Gly Phe Asp Glu Ala Gly Lys Gln

100 105 110

Ala Ile Arg Asp Ala Ala Ala Asp Ile Ala Ile Val Phe Ala Ala Asn

115 120 125

Phe Ser Val Gly Val Asn Val Met Leu Lys Leu Leu Glu Lys Ala Ala

130 135 140

Lys Val Met Gly Asp Tyr Thr Asp Ile Glu Ile Ile Glu Ala His His

145 150 155 160

Arg His Lys Val Asp Ala Pro Ser Gly Thr Ala Leu Ala Met Gly Glu

165 170 175

Ala Ile Ala His Ala Leu Asp Lys Asp Leu Lys Asp Cys Ala Val Tyr

180 185 190

Ser Arg Glu Gly His Thr Gly Glu Arg Val Pro Gly Thr Ile Gly Phe

195 200 205

Ala Thr Val Arg Ala Gly Asp Ile Val Gly Glu His Thr Ala Met Phe

210 215 220

Ala Asp Ile Gly Glu Arg Leu Glu Ile Thr His Lys Ala Ser Ser Arg

225 230 235 240

Met Thr Phe Ala Asn Gly Ala Val Arg Ser Ala Leu Trp Leu Ser Gly

245 250 255

Lys Glu Ser Gly Leu Phe Asp Met Arg Asp Val Leu Asp Leu Asn Asn

260 265 270

Leu

<210> 49

<211> 1266

<212> DNA

<213> Artificial Sequence

<220>

<223> Codon-optimized aspK from Corynebacterium glutamicum

<400> 49

atggccctgg tcgtacagaa atatggcggt tcctcgcttg agagtgcgga acgcattaga 60

aacgtcgctg aacggatcgt tgccaccaag aaggctggaa atgatgtcgt ggttgtctgc 120

tccgcaatgg gagacaccac ggatgaactt ctagaacttg cagcggcagt gaatcccgtt 180

ccgccagctc gtgaaatgga tatgctcctg actgctggtg agcgtatttc taacgctctc 240

gtcgccatgg ctattgagtc ccttggcgca gaagctcaat ctttcactgg ctctcaggct 300

ggtgtgctca ccaccgagcg ccacggaaac gcacgcattg ttgacgtcac accgggtcgt 360

gtgcgtgaag cactcgatga gggcaagatc tgcattgttg ctggttttca gggtgttaat 420

aaagaaaccc gcgatgtcac cacgttgggt cgtggtggtt ctgacaccac tgcagttgcg 480

ttggcagctg ctttgaacgc tgatgtgtgt gagatttact cggacgttga cggtgtgtat 540

accgctgacc cgcgcatcgt tcctaatgca cagaagctgg aaaagctcag cttcgaagaa 600

atgctggaac ttgctgctgt tggctccaag attttggtgc tgcgcagtgt tgaatacgct 660

cgtgcattca atgtgccact tcgcgtacgc tcgtcttata gtaatgatcc cggcactttg 720

attgccggct ctatggagga tattcctgtg gaagaagcag tccttaccgg tgtcgcaacc 780

gacaagtccg aagccaaagt aaccgttctg ggtatttccg ataagccagg cgagactgcc 840

aaggttttcc gtgcgttggc tgatgcagaa atcaacattg acatggttct gcagaacgtc 900

ttctctgtgg aagacggcac caccgacatc acgttcacct gccctcgcgc tgacggacgc 960

cgtgcgatgg agatcttgaa gaagcttcag gttcagggca actggaccaa tgtgctttac 1020

gacgaccagg tcggcaaagt ctccctcgtg ggtgctggca tgaagtctca cccaggtgtt 1080

accgcagagt tcatggaagc tctgcgcgat gtcaacgtga acatcgaatt gatttccacc 1140

tctgagatcc gcatttccgt gctgatccgt gaagatgatc tggatgctgc tgcacgtgca 1200

ttgcatgagc agttccagct gggcggcgaa gacgaagccg tcgtttatgc aggcaccgga 1260

cgctaa 1266

<210> 50

<211> 421

<212> PRT

<213> Artificial Sequence

<220>

<223> Codon-optimized aspK from Corynebacterium glutamicum

<400> 50

Met Ala Leu Val Val Gln Lys Tyr Gly Gly Ser Ser Leu Glu Ser Ala

1 5 10 15

Glu Arg Ile Arg Asn Val Ala Glu Arg Ile Val Ala Thr Lys Lys Ala

20 25 30

Gly Asn Asp Val Val Val Val Cys Ser Ala Met Gly Asp Thr Thr Asp

35 40 45

Glu Leu Leu Glu Leu Ala Ala Ala Val Asn Pro Val Pro Pro Ala Arg

50 55 60

Glu Met Asp Met Leu Leu Thr Ala Gly Glu Arg Ile Ser Asn Ala Leu

65 70 75 80

Val Ala Met Ala Ile Glu Ser Leu Gly Ala Glu Ala Gln Ser Phe Thr

85 90 95

Gly Ser Gln Ala Gly Val Leu Thr Thr Glu Arg His Gly Asn Ala Arg

100 105 110

Ile Val Asp Val Thr Pro Gly Arg Val Arg Glu Ala Leu Asp Glu Gly

115 120 125

Lys Ile Cys Ile Val Ala Gly Phe Gln Gly Val Asn Lys Glu Thr Arg

130 135 140

Asp Val Thr Thr Leu Gly Arg Gly Gly Ser Asp Thr Thr Ala Val Ala

145 150 155 160

Leu Ala Ala Ala Leu Asn Ala Asp Val Cys Glu Ile Tyr Ser Asp Val

165 170 175

Asp Gly Val Tyr Thr Ala Asp Pro Arg Ile Val Pro Asn Ala Gln Lys

180 185 190

Leu Glu Lys Leu Ser Phe Glu Glu Met Leu Glu Leu Ala Ala Val Gly

195 200 205

Ser Lys Ile Leu Val Leu Arg Ser Val Glu Tyr Ala Arg Ala Phe Asn

210 215 220

Val Pro Leu Arg Val Arg Ser Ser Tyr Ser Asn Asp Pro Gly Thr Leu

225 230 235 240

Ile Ala Gly Ser Met Glu Asp Ile Pro Val Glu Glu Ala Val Leu Thr

245 250 255

Gly Val Ala Thr Asp Lys Ser Glu Ala Lys Val Thr Val Leu Gly Ile

260 265 270

Ser Asp Lys Pro Gly Glu Thr Ala Lys Val Phe Arg Ala Leu Ala Asp

275 280 285

Ala Glu Ile Asn Ile Asp Met Val Leu Gln Asn Val Phe Ser Val Glu

290 295 300

Asp Gly Thr Thr Asp Ile Thr Phe Thr Cys Pro Arg Ala Asp Gly Arg

305 310 315 320

Arg Ala Met Glu Ile Leu Lys Lys Leu Gln Val Gln Gly Asn Trp Thr

325 330 335

Asn Val Leu Tyr Asp Asp Gln Val Gly Lys Val Ser Leu Val Gly Ala

340 345 350

Gly Met Lys Ser His Pro Gly Val Thr Ala Glu Phe Met Glu Ala Leu

355 360 365

Arg Asp Val Asn Val Asn Ile Glu Leu Ile Ser Thr Ser Glu Ile Arg

370 375 380

Ile Ser Val Leu Ile Arg Glu Asp Asp Leu Asp Ala Ala Ala Arg Ala

385 390 395 400

Leu His Glu Gln Phe Gln Leu Gly Gly Glu Asp Glu Ala Val Val Tyr

405 410 415

Ala Gly Thr Gly Arg

420

<210> 51

<211> 9972

<212> DNA

<213> Artificial Sequence

<220>

<223> Plasmid backbone sequence

<400> 51

gtctgctcac aaatctcagc gaccgcattg atgaaggaaa tcttggtggc caagaacgca 60

ttcgcggaaa ctttcaccag ctcagcggta gcaagatcag tgaccaaaaa cggcgtatca 120

gcagcaatcg cggtggcgta aacctcccga gcgatcgcct ctgctgtcgc cacctcacgc 180

acacccacca cgatgcggtc cggagtgatg gtgtctttga ccgcgtagcc ctcacgcaag 240

aactccggat tccacgcgat ctccacgtgc gaaccaggct tgaccagaga atcagcaagc 300

tcctgcaact gctcagcggt accaaccgga accgtagact tgccgaaaat aatgtgctcg 360

ccctcaagca gcggcaccaa atcctcaaca acctgacgaa catacgtcag atccgccgca 420

taagtaccct tctgctgagg agtacccacg cccaagaaat gcacctgcgc gaaagccgca 480

gcctccgcat aaccagtagt gaagttcagg cgaccatttt ccagattgcg ctccaaaacc 540

tcaggcaaac ccggctcaaa aaatgggacc ttgctgtcct tcaacgacgc aatctttacc 600

tcatcgacat cgacaccaag aacctcatgg ccaagctcag ccatgcaggc cgcgtgcgta 660

gcgccaaggt aacccgtacc aatcactgtc atccgcatgt agggtgattc ctttcaatga 720

agagtggact ggagattatc tcaacacgtt ttgatacagc ccgcgaccgg aacacatgat 780

tgcttacttg ttggggaaat tcaggtacgc cttcgaagga gtaggaccac gctgcccctg 840

atacttcgaa ccaagcttgc cggaaccata cggagtctcc gcaggggaac tcatctggaa 900

caaagccaac tgccccacct tcatacccgg ccacaacgtg atcggcagat tagccacatt 960

ggacaactcc aacgtgatgt aaccgctaaa accaggatca atgaaaccag cagtagagtg 1020

cgtcaacagt ccaagacgac caagagacga cttgccctcc aaacgaccag ccagatgcgc 1080

aggcaaagtg aacttttcca gcgtggacgc cagcacaaac tcacccggat gcagcacaaa 1140

gccctcgccg tcctcaacct caacaaggct ggtcagctca tcctgattca acttagggtc 1200

aatgtgggtg tacttagagt tattgaaaac ccggaagtag cggtccatgc ggacatcgac 1260

actcgacggc tgaatcagct cagcgtcgaa aggttcaatt cccaagtcgc ctgcgtcaat 1320

tgatttacga atgtcacgat ctgaaagaag cacgtcaacc agtgtagcgt tcagcgttca 1380

gggttgggcc acgggttgct gcgatgaggt tcctggggcg cgtggtgctg tcgctgattt 1440

ttatcgtgct agccttattc tcggccctta attaagcgcc accttttaca ctgccggtgt 1500

agttcaatgg tagaactcct gcttcccaag caggcggcgc gggttcgatt cccgtcaccg 1560

gctccaaata agcccctgac ctctcaatac gcaatgaagg tcaagggctt aattctatgg 1620

aaactacaaa aagtacccac ttaaatacac gctttaaatc ccctcacgat gcgatcatca 1680

gcccttttac atttttagaa aaagctttgc agttttccat cgcagccgaa aaccgctcca 1740

gtgcgaaatt gcacactaca tcacaccgaa cactcgacga catctttaat tttcaacata 1800

tccccacccc cagaaattat tcaccactta cacttcacat actcaccaat gcataaccca 1860

aaaagcgtta gatgaaactc cccacccgaa tccacaagaa ctcgggtgcc ctcagtttca 1920

cataccccta agcgcaacac tgtgcgagct ttccgccagt aggccaagca ccctttcgat 1980

taaccccgac aaacttttaa ggcaagccta aattaggtaa accttaaaca gtcgccattg 2040

aagaaattga tgccgtttct cgcgttgtgt gtggtactac gtggggacct aagcgtgtat 2100

tatggaaacg tctgtatcgg ataagtagcg aggagtgttc gttaaaaatg gccctggtcg 2160

tacagaaata tggcggttcc tcgcttgaga gtgcggaacg cattagaaac gtcgctgaac 2220

ggatcgttgc caccaagaag gctggaaatg atgtcgtggt tgtctgctcc gcaatgggag 2280

acaccacgga tgaacttcta gaacttgcag cggcagtgaa tcccgttccg ccagctcgtg 2340

aaatggatat gctcctgact gctggtgagc gtatttctaa cgctctcgtc gccatggcta 2400

ttgagtccct tggcgcagaa gctcaatctt tcactggctc tcaggctggt gtgctcacca 2460

ccgagcgcca cggaaacgca cgcattgttg acgtcacacc gggtcgtgtg cgtgaagcac 2520

tcgatgaggg caagatctgc attgttgctg gttttcaggg tgttaataaa gaaacccgcg 2580

atgtcaccac gttgggtcgt ggtggttctg acaccactgc agttgcgttg gcagctgctt 2640

tgaacgctga tgtgtgtgag atttactcgg acgttgacgg tgtgtatacc gctgacccgc 2700

gcatcgttcc taatgcacag aagctggaaa agctcagctt cgaagaaatg ctggaacttg 2760

ctgctgttgg ctccaagatt ttggtgctgc gcagtgttga atacgctcgt gcattcaatg 2820

tgccacttcg cgtacgctcg tcttatagta atgatcccgg cactttgatt gccggctcta 2880

tggaggatat tcctgtggaa gaagcagtcc ttaccggtgt cgcaaccgac aagtccgaag 2940

ccaaagtaac cgttctgggt atttccgata agccaggcga gactgccaag gttttccgtg 3000

cgttggctga tgcagaaatc aacattgaca tggttctgca gaacgtctcc tctgtggaag 3060

acggcaccac cgacatcttg ttcacctgcc ctcgcgctga cggacgccgt gcgatggaga 3120

tcttgaagaa gcttcaggtt cagggcaact ggaccaatgt gctttacgac gaccaggtcg 3180

gcaaagtctc cctcgtgggt gctggcatga agtctcaccc aggtgttacc gcagagttca 3240

tggaagctct gcgcgatgtc aacgtgaaca tcgaattgat ttccacctct gagatccgca 3300

tttccgtgct gatccgtgaa gatgatctgg atgctgctgc acgtgcattg catgagcagt 3360

tccagctggg cggcgaagac gaagccgtcg tttatgcagg caccggacgc taatagagtt 3420

ttaaaggagt agttttacaa tgtccaaggg agaaaaaatg aagatcaagg ttggcgtatt 3480

gggtgctacc ggatcggttg gccaacgctt tgtgcagctg cttgcagacc accccatgtt 3540

cgaattgact gctctggcag caagcgaacg gtccgcgggt aaaaaataca aagatgcttg 3600

ttactggttt caagatcggg acattccaga aaatattaag gatatggttg taattccgac 3660

ggatccgaag cacgaagaat tcgaagacgt tgatattgtt tttagcgcgc tgccctcgga 3720

tctggctaaa aaattcgaac ccgaattcgc gaaagaagga aagctgatct tcagcaacgc 3780

atcagcctat cgtatggagg aagatgtgcc gcttgtaatt ccagaggtaa acgctgatca 3840

cctcgaattg attgaaattc agcgcgagaa gcggggttgg gacggagcca ttatcactaa 3900

cccaaactgt tcaaccattt gcgccgtaat cacccttaag ccaattatgg acaaattcgg 3960

tcttgaagcg gtgtttatcg ctaccatgca ggctgtatcg ggcgcaggat acaacggtgt 4020

cccgagcatg gctattctgg ataacttgat tccctttatt aagaatgagg aggagaagat 4080

gcagactgaa tcgcttaagt tgctgggcac gcttaaggat ggaaaagtgg aactcgctaa 4140

cttcaaaatc agcgcatcat gcaatcgtgt ggctgtgatc gacggccaca ccgaatcgat 4200

cttcgtgaag accaaggagg gtgcggaacc tgaggaaatt aaagaagtga tggacaaatt 4260

tgatcctctt aaagacctta accttccgac gtatgccaaa ccaatcgtaa ttcgcgaaga 4320

gatcgatcgc ccacagccac gtcttgaccg caatgagggt aatggcatgt ctattgtcgt 4380

tggtcgtatc cgtaaagatc cgatttttga tgttaagtac accgccctgg aacataacac 4440

tatccgtggc gccgcgggcg catcagtgtt gaatgcggag tatttcgtaa agaaatacat 4500

ctaggcattt ttagtacgtg caataaccac tctggttttt ccagggtggt tttttgatgc 4560

cctttttgga gtcttcaact gcttagcttt gacctgcaca aatagttgca aattgtccca 4620

catacacata aagtagcttg cgtatttaaa attatgaacc taaggggttt agcaatgact 4680

gtagatgaac aggtttctaa ctactacgac atgcttctca aacgtaatgc tggagagccc 4740

gaatttcatc aggcggttgc tgaagtgctt gaatccctca agattgttct tgaaaaagat 4800

ccgcactacg cggactatgg cctcatccag cggctgtgtg aacctgaacg tcaactgatc 4860

ttccgtgtgc cgtgggtaga tgatcaggga caagtgcacg tcaaccgcgg ttttcgtgta 4920

cagtttaatt cggcgctcgg tccctacaaa ggcggattgc gtttccaccc tagcgtcaat 4980

cttggcatcg tcaagttttt gggtttcgaa caaattttta agaattccct taccggactg 5040

cctatcggag gcggaaaggg cggttcggat tttgacccta aaggcaagag cgatctcgaa 5100

atcatgcggt tttgtcagtc ttttatgacc gaactgcatc gtcacatcgg cgaatatcgc 5160

gatgtcccgg cgggtgatat cggcgtgggt ggtcgtgaga tcggatacct ctttggtcat 5220

tatcgtcgga tggcgaatca gcacgaatcg ggagtcctta ccggcaaagg tctgacttgg 5280

ggcggcagcc tggttcggac cgaagccacg ggatacggtt gtgtctattt cgtatcggag 5340

atgatcaaag caaaaggcga gtcaatctcg ggacagaaga ttatcgtatc cggatcggga 5400

aatgttgcta cctatgccat tgagaaagct caagagctgg gcgcgacggt gatcggcttc 5460

tcggattcct caggctgggt gcatactccg aatggtgtgg acgtggctaa acttcgcgaa 5520

atcaaggaag tacgtcgcgc acgcgtaagc gtttatgccg atgaagtgga gggagcaacc 5580

taccataccg atggatccat ctgggatctt aagtgtgaca tcgcacttcc ttgcgctacg 5640

caaaatgaac tgaacggaga gaatgcgaaa acgctggccg ataatggttg ccgcttcgtc 5700

gcggagggcg ctaacatgcc gagcaccccg gaggccgtcg aagtttttcg ggagcgcgac 5760

atccggttcg gccccggcaa agcggctaat gctggcggag tggcaacgtc agcgttggag 5820

atgcagcaga acgcatcccg ggactcatgg agcttcgaat acaccgacga acgcctccag 5880

gtcattatga agaacatttt taagacgtgt gcggaaaccg cagccgagta tggccacgag 5940

aacgattacg tcgtcggagc aaacattgca ggatttaaga aagttgctga tgcgatgctc 6000

gcccaaggtg tgatctaggc ttttcgacgt ctcctccggc gaaacccaaa aaaggaaccc 6060

tcacagttcg tgagggttcc ttttactatt gtctacaatc caaggtaatg ctcaaggcca 6120

accaccaatc cgggatgctg tgcgatattg cggactccaa ccaaaacacc cggcgcgaaa 6180

gagttacgat cgtagctatc ttgtttaata gtaagcgttt ggccttgcgt gccaaagatt 6240

acttgttcat gtgcgaccat gcctgacatg cgaacggcat gtaccggaat accatcaaca 6300

gacgcgccac ggctaccctc aagggcttgc tcagttgcat cgggctgtgc atccatacct 6360

gcctctttac gagccgctgc aataccttga gcagtatgaa tcgcagtgcc agaaggggcg 6420

tcaagtttat tcgggtggtg aagttcgata acttctgcgg attcgaaaaa gcgcgccgcc 6480

tgtttagaga acaccatggt aagcactgct gaaatcgcaa agttaggggc aatcaagact 6540

ccgacgttgt ccttaccttc cagccagtcc cgtacttgtt cgaggcgtgc atcatcaaat 6600

ccggtggttc ctactactgc ggagattccg ttgttgatac agaactccaa gtttcccatc 6660

accgcattag gcgtagtgaa gtcaacaacg acctccgcgc cgttgtctac gagcagagac 6720

agatcgtcat ccacaccgat ttctgctacc aactcgagat cgtctgattc gttcactgct 6780

gccacgatcg tctgacctac ccggccctta gctcccagca ctccaacttt aatgcccatt 6840

gtaaaactac tcctttaaaa ctctacacat cccgggcaat caagtcgtcc aagttttccg 6900

ggctcaacaa atatggggca acttccagaa cggtaaaggc acctgattgg ccttgttgtt 6960

tcatgcggtg cgctgcacgg ccgaacgcga tttgagagga cgcagtgaag tcgggattac 7020

ggtccagctt gagaatatac tcaacggtgt ggttaaaacc gccagtgtcg ccagtcgtga 7080

tcacgtgacc gccgtggggc atacccgtat gttcagagtc gaaggttgct tcatcgatga 7140

aattgacttc cacctcatag cctacgaagt aatcgggcat agtccgaata tcgttctcga 7200

tgcgctcatg gtcggccgcg tcggccacaa cgaagcactg gcgtttatgc gtttgcttgc 7260

cggtcaaatc tccggcctcg ccgcgccgtg ctttttcaag cgcatcttca gaaggcaacg 7320

tgtactgaac agccttttga acacccggga tgcggcgaag ggcatcgcta tggccctggc 7380

taagtcctgg accccaaaac gtgtgttgct ggtgctcggc caaaacagca gcggcataaa 7440

cacgattaat cgaaaacatg cctgggtccc agcccgtaga aaccaacgct acattgcctg 7500

ctgcggttgc agcttcattc ataacttggc gatgccgggg gatatcacga tgattatcgt 7560

acgtgtccac cgtacaggca aattgagcga acttgggggc ttgctccggg atatccgttg 7620

cagatcccat gcacaaaaag agtacatcga catcatcggc atgtttatca acgtccgcca 7680

catcgaagac gggggtcttc gtatcgagag tcgcccggcg cgagaaaatt ccaacaagat 7740

ccatgtctgg ttgcttggcg atgagttttt ctacgctgcg tcccagatta ccgtatccga 7800

cgattgctac acggatgttc gtcattttta acgaacactc ctcgctactt atccgataca 7860

gacgtttcca taatacacgc ttaggtcccc acgtagtacc acacacaacg cgagaaacgg 7920

caagaatttt aaaaacaaac aaccttcaac gcgctaacaa gcatcttccc actctcgtta 7980

ccggagtttc tcacatgtct cacaagtttt cccgccgcgc tttcgcagta ctgaccgctg 8040

ccgcgatttc cacttccgct ttcgcaacca ctgctccgtc tgcgattgca gaaccagttt 8100

ccatagtcac caccgcagac gattctagcg tcgcaacttc agaaaactcc cttgactggg 8160

gtttcaagtc ttcctggcgc acctatgtca ccggaccttg gactggtgga accgttgacg 8220

caactggcgg tgcaactgtc aacgaagatg gaacctacaa cttcaccctc ggaactggct 8280

ccacttacga catcgacacc gagaagggcc agctgaacta cgaaggaact gttgccttcg 8340

ccagtgacgc tcacggcttc aacatcacct tgtccaaccc gcagatcacc gtcgagggcg 8400

acactgcaac tttgagcgcc gagctgtctg acaatgccgc tccagaagag acctccacta 8460

ctcgcgttga tgtcgctgag ttcgagctga ctgctcctgc ggtttcagaa accgatgcaa 8520

acaccactta cacttggatc gatgtttccg gcactttcct agaatccctg ccgcctgaag 8580

aattgagccg ttacgcaggc caggaagcgg atgcgctgag cttctccatc accgtggaca 8640

aggcttcaga gaacccttcc gatgatgttg ctaccggatc ttcctccagc ttcctctcca 8700

ccatcttgaa cttccttcag cagctggcga gcccactact caagctcttc ggttcgcttt 8760

cttcctaaat aatcagtaat gccccaccag atctggtggg gcattttgtt ttaggagcag 8820

accacgtttg gtgaaggatc gtaaaccgtg gtgacggttt ctctggtgat ttcgttgcca 8880

gaaagatcgc tgatgattcg ggtgtcggag gtggtaaatc ctggtgcacc ggttgatggc 8940

acacaatctg aacccgatac tcgaactgtg ttgggctggg tggtggacca acgtccgttg 9000

ttgatggatt ccacggaggt ggtgtccaca cccatgatgc gcacggtcac gtttgaggcg 9060

tcggcagagg tgctgatcat gacggggtat ggggagttgt tgcggaattg aaggtcgatg 9120

gcaccatcga aaatagtagc ttcacgtccg gctgggtagc gggaaatgta gtagctgtgc 9180

ggggtgtgcg tgatgtcttc cagacctgcg aagtagtacg cgttgtacaa ggtggtggcg 9240

aactgactga tgccaccgcc gactgcggtg tcggaacgac cattcaaaat gatgccggaa 9300

tcaacaaagc cttgggctgc gccacgtggg ccggtgtagt tgttgaggga gaacgtatcg 9360

ccaggtgaaa cgactgcgcc gtcgaccatt tgcgcggtga ggcggatgtt tgttccggag 9420

gcagcagaga agccgccggt ggtgaactcg cccatgacct cattgaaggt agcgttttgg 9480

gcgtcggtgg cggtgaatgt tgctggggtg tcctcataga cagcgtcgat ggtgcggggg 9540

ccatcgccgg tgaggttgtt gggcagatcg gccagggttt cttcccagtt gattccgtgt 9600

ccggtgactt ctggggtgac tacgcgggag cctgaggaga aactgatttg agcgttggtg 9660

ggctcgatct ctgtttcttt gaggccttcg gccagcattg ctgtggctgc ttctgcattg 9720

atgtcgacgc ggatggtgcc gttttcttct gggaaactca ccacttcacc catgcgctcg 9780

acggggatgg tgccttcaat tccgtcatcg cctcggacga cgaaggggct agatacggct 9840

ttagctccgg cgccttctgc aagttcatcg atggtgtctt ggctgatcgc agcgggaaca 9900

acgtatggct cggcttctac gccctctggg ttgagccagt tttctgtgac tgcttgttcc 9960

aaaacggtgc gg 9972

<210> 52

<211> 97

<212> DNA

<213> Artificial Sequence

<220>

<223> Promoter sequence

<400> 52

tgccgtttct cgcgttgtgt gtggtactac gtggggacct aagcgtgtat tatggaaacg 60

tctgtatcgg ataagtagcg aggagtgttc gttaaaa 97

<210> 53

<211> 26

<212> DNA

<213> Artificial Sequence

<220>

<223> Promoter sequence

<400> 53

tagagtttta aaggagtagt tttaca 26

<210> 54

<211> 173

<212> DNA

<213> Artificial Sequence

<220>

<223> Promoter sequence

<400> 54

taggcatttt tagtacgtgc aataaccact ctggtttttc cagggtggtt ttttgatgcc 60

ctttttggag tcttcaactg cttagctttg acctgcacaa atagttgcaa attgtcccac 120

atacacataa agtagcttgc gtatttaaaa ttatgaacct aaggggttta gca 173

<210> 55

<211> 80

<212> DNA

<213> Artificial Sequence

<220>

<223> Promoter sequence

<400> 55

taggcttttc gacgtctcct ccggcgaaac ccaaaaaagg aaccctcaca gttcgtgagg 60

gttcctttta ctattgtcta 80

<210> 56

<211> 26

<212> DNA

<213> Artificial Sequence

<220>

<223> Promoter sequence

<400> 56

tgtaaaacta ctcctttaaa actcta 26

<210> 57

<211> 97

<212> DNA

<213> Artificial Sequence

<220>

<223> Promoter sequence

<400> 57

ttttaacgaa cactcctcgc tacttatccg atacagacgt ttccataata cacgcttagg 60

tccccacgta gtaccacaca caacgcgaga aacggca 97

<210> 58

<211> 334

<212> PRT

<213> Corynebacterium glutamicum

<400> 58

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Asp Leu Thr Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Pro

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 59

<211> 97

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P1 derived from Pcg0007_lib_39

<400> 59

tgccgtttct cgcgttgtgt gtggtactac gtggggacct aagcgtgtat tatggaaacg 60

tctgtatcgg ataagtagcg aggagtgttc gttaaaa 97

<210> 60

<211> 97

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P2 derived from Pcg0007

<400> 60

tgccgtttct cgcgttgtgt gtggtactac gtggggacct aagcgtgtaa gatggaaacg 60

tctgtatcgg ataagtagcg aggagtgttc gttaaaa 97

<210> 61

<211> 93

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P3 derived from Pcg1860

<400> 61

cttagctttg acctgcacaa atagttgcaa attgtcccac atacacataa agtagcttgc 60

gtatttaaaa ttatgaacct aaggggttta gca 93

<210> 62

<211> 98

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P4 derived from Pcg0755

<400> 62

aataaattta taccacacag tctattgcaa tagaccaagc tgttcagtag ggtgcatggg 60

agaagaattt cctaataaaa actcttaagg acctccaa 98

<210> 63

<211> 97

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P5 derived from Pcg0007_265

<400> 63

tgccgtttct cgcgttgtgt gtggtactac gtggggacct aagcgtgtac gctggaaacg 60

tctgtatcgg ataagtagcg aggagtgttc gttaaaa 97

<210> 64

<211> 86

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P6 derived from Pcg3381

<400> 64

cgccggataa atgaattgat tattttaggc tcccagggat taagtctagg gtggaatgca 60

gaaatatttc ctacggaagg tccgtt 86

<210> 65

<211> 97

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P7 derived from Pcg0007_119

<400> 65

tgccgtttct cgcgttgtgt gtggtactac gtggggacct aagcgtgttg catggaaacg 60

tctgtatcgg ataagtagcg aggagtgttc gttaaaa 97

<210> 66

<211> 87

<212> DNA

<213> Artificial Sequence

<220>

<223> Expression promoter P8 derived from Pcg312

<400> 66

gtggctaaaa cttttggaaa cttaagttac ctttaatcgg aaacttattg aattcgggtg 60

aggcaactgc aactctggac ttaaagc 87

<210> 67

<211> 331

<212> PRT

<213> Escherichia coli

<400> 67

Met Thr Ile Lys Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Ile

1 5 10 15

Val Phe Arg Ala Ala Gln Lys Arg Ser Asp Ile Glu Ile Val Ala Ile

20 25 30

Asn Asp Leu Leu Asp Ala Asp Tyr Met Ala Tyr Met Leu Lys Tyr Asp

35 40 45

Ser Thr His Gly Arg Phe Asp Gly Thr Val Glu Val Lys Asp Gly His

50 55 60

Leu Ile Val Asn Gly Lys Lys Ile Arg Val Thr Ala Glu Arg Asp Pro

65 70 75 80

Ala Asn Leu Lys Trp Asp Glu Val Gly Val Asp Val Val Ala Glu Ala

85 90 95

Thr Gly Leu Phe Leu Thr Asp Glu Thr Ala Arg Lys His Ile Thr Ala

100 105 110

Gly Ala Lys Lys Val Val Met Thr Gly Pro Ser Lys Asp Asn Thr Pro

115 120 125

Met Phe Val Lys Gly Ala Asn Phe Asp Lys Tyr Ala Gly Gln Asp Ile

130 135 140

Val Ser Asn Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro Leu Ala Lys

145 150 155 160

Val Ile Asn Asp Asn Phe Gly Ile Ile Glu Gly Leu Met Thr Thr Val

165 170 175

His Ala Thr Thr Ala Thr Gln Lys Thr Val Asp Gly Pro Ser His Lys

180 185 190

Asp Trp Arg Gly Gly Arg Gly Ala Ser Gln Asn Ile Ile Pro Ser Ser

195 200 205

Thr Gly Ala Ala Lys Ala Val Gly Lys Val Leu Pro Glu Leu Asn Gly

210 215 220

Lys Leu Thr Gly Met Ala Phe Arg Val Pro Thr Pro Asn Val Ser Val

225 230 235 240

Val Asp Leu Thr Val Arg Leu Glu Lys Ala Ala Thr Tyr Glu Gln Ile

245 250 255

Lys Ala Ala Val Lys Ala Ala Ala Glu Gly Glu Met Lys Gly Val Leu

260 265 270

Gly Tyr Thr Glu Asp Asp Val Val Ser Thr Asp Phe Asn Gly Glu Val

275 280 285

Cys Thr Ser Val Phe Asp Ala Lys Ala Gly Ile Ala Leu Asn Asp Asn

290 295 300

Phe Val Lys Leu Val Ser Trp Tyr Asp Asn Glu Thr Gly Tyr Ser Asn

305 310 315 320

Lys Val Leu Asp Leu Ile Ala His Ile Ser Lys

325 330

<210> 68

<211> 996

<212> DNA

<213> Escherichia coli

<400> 68

atgactatca aagtaggtat caacggtttt ggccgtatcg gtcgcattgt tttccgtgct 60

gctcagaaac gttctgacat cgagatcgtt gcaatcaacg acctgttaga cgctgattac 120

atggcataca tgctgaaata tgactccact cacggccgtt tcgacggtac cgttgaagtg 180

aaagacggtc atctgatcgt taacggtaaa aaaatccgtg ttaccgctga acgtgatccg 240

gctaacctga aatgggacga agttggtgtt gacgttgtcg ctgaagcaac tggtctgttc 300

ctgactgacg aaactgctcg taaacacatc accgctggtg cgaagaaagt ggttatgact 360

ggtccgtcta aagacaacac tccgatgttc gttaaaggcg ctaacttcga caaatatgct 420

ggccaggaca tcgtttccaa cgcttcctgc accaccaact gcctggctcc gctggctaaa 480

gttatcaacg ataacttcgg catcatcgaa ggtctgatga ccaccgttca cgctactacc 540

gctactcaga aaaccgttga tggcccgtct cacaaagact ggcgcggcgg ccgcggcgct 600

tcccagaaca tcatcccgtc ctctaccggt gctgctaaag ctgtaggtaa agtactgcca 660

gaactgaatg gcaaactgac tggtatggcg ttccgcgttc cgaccccgaa cgtatctgta 720

gttgacctga ccgttcgtct ggaaaaagct gcaacttacg agcagatcaa agctgccgtt 780

aaagctgctg ctgaaggcga aatgaaaggc gttctgggct acaccgaaga tgacgtagta 840

tctaccgatt tcaacggcga agtttgcact tccgtgttcg atgctaaagc tggtatcgct 900

ctgaacgaca acttcgtgaa actggtatcc tggtacgaca acgaaaccgg ttactccaac 960

aaagttctgg acctgatcgc tcacatctcc aaataa 996

<210> 69

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> D35G L36T mutant gapAv5 from Corynebacterium glutamicum

<400> 69

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Thr Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Pro

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 70

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> D35G L36T mutant gapAv5 from Corynebacterium glutamicum

<400> 70

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccac cgacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcacctcacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 71

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> L36T T37K mutant gapAv7 from Corynebacterium glutamicum

<400> 71

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Asp Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Pro

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 72

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> L36T T37K mutant gapAv7 from Corynebacterium glutamicum

<400> 72

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acgacaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcacctcacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 73

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> D35G L36T T37K mutant gapAv8 from Corynebacterium glutamicum

<400> 73

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Pro

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 74

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> D35G L36T T37K mutant gapAv8 from Corynebacterium glutamicum

<400> 74

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcacctcacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 75

<211> 63

<212> DNA

<213> Artificial Sequence

<220>

<223> pMB085 promoter

<400> 75

accgtgcgtg tttacaattt tacctctggc ggtgataatg gttgcatgta ctaaggaggt 60

tgt 63

<210> 76

<211> 5118

<212> DNA

<213> Escherichia coli

<400> 76

ggcttttaga gcaacgagac acggcaatgt tgcaccgttt gctgcatgat attgaaaaaa 60

atatcaccaa ataaaaaacg ccttagtaag tatttttcag cttttcattc tgactgcaac 120

gggcaatatg tctctgtgtg gattaaaaaa agagtgtctg atagcagctt ctgaactggt 180

tacctgccgt gagtaaatta aaattttatt gacttaggtc actaaatact ttaaccaata 240

taggcatagc gcacagacag ataaaaatta cagagtacac aacatccatg aaacgcatta 300

gcaccaccat taccaccacc atcaccatta ccacaggtaa cggtgcgggc tgacgcgtac 360

aggaaacaca gaaaaaagcc cgcacctgac agtgcgggct ttttttttcg accaaaggta 420

acgaggtaac aaccatgcga gtgttgaagt tcggcggtac atcagtggca aatgcagaac 480

gttttctgcg tgttgccgat attctggaaa gcaatgccag gcaggggcag gtggccaccg 540

tcctctctgc ccccgccaaa atcaccaacc acctggtggc gatgattgaa aaaaccatta 600

gcggccagga tgctttaccc aatatcagcg atgccgaacg tatttttgcc gaacttttga 660

cgggactcgc cgccgcccag ccggggttcc cgctggcgca attgaaaact ttcgtcgatc 720

aggaatttgc ccaaataaaa catgtcctgc atggcattag tttgttgggg cagtgcccgg 780

atagcatcaa cgctgcgctg atttgccgtg gcgagaaaat gtcgatcgcc attatggccg 840

gcgtattaga agcgcgcggt cacaacgtta ctgttatcga tccggtcgaa aaactgctgg 900

cagtggggca ttacctcgaa tctaccgtcg atattgctga gtccacccgc cgtattgcgg 960

caagccgcat tccggctgat cacatggtgc tgatggcagg tttcaccgcc ggtaatgaaa 1020

aaggcgaact ggtggtgctt ggacgcaacg gttccgacta ctctgctgcg gtgctggctg 1080

cctgtttacg cgccgattgt tgcgagattt ggacggacgt tgacggggtc tatacctgcg 1140

acccgcgtca ggtgcccgat gcgaggttgt tgaagtcgat gtcctaccag gaagcgatgg 1200

agctttccta cttcggcgct aaagttcttc acccccgcac cattaccccc atcgcccagt 1260

tccagatccc ttgcctgatt aaaaataccg gaaatcctca agcaccaggt acgctcattg 1320

gtgccagccg tgatgaagac gaattaccgg tcaagggcat ttccaatctg aataacatgg 1380

caatgttcag cgtttctggt ccggggatga aagggatggt cggcatggcg gcgcgcgtct 1440

ttgcagcgat gtcacgcgcc cgtatttccg tggtgctgat tacgcaatca tcttccgaat 1500

acagcatcag tttctgcgtt ccacaaagcg actgtgtgcg agctgaacgg gcaatgcagg 1560

aagagttcta cctggaactg aaagaaggct tactggagcc gctggcagtg acggaacggc 1620

tggccattat ctcggtggta ggtgatggta tgcgcacctt gcgtgggatc tcggcgaaat 1680

tctttgccgc actggcccgc gccaatatca acattgtcgc cattgctcag ggatcttctg 1740

aacgctcaat ctctgtcgtg gtaaataacg atgatgcgac cactggcgtg cgcgttactc 1800

atcagatgct gttcaatacc gatcaggtta tcgaagtgtt tgtgattggc gtcggtggcg 1860

ttggcggtgc gctgctggag caactgaagc gtcagcaaag ctggctgaag aataaacata 1920

tcgacttacg tgtctgcggt gttgccaact cgaaggctct gctcaccaat gtacatggcc 1980

ttaatctgga aaactggcag gaagaactgg cgcaagccaa agagccgttt aatctcgggc 2040

gcttaattcg cctcgtgaaa gaatatcatc tgctgaaccc ggtcattgtt gactgcactt 2100

ccagccaggc agtggcggat caatatgccg acttcctgcg cgaaggtttc cacgttgtca 2160

cgccgaacaa aaaggccaac acctcgtcga tggattacta ccatcagttg cgttatgcgg 2220

cggaaaaatc gcggcgtaaa ttcctctatg acaccaacgt tggggctgga ttaccggtta 2280

ttgagaacct gcaaaatctg ctcaatgcag gtgatgaatt gatgaagttc tccggcattc 2340

tttctggttc gctttcttat atcttcggca agttagacga aggcatgagt ttctccgagg 2400

cgaccacgct ggcgcgggaa atgggttata ccgaaccgga cccgcgagat gatctttctg 2460

gtatggatgt ggcgcgtaaa ctattgattc tcgctcgtga aacgggacgt gaactggagc 2520

tggcggatat tgaaattgaa cctgtgctgc ccgcagagtt taacgccgag ggtgatgttg 2580

ccgcttttat ggcgaatctg tcacaactcg acgatctctt tgccgcgcgc gtggcgaagg 2640

cccgtgatga aggaaaagtt ttgcgctatg ttggcaatat tgatgaagat ggcgtctgcc 2700

gcgtgaagat tgccgaagtg gatggtaatg atccgctgtt caaagtgaaa aatggcgaaa 2760

acgccctggc cttctatagc cactattatc agccgctgcc gttggtactg cgcggatatg 2820

gtgcgggcaa tgacgttaca gctgccggtg tctttgctga tctgctacgt accctctcat 2880

ggaagttagg agtctgacat ggttaaagtt tatgccccgg cttccagtgc caatatgagc 2940

gtcgggtttg atgtgctcgg ggcggcggtg acacctgttg atggtgcatt gctcggagat 3000

gtagtcacgg ttgaggcggc agagacattc agtctcaaca acctcggacg ctttgccgat 3060

aagctgccgt cagaaccacg ggaaaatatc gtttatcagt gctgggagcg tttttgccag 3120

gaactgggta agcaaattcc agtggcgatg accctggaaa agaatatgcc gatcggttcg 3180

ggcttaggct ccagtgcctg ttcggtggtc gcggcgctga tggcgatgaa tgaacactgc 3240

ggcaagccgc ttaatgacac tcgtttgctg gctttgatgg gcgagctgga aggccgtatc 3300

tccggcagca ttcattacga caacgtggca ccgtgttttc tcggtggtat gcagttgatg 3360

atcgaagaaa acgacatcat cagccagcaa gtgccagggt ttgatgagtg gctgtgggtg 3420

ctggcgtatc cggggattaa agtctcgacg gcagaagcca gggctatttt accggcgcag 3480

tatcgccgcc aggattgcat tgcgcacggg cgacatctgg caggcttcat tcacgcctgc 3540

tattcccgtc agcctgagct tgccgcgaag ctgatgaaag atgttatcgc tgaaccctac 3600

cgtgaacggt tactgccagg cttccggcag gcgcggcagg cggtcgcgga aatcggcgcg 3660

gtagcgagcg gtatctccgg ctccggcccg accttgttcg ctctgtgtga caagccggaa 3720

accgcccagc gcgttgccga ctggttgggt aagaactacc tgcaaaatca ggaaggtttt 3780

gttcatattt gccggctgga tacggcgggc gcacgagtac tggaaaacta aatgaaactc 3840

tacaatctga aagatcacaa cgagcaggtc agctttgcgc aagccgtaac ccaggggttg 3900

ggcaaaaatc aggggctgtt ttttccgcac gacctgccgg aattcagcct gactgaaatt 3960

gatgagatgc tgaagctgga ttttgtcacc cgcagtgcga agatcctctc ggcgtttatt 4020

ggtgatgaaa tcccacagga aatcctggaa gagcgcgtgc gcgcggcgtt tgccttcccg 4080

gctccggtcg ccaatgttga aagcgatgtc ggttgtctgg aattgttcca cgggccaacg 4140

ctggcattta aagatttcgg cggtcgcttt atggcacaaa tgctgaccca tattgcgggt 4200

gataagccag tgaccattct gaccgcgacc tccggtgata ccggagcggc agtggctcat 4260

gctttctacg gtttaccgaa tgtgaaagtg gttatcctct atccacgagg caaaatcagt 4320

ccactgcaag aaaaactgtt ctgtacattg ggcggcaata tcgaaactgt tgccatcgac 4380

ggcgatttcg atgcctgtca ggcgctggtg aagcaggcgt ttgatgatga agaactgaaa 4440

gtggcgctag ggttaaactc ggctaactcg attaacatca gccgtttgct ggcgcagatt 4500

tgctactact ttgaagctgt tgcgcagctg ccgcaggaga cgcgcaacca gctggttgtc 4560

tcggtgccaa gcggaaactt cggcgatttg acggcgggtc tgctggcgaa gtcactcggt 4620

ctgccggtga aacgttttat tgctgcgacc aacgtgaacg ataccgtgcc acgtttcctg 4680

cacgacggtc agtggtcacc caaagcgact caggcgacgt tatccaacgc gatggacgtg 4740

agtcagccga acaactggcc gcgtgtggaa gagttgttcc gccgcaaaat ctggcaactg 4800

aaagagctgg gttatgcagc cgtggatgat gaaaccacgc aacagacaat gcgtgagtta 4860

aaagaactgg gctacacttc ggagccgcac gctgccgtag cttatcgtgc gctgcgtgat 4920

cagttgaatc caggcgaata tggcttgttc ctcggcaccg cgcatccggc gaaatttaaa 4980

gagagcgtgg aagcgattct cggtgaaacg ttggatctgc caaaagagct ggcagaacgt 5040

gctgatttac ccttgctttc acataatctg cccgccgatt ttgctgcgtt gcgtaaattg 5100

atgatgaatc atcagtaa 5118

<210> 77

<211> 4684

<212> DNA

<213> Escherichia coli

<400> 77

atgcgagtgt tgaagttcgg cggtacatca gtggcaaatg cagaacgttt tctgcgtgtt 60

gccgatattc tggaaagcaa tgccaggcag gggcaggtgg ccaccgtcct ctctgccccc 120

gccaaaatca ccaaccacct ggtggcgatg attgaaaaaa ccattagcgg ccaggatgct 180

ttacccaata tcagcgatgc cgaacgtatt tttgccgaac ttttgacggg actcgccgcc 240

gcccagccgg ggttcccgct ggcgcaattg aaaactttcg tcgatcagga atttgcccaa 300

ataaaacatg tcctgcatgg cattagtttg ttggggcagt gcccggatag catcaacgct 360

gcgctgattt gccgtggcga gaaaatgtcg atcgccatta tggccggcgt attagaagcg 420

cgcggtcaca acgttactgt tatcgatccg gtcgaaaaac tgctggcagt ggggcattac 480

ctcgaatcta ccgtcgatat tgctgagtcc acccgccgta ttgcggcaag ccgcattccg 540

gctgatcaca tggtgctgat ggcaggtttc accgccggta atgaaaaagg cgaactggtg 600

gtgcttggac gcaacggttc cgactactct gctgcggtgc tggctgcctg tttacgcgcc 660

gattgttgcg agatttggac ggacgttgac ggggtctata cctgcgaccc gcgtcaggtg 720

cccgatgcga ggttgttgaa gtcgatgtcc taccaggaag cgatggagct ttcctacttc 780

ggcgctaaag ttcttcaccc ccgcaccatt acccccatcg cccagttcca gatcccttgc 840

ctgattaaaa ataccggaaa tcctcaagca ccaggtacgc tcattggtgc cagccgtgat 900

gaagacgaat taccggtcaa gggcatttcc aatctgaata acatggcaat gttcagcgtt 960

tctggtccgg ggatgaaagg gatggtcggc atggcggcgc gcgtctttgc agcgatgtca 1020

cgcgcccgta tttccgtggt gctgattacg caatcatctt ccgaatacag catcagtttc 1080

tgcgttccac aaagcgactg tgtgcgagct gaacgggcaa tgcaggaaga gttctacctg 1140

gaactgaaag aaggcttact ggagccgctg gcagtgacgg aacggctggc cattatctcg 1200

gtggtaggtg atggtatgcg caccttgcgt gggatctcgg cgaaattctt tgccgcactg 1260

gcccgcgcca atatcaacat tgtcgccatt gctcagggat cttctgaacg ctcaatctct 1320

gtcgtggtaa ataacgatga tgcgaccact ggcgtgcgcg ttactcatca gatgctgttc 1380

aataccgatc aggttatcga agtgtttgtg attggcgtcg gtggcgttgg cggtgcgctg 1440

ctggagcaac tgaagcgtca gcaaagctgg ctgaagaata aacatatcga cttacgtgtc 1500

tgcggtgttg ccaactcgaa ggctctgctc accaatgtac atggccttaa tctggaaaac 1560

tggcaggaag aactggcgca agccaaagag ccgtttaatc tcgggcgctt aattcgcctc 1620

gtgaaagaat atcatctgct gaacccggtc attgttgact gcacttccag ccaggcagtg 1680

gcggatcaat atgccgactt cctgcgcgaa ggtttccacg ttgtcacgcc gaacaaaaag 1740

gccaacacct cgtcgatgga ttactaccat cagttgcgtt atgcggcgga aaaatcgcgg 1800

cgtaaattcc tctatgacac caacgttggg gctggattac cggttattga gaacctgcaa 1860

aatctgctca atgcaggtga tgaattgatg aagttctccg gcattctttc tggttcgctt 1920

tcttatatct tcggcaagtt agacgaaggc atgagtttct ccgaggcgac cacgctggcg 1980

cgggaaatgg gttataccga accggacccg cgagatgatc tttctggtat ggatgtggcg 2040

cgtaaactat tgattctcgc tcgtgaaacg ggacgtgaac tggagctggc ggatattgaa 2100

attgaacctg tgctgcccgc agagtttaac gccgagggtg atgttgccgc ttttatggcg 2160

aatctgtcac aactcgacga tctctttgcc gcgcgcgtgg cgaaggcccg tgatgaagga 2220

aaagttttgc gctatgttgg caatattgat gaagatggcg tctgccgcgt gaagattgcc 2280

gaagtggatg gtaatgatcc gctgttcaaa gtgaaaaatg gcgaaaacgc cctggccttc 2340

tatagccact attatcagcc gctgccgttg gtactgcgcg gatatggtgc gggcaatgac 2400

gttacagctg ccggtgtctt tgctgatctg ctacgtaccc tctcatggaa gttaggagtc 2460

tgacatggtt aaagtttatg ccccggcttc cagtgccaat atgagcgtcg ggtttgatgt 2520

gctcggggcg gcggtgacac ctgttgatgg tgcattgctc ggagatgtag tcacggttga 2580

ggcggcagag acattcagtc tcaacaacct cggacgcttt gccgataagc tgccgtcaga 2640

accacgggaa aatatcgttt atcagtgctg ggagcgtttt tgccaggaac tgggtaagca 2700

aattccagtg gcgatgaccc tggaaaagaa tatgccgatc ggttcgggct taggctccag 2760

tgcctgttcg gtggtcgcgg cgctgatggc gatgaatgaa cactgcggca agccgcttaa 2820

tgacactcgt ttgctggctt tgatgggcga gctggaaggc cgtatctccg gcagcattca 2880

ttacgacaac gtggcaccgt gttttctcgg tggtatgcag ttgatgatcg aagaaaacga 2940

catcatcagc cagcaagtgc cagggtttga tgagtggctg tgggtgctgg cgtatccggg 3000

gattaaagtc tcgacggcag aagccagggc tattttaccg gcgcagtatc gccgccagga 3060

ttgcattgcg cacgggcgac atctggcagg cttcattcac gcctgctatt cccgtcagcc 3120

tgagcttgcc gcgaagctga tgaaagatgt tatcgctgaa ccctaccgtg aacggttact 3180

gccaggcttc cggcaggcgc ggcaggcggt cgcggaaatc ggcgcggtag cgagcggtat 3240

ctccggctcc ggcccgacct tgttcgctct gtgtgacaag ccggaaaccg cccagcgcgt 3300

tgccgactgg ttgggtaaga actacctgca aaatcaggaa ggttttgttc atatttgccg 3360

gctggatacg gcgggcgcac gagtactgga aaactaaatg aaactctaca atctgaaaga 3420

tcacaacgag caggtcagct ttgcgcaagc cgtaacccag gggttgggca aaaatcaggg 3480

gctgtttttt ccgcacgacc tgccggaatt cagcctgact gaaattgatg agatgctgaa 3540

gctggatttt gtcacccgca gtgcgaagat cctctcggcg tttattggtg atgaaatccc 3600

acaggaaatc ctggaagagc gcgtgcgcgc ggcgtttgcc ttcccggctc cggtcgccaa 3660

tgttgaaagc gatgtcggtt gtctggaatt gttccacggg ccaacgctgg catttaaaga 3720

tttcggcggt cgctttatgg cacaaatgct gacccatatt gcgggtgata agccagtgac 3780

cattctgacc gcgacctccg gtgataccgg agcggcagtg gctcatgctt tctacggttt 3840

accgaatgtg aaagtggtta tcctctatcc acgaggcaaa atcagtccac tgcaagaaaa 3900

actgttctgt acattgggcg gcaatatcga aactgttgcc atcgacggcg atttcgatgc 3960

ctgtcaggcg ctggtgaagc aggcgtttga tgatgaagaa ctgaaagtgg cgctagggtt 4020

aaactcggct aactcgatta acatcagccg tttgctggcg cagatttgct actactttga 4080

agctgttgcg cagctgccgc aggagacgcg caaccagctg gttgtctcgg tgccaagcgg 4140

aaacttcggc gatttgacgg cgggtctgct ggcgaagtca ctcggtctgc cggtgaaacg 4200

ttttattgct gcgaccaacg tgaacgatac cgtgccacgt ttcctgcacg acggtcagtg 4260

gtcacccaaa gcgactcagg cgacgttatc caacgcgatg gacgtgagtc agccgaacaa 4320

ctggccgcgt gtggaagagt tgttccgccg caaaatctgg caactgaaag agctgggtta 4380

tgcagccgtg gatgatgaaa ccacgcaaca gacaatgcgt gagttaaaag aactgggcta 4440

cacttcggag ccgcacgctg ccgtagctta tcgtgcgctg cgtgatcagt tgaatccagg 4500

cgaatatggc ttgttcctcg gcaccgcgca tccggcgaaa tttaaagaga gcgtggaagc 4560

gattctcggt gaaacgttgg atctgccaaa agagctggca gaacgtgctg atttaccctt 4620

gctttcacat aatctgcccg ccgattttgc tgcgttgcgt aaattgatga tgaatcatca 4680

gtaa 4684

<210> 78

<211> 2815

<212> DNA

<213> Artificial Sequence

<220>

<223> pUC19 vector backbone

<220>

<221> misc_feature

<222> (340)..(340)

<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (342)..(342)

<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (345)..(346)

<223> n is a, c, g, or t

<400> 78

aatctattca ttatctcaat caggccgggt ttgcttttat gcagcccggc ttttttatga 60

agaaattatg gagaaaaatg acagggaaaa aggagaaatt ctcaataaat gcggtaactt 120

agagattagg attgcggaga ataacaaccg ccgttactgg ccgtcgtttt acaacgtcgt 180

gactgggaaa accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc 240

agctggcgta atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg 300

aatggcgaat ggcgcctgat gcggtatttt chttsymrgn bnchnngcms sayhhtrrkb 360

tbtgtcctta cgcatctgtg cggtatttca caccgcatat ggtgcactct cagtacaatc 420

tgctctgatg ccgcatagtt aagccagccc cgacacccgc caacacccgc tgacgcgccc 480

tgacgggctt gtctgctccc ggcatccgct tacagacaag ctgtgaccgt ctccgggagc 540

tgcatgtgtc agaggttttc accgtcatca ccgaaacgcg cgagacgaaa gggcctcgtg 600

atacgcctat ttttataggt taatgtcatg ataataatgg tttcttagac gtcaggtggc 660

acttttcggg gaaatgtgcg cggaacccct atttgtttat ttttctaaat acattcaaat 720

atgtatccgc tcatgagaca ataaccctga taaatgcttc aataatattg aaaaaggaag 780

agtatgagta ttcaacattt ccgtgtcgcc cttattccct tttttgcggc attttgcctt 840

cctgtttttg ctcacccaga aacgctggtg aaagtaaaag atgctgaaga tcagttgggt 900

gcacgagtgg gttacatcga actggatctc aacagcggta agatccttga gagttttcgc 960

cccgaagaac gttttccaat gatgagcact tttaaagttc tgctatgtgg cgcggtatta 1020

tcccgtattg acgccgggca agagcaactc ggtcgccgca tacactattc tcagaatgac 1080

ttggttgagt actcaccagt cacagaaaag catcttacgg atggcatgac agtaagagaa 1140

ttatgcagtg ctgccataac catgagtgat aacactgcgg ccaacttact tctgacaacg 1200

atcggaggac cgaaggagct aaccgctttt ttgcacaaca tgggggatca tgtaactcgc 1260

cttgatcgtt gggaaccgga gctgaatgaa gccataccaa acgacgagcg tgacaccacg 1320

atgcctgtag caatggcaac aacgttgcgc aaactattaa ctggcgaact acttactcta 1380

gcttcccggc aacaattaat agactggatg gaggcggata aagttgcagg accacttctg 1440

cgctcggccc ttccggctgg ctggtttatt gctgataaat ctggagccgg tgagcgtggg 1500

tctcgcggta tcattgcagc actggggcca gatggtaagc cctcccgtat cgtagttatc 1560

tacacgacgg ggagtcaggc aactatggat gaacgaaata gacagatcgc tgagataggt 1620

gcctcactga ttaagcattg gtaactgtca gaccaagttt actcatatat actttagatt 1680

gatttaaaac ttcattttta atttaaaagg atctaggtga agatcctttt tgataatctc 1740

atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc cgtagaaaag 1800

atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa 1860

aaaccaccgc taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg 1920

aaggtaactg gcttcagcag agcgcagata ccaaatactg ttcttctagt gtagccgtag 1980

ttaggccacc acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg 2040

ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga 2100

tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc 2160

ttggagcgaa cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc 2220

acgcttcccg aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga 2280

gagcgcacga gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt 2340

cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg 2400

aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac 2460

atgttctttc ctgcgttatc ccctgattct gtggataacc gtattaccgc ctttgagtga 2520

gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg 2580

gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt ggccgattca ttaatgcagc 2640

tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc gcaacgcaat taatgtgagt 2700

tagctcactc attaggcacc ccaggcttta cactttatgc ttccggctcg tatgttgtgt 2760

ggaattgtga gcggataaca atttcacaca ggaaacagct atgaccatga ttacg 2815

<210> 79

<211> 1089

<212> DNA

<213> Lactobacillus agilis

<400> 79

atggatgaaa aactccgtgc cggtgttctg ggcgccacgg gtatggtagg acagcggttc 60

gtagcgatgt tggagaatca cccgtggttc gaagtaacca ctcttgcagc ttcgccgcgc 120

tcagcaggta aaacgtacgc acaggctgtg gatggccggt ggaaaatgga aactcccatt 180

ccagaggccg tcaaggatct caagattctt gatgtatcgg aagttgagaa agtcgcagct 240

caagtcgatt ttgtgttttc cgcagtttct atgtccaaag acaagattaa agcgattgaa 300

gaagcctacg cgaaaaccga aactccggta gtatcgaaca attcggcgca ccgttggacc 360

ccagatgttc ctatggtcgt gcccgaaatt aacccggagc atttcaaggt aattgattac 420

cagcggaaac ggctcggcac gaagcgcggc ttcattgccg ttaagccgaa ctgttctatc 480

cagagctacg ccccggctct cagcgcatgg ttgaaattcg aaccgtacga ggtaatcgct 540

tcaacttatc aggctatctc gggagctggt aagaacttcg acgactggcc ggagatgaag 600

ggaaacatca tcccttttat ttctggcgag gaggaaaaat cagagaagga gcccctcaag 660

atctggggac aacttgacga agctaaggga gagatcgtcc cagccactag ccctgttatt 720

acgagccaat gtattcgggt cccgatcctt tacggacaca ccgcgaccgt ctttgttaaa 780

ttcaagcaga acccaacgaa agaggaactg gtagctgctt tggaatcata tcagggactg 840

cctcaatcct tgaatttgcc gtctacccct aagcaattta ttcagtatct cagcgaagac 900

gaccgtccgc aggttgcgaa ggacgttaac tttgagaatg gtatgggtat ctctattggc 960

cgccttcgta aagattcggt ttacgattgg aagttcgtag gactctcgca caacaccgcg 1020

cgtggcgccg caggaggcgg cgtcctttcg gccgaattgc tgacggctca gggctatatt 1080

accaaaaag 1089

<210> 80

<211> 363

<212> PRT

<213> Lactobacillus agilis

<400> 80

Met Asp Glu Lys Leu Arg Ala Gly Val Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Val Ala Met Leu Glu Asn His Pro Trp Phe Glu Val

20 25 30

Thr Thr Leu Ala Ala Ser Pro Arg Ser Ala Gly Lys Thr Tyr Ala Gln

35 40 45

Ala Val Asp Gly Arg Trp Lys Met Glu Thr Pro Ile Pro Glu Ala Val

50 55 60

Lys Asp Leu Lys Ile Leu Asp Val Ser Glu Val Glu Lys Val Ala Ala

65 70 75 80

Gln Val Asp Phe Val Phe Ser Ala Val Ser Met Ser Lys Asp Lys Ile

85 90 95

Lys Ala Ile Glu Glu Ala Tyr Ala Lys Thr Glu Thr Pro Val Val Ser

100 105 110

Asn Asn Ser Ala His Arg Trp Thr Pro Asp Val Pro Met Val Val Pro

115 120 125

Glu Ile Asn Pro Glu His Phe Lys Val Ile Asp Tyr Gln Arg Lys Arg

130 135 140

Leu Gly Thr Lys Arg Gly Phe Ile Ala Val Lys Pro Asn Cys Ser Ile

145 150 155 160

Gln Ser Tyr Ala Pro Ala Leu Ser Ala Trp Leu Lys Phe Glu Pro Tyr

165 170 175

Glu Val Ile Ala Ser Thr Tyr Gln Ala Ile Ser Gly Ala Gly Lys Asn

180 185 190

Phe Asp Asp Trp Pro Glu Met Lys Gly Asn Ile Ile Pro Phe Ile Ser

195 200 205

Gly Glu Glu Glu Lys Ser Glu Lys Glu Pro Leu Lys Ile Trp Gly Gln

210 215 220

Leu Asp Glu Ala Lys Gly Glu Ile Val Pro Ala Thr Ser Pro Val Ile

225 230 235 240

Thr Ser Gln Cys Ile Arg Val Pro Ile Leu Tyr Gly His Thr Ala Thr

245 250 255

Val Phe Val Lys Phe Lys Gln Asn Pro Thr Lys Glu Glu Leu Val Ala

260 265 270

Ala Leu Glu Ser Tyr Gln Gly Leu Pro Gln Ser Leu Asn Leu Pro Ser

275 280 285

Thr Pro Lys Gln Phe Ile Gln Tyr Leu Ser Glu Asp Asp Arg Pro Gln

290 295 300

Val Ala Lys Asp Val Asn Phe Glu Asn Gly Met Gly Ile Ser Ile Gly

305 310 315 320

Arg Leu Arg Lys Asp Ser Val Tyr Asp Trp Lys Phe Val Gly Leu Ser

325 330 335

His Asn Thr Ala Arg Gly Ala Ala Gly Gly Gly Val Leu Ser Ala Glu

340 345 350

Leu Leu Thr Ala Gln Gly Tyr Ile Thr Lys Lys

355 360

<210> 81

<211> 1104

<212> DNA

<213> Escherichia coli

<400> 81

atgaaaaatg ttggttttat cggctggcgc ggtatggtcg gctccgttct catgcaacgc 60

atggttgaag agcgcgactt cgacgccatt cgccctgtct tcttttctac ttctcagctt 120

ggccaggctg cgccgtcttt tggcggaacc actggcacac ttcaggatgc ctttgatctg 180

gaggcgctaa aggccctcga tatcattgtg acctgtcagg gcggcgatta taccaacgaa 240

atctatccaa agcttcgtga aagcggatgg caaggttact ggattgacgc agcatcgtct 300

ctgcgcatga aagatgacgc catcatcatt cttgaccccg tcaatcagga cgtcattacc 360

gacggattaa ataatggcat caggactttt gttggcggta actgtaccgt aagcctgatg 420

ttgatgtcgt tgggtggttt attcgccaat gatcttgttg attgggtgtc cgttgcaacc 480

taccaggccg cttccggcgg tggtgcgcga catatgcgtg agttattaac ccagatgggc 540

catctgtatg gccatgtggc agatgaactc gcgaccccgt cctctgctat tctcgatatc 600

gaacgcaaag tcacaacctt aacccgtagc ggtgagctgc cggtggataa ctttggcgtg 660

ccgctggcgg gtagcctgat tccgtggatc gacaaacagc tcgataacgg tcagagccgc 720

gaagagtgga aagggcaggc ggaaaccaac aagatcctca acacatcttc cgtaattccg 780

gtagatggtt tatgtgtgcg tgtcggggca ttgcgctgcc acagccaggc attcactatt 840

aaattgaaaa aagatgtgtc tattccgacc gtggaagaac tgctggctgc gcacaatccg 900

tgggcgaaag tcgttccgaa cgatcgggaa atcactatgc gtgagctaac cccagctgcc 960

gttaccggca cgctgaccac gccggtaggc cgcctgcgta agctgaatat gggaccagag 1020

ttcctgtcag cctttaccgt gggcgaccag ctgctgtggg gggccgcgga gccgctgcgt 1080

cggatgcttc gtcaactggc gtaa 1104

<210> 82

<211> 367

<212> PRT

<213> Escherichia coli

<400> 82

Met Lys Asn Val Gly Phe Ile Gly Trp Arg Gly Met Val Gly Ser Val

1 5 10 15

Leu Met Gln Arg Met Val Glu Glu Arg Asp Phe Asp Ala Ile Arg Pro

20 25 30

Val Phe Phe Ser Thr Ser Gln Leu Gly Gln Ala Ala Pro Ser Phe Gly

35 40 45

Gly Thr Thr Gly Thr Leu Gln Asp Ala Phe Asp Leu Glu Ala Leu Lys

50 55 60

Ala Leu Asp Ile Ile Val Thr Cys Gln Gly Gly Asp Tyr Thr Asn Glu

65 70 75 80

Ile Tyr Pro Lys Leu Arg Glu Ser Gly Trp Gln Gly Tyr Trp Ile Asp

85 90 95

Ala Ala Ser Ser Leu Arg Met Lys Asp Asp Ala Ile Ile Ile Leu Asp

100 105 110

Pro Val Asn Gln Asp Val Ile Thr Asp Gly Leu Asn Asn Gly Ile Arg

115 120 125

Thr Phe Val Gly Gly Asn Cys Thr Val Ser Leu Met Leu Met Ser Leu

130 135 140

Gly Gly Leu Phe Ala Asn Asp Leu Val Asp Trp Val Ser Val Ala Thr

145 150 155 160

Tyr Gln Ala Ala Ser Gly Gly Gly Ala Arg His Met Arg Glu Leu Leu

165 170 175

Thr Gln Met Gly His Leu Tyr Gly His Val Ala Asp Glu Leu Ala Thr

180 185 190

Pro Ser Ser Ala Ile Leu Asp Ile Glu Arg Lys Val Thr Thr Leu Thr

195 200 205

Arg Ser Gly Glu Leu Pro Val Asp Asn Phe Gly Val Pro Leu Ala Gly

210 215 220

Ser Leu Ile Pro Trp Ile Asp Lys Gln Leu Asp Asn Gly Gln Ser Arg

225 230 235 240

Glu Glu Trp Lys Gly Gln Ala Glu Thr Asn Lys Ile Leu Asn Thr Ser

245 250 255

Ser Val Ile Pro Val Asp Gly Leu Cys Val Arg Val Gly Ala Leu Arg

260 265 270

Cys His Ser Gln Ala Phe Thr Ile Lys Leu Lys Lys Asp Val Ser Ile

275 280 285

Pro Thr Val Glu Glu Leu Leu Ala Ala His Asn Pro Trp Ala Lys Val

290 295 300

Val Pro Asn Asp Arg Glu Ile Thr Met Arg Glu Leu Thr Pro Ala Ala

305 310 315 320

Val Thr Gly Thr Leu Thr Thr Pro Val Gly Arg Leu Arg Lys Leu Asn

325 330 335

Met Gly Pro Glu Phe Leu Ser Ala Phe Thr Val Gly Asp Gln Leu Leu

340 345 350

Trp Gly Ala Ala Glu Pro Leu Arg Arg Met Leu Arg Gln Leu Ala

355 360 365

<210> 83

<211> 1059

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_1 sequence from an unknown bacterial species from an environmental sample

<400> 83

atgaaaccga tcgaggtggg cctgctcggc gcgacaggga tggtcggaca gcagttcgtg 60

cgccagctgc gcgcgcaccc ctggttccgg ctcacgtggc tcggagcgag cgatcgttcc 120

gccggccggc ggtacggcga tctctcgtgg cggctttccg atccgatgcc cgacgccgta 180

cgggatctca ccgtcgagtc gtgtcggccc ggaagcgcgc cgcgggtgct gttctcggcg 240

ctggatgccg cggccgccga cgagatcgaa gcggtcttcg cccaggccgg ccacgtcgtc 300

gtcagcaacg cccgatccca ccgcatgcgc cccgacgtcc cgctgctcgt gccggaaatc 360

aacccggatc atctgggtct tctcgccgtg cagcggcgga aggcgggcct ctcgggcacg 420

ccaagcagcg gcgcaatcgt cacgaacccg aactgctcaa ccgtgtttct cgctatggcg 480

ctcggcgcgc tgcgtccgct tcgcccggcg cgtgcgatcg tgaccacact gcaggccgcc 540

tccggcgcgg gctatccggg cgtgccctct ctcgatctgc tcggcaatgt catccctttc 600

atttctgggg aggaggaaaa gatcgagacc gagacgcgca agatcctcgg ccagctccgc 660

ggcgatgcca tcaccccgca cccgattgcg cttagcgcgc aggtgaaccg cgtaccggtc 720

gtgaacggcc acaccgaggc ggtctcggtg gcgttcgacg aggctccgcc gcgggatgcg 780

gtcctggaag cgctcacgcg attcaccggg ctcccgcagc agcaacagtt gcccagcgct 840

cccgcgaatc ctttgatcta tatgtccgag acggatcgtc cgcagcctcg tctcgacgtg 900

gagcgcgatg gcggcatgac ggtgtgtgtg gggcgcctcc gcgcctgtcc agtgctgcac 960

tggaagttcg tcctgctggg ccacaacacg attagaggcg ccgcgggtgc ggccgttctg 1020

aacgccgagc tgatggtagc cggcggatgg ctggattga 1059

<210> 84

<211> 352

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_1 sequence from an unknown bacterial species from an environmental sample

<400> 84

Met Lys Pro Ile Glu Val Gly Leu Leu Gly Ala Thr Gly Met Val Gly

1 5 10 15

Gln Gln Phe Val Arg Gln Leu Arg Ala His Pro Trp Phe Arg Leu Thr

20 25 30

Trp Leu Gly Ala Ser Asp Arg Ser Ala Gly Arg Arg Tyr Gly Asp Leu

35 40 45

Ser Trp Arg Leu Ser Asp Pro Met Pro Asp Ala Val Arg Asp Leu Thr

50 55 60

Val Glu Ser Cys Arg Pro Gly Ser Ala Pro Arg Val Leu Phe Ser Ala

65 70 75 80

Leu Asp Ala Ala Ala Ala Asp Glu Ile Glu Ala Val Phe Ala Gln Ala

85 90 95

Gly His Val Val Val Ser Asn Ala Arg Ser His Arg Met Arg Pro Asp

100 105 110

Val Pro Leu Leu Val Pro Glu Ile Asn Pro Asp His Leu Gly Leu Leu

115 120 125

Ala Val Gln Arg Arg Lys Ala Gly Leu Ser Gly Thr Pro Ser Ser Gly

130 135 140

Ala Ile Val Thr Asn Pro Asn Cys Ser Thr Val Phe Leu Ala Met Ala

145 150 155 160

Leu Gly Ala Leu Arg Pro Leu Arg Pro Ala Arg Ala Ile Val Thr Thr

165 170 175

Leu Gln Ala Ala Ser Gly Ala Gly Tyr Pro Gly Val Pro Ser Leu Asp

180 185 190

Leu Leu Gly Asn Val Ile Pro Phe Ile Ser Gly Glu Glu Glu Lys Ile

195 200 205

Glu Thr Glu Thr Arg Lys Ile Leu Gly Gln Leu Arg Gly Asp Ala Ile

210 215 220

Thr Pro His Pro Ile Ala Leu Ser Ala Gln Val Asn Arg Val Pro Val

225 230 235 240

Val Asn Gly His Thr Glu Ala Val Ser Val Ala Phe Asp Glu Ala Pro

245 250 255

Pro Arg Asp Ala Val Leu Glu Ala Leu Thr Arg Phe Thr Gly Leu Pro

260 265 270

Gln Gln Gln Gln Leu Pro Ser Ala Pro Ala Asn Pro Leu Ile Tyr Met

275 280 285

Ser Glu Thr Asp Arg Pro Gln Pro Arg Leu Asp Val Glu Arg Asp Gly

290 295 300

Gly Met Thr Val Cys Val Gly Arg Leu Arg Ala Cys Pro Val Leu His

305 310 315 320

Trp Lys Phe Val Leu Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly

325 330 335

Ala Ala Val Leu Asn Ala Glu Leu Met Val Ala Gly Gly Trp Leu Asp

340 345 350

<210> 85

<211> 1179

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_2 sequence from an unknown bacterial species from an environmental sample

<400> 85

atggctcaga tagcggagtt gacgcgagac gcggagggcg cggccgccgt cgcggggggc 60

aagttgcggg tggcattgct gggcgcgacc gggatggtcg gccagcagtt catacgcgtg 120

ctcagaaatc atccctggtt cgagatcgcc gtcctcgcgg cgtccgagtc ctcggcgggc 180

aagacgtacc gcgaggcgtt gcgcggccgc tgggcgatgg agttcgccgt gccggaagag 240

ttggccgggg tcaaagtcct ggacgtgcaa tcggtggacg agatcgccgc gcaggcggac 300

gtggccttct gcgcgctcaa cctcgaaaag gaggccgtcc gcgcgctcga agacgcctac 360

gcgcgcaggg gcgtgtgggt gacctcgaac aactccgcct tccggcaaga ccccctcgtg 420

ccgatggtca tcccggcggc caacgcgcac cacctcggcg tcgtcccgca ccagcgccgc 480

gcgcgcggct acgacacggg cgcgatcatc gtcaagagca actgctcgat ccagagttac 540

gtcatcgcgc tcgaaccgct cagggatttc ggcgttacgc gaatcaacgt cttcagcgcg 600

caggccatct cgggcgcggg caagactttc aagacctggc ccgagatgcg cgacaacctc 660

atcccacaca tcggcggcga ggaggagaag tccgagaccg agcccctgaa gatttggggc 720

gaggcgacga gcgacggcat cgtgccggcg aacggcccga agattcgcgc gcgttgcgtg 780

cgcgtcggag tcgccgacgg ccacaccgcc caggtcaccg tcgctttcaa ggccgtcccc 840

acggccgcgc agattctgga gcgctgggag cggcacaggg gccgcgccgc cgacctgccc 900

tcggccccgc gccgcctcat acactaccgc ccggagaagg atcgcccaca gcccgcgctc 960

gacgtgatga ccgagaacgg catggccgtc accgtcggcc acctacgggt cgagccggat 1020

gaagccacgg cctccttcac cgccctcgca cacaacgcca tcctcggcgc ggcgggcggc 1080

gcggtctggg cgaccgaggc ggccctggcc cgcgggctcc tctaccgccg catcgccccg 1140

caacgaaaga ccaggcccca gacggcgaag gcgctctga 1179

<210> 86

<211> 392

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_2 sequence from an unknown bacterial species from an environmental sample

<400> 86

Met Ala Gln Ile Ala Glu Leu Thr Arg Asp Ala Glu Gly Ala Ala Ala

1 5 10 15

Val Ala Gly Gly Lys Leu Arg Val Ala Leu Leu Gly Ala Thr Gly Met

20 25 30

Val Gly Gln Gln Phe Ile Arg Val Leu Arg Asn His Pro Trp Phe Glu

35 40 45

Ile Ala Val Leu Ala Ala Ser Glu Ser Ser Ala Gly Lys Thr Tyr Arg

50 55 60

Glu Ala Leu Arg Gly Arg Trp Ala Met Glu Phe Ala Val Pro Glu Glu

65 70 75 80

Leu Ala Gly Val Lys Val Leu Asp Val Gln Ser Val Asp Glu Ile Ala

85 90 95

Ala Gln Ala Asp Val Ala Phe Cys Ala Leu Asn Leu Glu Lys Glu Ala

100 105 110

Val Arg Ala Leu Glu Asp Ala Tyr Ala Arg Arg Gly Val Trp Val Thr

115 120 125

Ser Asn Asn Ser Ala Phe Arg Gln Asp Pro Leu Val Pro Met Val Ile

130 135 140

Pro Ala Ala Asn Ala His His Leu Gly Val Val Pro His Gln Arg Arg

145 150 155 160

Ala Arg Gly Tyr Asp Thr Gly Ala Ile Ile Val Lys Ser Asn Cys Ser

165 170 175

Ile Gln Ser Tyr Val Ile Ala Leu Glu Pro Leu Arg Asp Phe Gly Val

180 185 190

Thr Arg Ile Asn Val Phe Ser Ala Gln Ala Ile Ser Gly Ala Gly Lys

195 200 205

Thr Phe Lys Thr Trp Pro Glu Met Arg Asp Asn Leu Ile Pro His Ile

210 215 220

Gly Gly Glu Glu Glu Lys Ser Glu Thr Glu Pro Leu Lys Ile Trp Gly

225 230 235 240

Glu Ala Thr Ser Asp Gly Ile Val Pro Ala Asn Gly Pro Lys Ile Arg

245 250 255

Ala Arg Cys Val Arg Val Gly Val Ala Asp Gly His Thr Ala Gln Val

260 265 270

Thr Val Ala Phe Lys Ala Val Pro Thr Ala Ala Gln Ile Leu Glu Arg

275 280 285

Trp Glu Arg His Arg Gly Arg Ala Ala Asp Leu Pro Ser Ala Pro Arg

290 295 300

Arg Leu Ile His Tyr Arg Pro Glu Lys Asp Arg Pro Gln Pro Ala Leu

305 310 315 320

Asp Val Met Thr Glu Asn Gly Met Ala Val Thr Val Gly His Leu Arg

325 330 335

Val Glu Pro Asp Glu Ala Thr Ala Ser Phe Thr Ala Leu Ala His Asn

340 345 350

Ala Ile Leu Gly Ala Ala Gly Gly Ala Val Trp Ala Thr Glu Ala Ala

355 360 365

Leu Ala Arg Gly Leu Leu Tyr Arg Arg Ile Ala Pro Gln Arg Lys Thr

370 375 380

Arg Pro Gln Thr Ala Lys Ala Leu

385 390

<210> 87

<211> 1245

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_3 sequence from an unknown bacterial species from an environmental sample

<400> 87

atgtattatg gatttgaagt aatcatcccg ctccggctaa aactggcggg taacatcaac 60

aaaggcggaa acaacgaggt agtaatggct catatagaag aaatcgctcg tcttgaggcc 120

gggagcgccc gactctcagg cggaaagttg aaggtcgctg tgctgggcgc aaccgggatg 180

gtgggccaac agttaataag acttctatcg gatcatccct ggtttgaagt tgttgtggtt 240

gccgcctcaa ctaattctga aggtagctct tattcggaag ctgtccgggg tcgttggaca 300

atggcagctc gcatcccgga tgaaattgcc gcaatgaacg tttgggacgt tcagtcggtg 360

gacgagatag cggctcaagt tgatatcgcg ttctgcgcaa tcaaccttga taaggaaggt 420

gtgttgaagt tggaacacgc atacgcagct gcgggagtat gggttacttc caacaattcg 480

gcgtaccggc cggacccgtt cgtgcctatg gtgattcccg ccgtcaatcc gcatcacctg 540

gacttgatcc cccaccagcg ccaaacaaag ggctatcgaa ccggcgcgct tatagtaaag 600

agcaattgct ctattcagag ttatgtaatc gcactggacc cgctccggga attcggcatc 660

gaaaacgtaa gtatccacag cgaacaggcg atctccggcg cgggtaagac gtttgagact 720

tttccggata ttgagcgcaa cctgattcca ctaattaacg gcgaagaaaa gaagtcagag 780

gtcgagccgc tgaagatctg gggccagctc gaggctggag gaattgtgcc agcaacagga 840

ccacgcatta gggcgaagtg cgttagagta ggtgtcctcc atggacatac agcttatgcg 900

acagtgagat tccgagacac tccaactgtg gcccagatcc tggaacgatg ggagaactac 960

aaatcaccga accaacttcc atcgtcgcct cggaaattga ttcactactt gccagaaccc 1020

gaccggcccc agccacgtct ggatgtgatg acggagaacg gcatggcagt aagtatcgga 1080

caattgaaga ttgacagcga taagtctgtt tcctttaccg gcctttctca taatctgatc 1140

ctgggagctg ctggtggtgc cgtacttgcc accgaagcag ccgttgccag ggaacttgtc 1200

tatcgcagaa tcttatctcg ccaggagata ccgcagccgg catag 1245

<210> 88

<211> 414

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_3 sequence from an unknown bacterial species from an environmental sample

<400> 88

Met Tyr Tyr Gly Phe Glu Val Ile Ile Pro Leu Arg Leu Lys Leu Ala

1 5 10 15

Gly Asn Ile Asn Lys Gly Gly Asn Asn Glu Val Val Met Ala His Ile

20 25 30

Glu Glu Ile Ala Arg Leu Glu Ala Gly Ser Ala Arg Leu Ser Gly Gly

35 40 45

Lys Leu Lys Val Ala Val Leu Gly Ala Thr Gly Met Val Gly Gln Gln

50 55 60

Leu Ile Arg Leu Leu Ser Asp His Pro Trp Phe Glu Val Val Val Val

65 70 75 80

Ala Ala Ser Thr Asn Ser Glu Gly Ser Ser Tyr Ser Glu Ala Val Arg

85 90 95

Gly Arg Trp Thr Met Ala Ala Arg Ile Pro Asp Glu Ile Ala Ala Met

100 105 110

Asn Val Trp Asp Val Gln Ser Val Asp Glu Ile Ala Ala Gln Val Asp

115 120 125

Ile Ala Phe Cys Ala Ile Asn Leu Asp Lys Glu Gly Val Leu Lys Leu

130 135 140

Glu His Ala Tyr Ala Ala Ala Gly Val Trp Val Thr Ser Asn Asn Ser

145 150 155 160

Ala Tyr Arg Pro Asp Pro Phe Val Pro Met Val Ile Pro Ala Val Asn

165 170 175

Pro His His Leu Asp Leu Ile Pro His Gln Arg Gln Thr Lys Gly Tyr

180 185 190

Arg Thr Gly Ala Leu Ile Val Lys Ser Asn Cys Ser Ile Gln Ser Tyr

195 200 205

Val Ile Ala Leu Asp Pro Leu Arg Glu Phe Gly Ile Glu Asn Val Ser

210 215 220

Ile His Ser Glu Gln Ala Ile Ser Gly Ala Gly Lys Thr Phe Glu Thr

225 230 235 240

Phe Pro Asp Ile Glu Arg Asn Leu Ile Pro Leu Ile Asn Gly Glu Glu

245 250 255

Lys Lys Ser Glu Val Glu Pro Leu Lys Ile Trp Gly Gln Leu Glu Ala

260 265 270

Gly Gly Ile Val Pro Ala Thr Gly Pro Arg Ile Arg Ala Lys Cys Val

275 280 285

Arg Val Gly Val Leu His Gly His Thr Ala Tyr Ala Thr Val Arg Phe

290 295 300

Arg Asp Thr Pro Thr Val Ala Gln Ile Leu Glu Arg Trp Glu Asn Tyr

305 310 315 320

Lys Ser Pro Asn Gln Leu Pro Ser Ser Pro Arg Lys Leu Ile His Tyr

325 330 335

Leu Pro Glu Pro Asp Arg Pro Gln Pro Arg Leu Asp Val Met Thr Glu

340 345 350

Asn Gly Met Ala Val Ser Ile Gly Gln Leu Lys Ile Asp Ser Asp Lys

355 360 365

Ser Val Ser Phe Thr Gly Leu Ser His Asn Leu Ile Leu Gly Ala Ala

370 375 380

Gly Gly Ala Val Leu Ala Thr Glu Ala Ala Val Ala Arg Glu Leu Val

385 390 395 400

Tyr Arg Arg Ile Leu Ser Arg Gln Glu Ile Pro Gln Pro Ala

405 410

<210> 89

<211> 1089

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_4 sequence from an unknown bacterial species from an environmental sample

<400> 89

atgagctcct ctcgcatccc ggtccttata ctcggtgcga ccggcatggt cggacagaga 60

ttcgtctccc tcttgagcga tcatccctgg tttcaaatag ctggtgttgc ggcgtccccg 120

aattcggcag gggttcctta ccgcgaagcc gttcgtggca gatggctcct tgagtctgac 180

attccagata gtgtgggtgg tctcacggtg taccgtgtag aggaagatgc cgaattcctt 240

tcgaagttgg gccaggtggc tttttgtgcg ctcgatctac cgaaggaacg ggtccaggct 300

atcgagtgtg attatgcgcg tcggggcgtg gctgttatat cgaacaattc tgcacaccgt 360

cttacgagtg acgtgccggt cttaatgcca gagataaatc ctgatcacag tgagattata 420

actcagcaaa gaaaaaaccg agggtggtca tgcgggctta tcgcggtgaa gccaaactgc 480

tcaatccagt cttatgtccc agtccttgcc gccctttctg agctcaaggt cgagcgagta 540

tcagtcacga cgctgcaagc agtctccggc gccggaaaga ccctaaacag ctggccagag 600

atggtcgaca atgtgattcc tttcattcgt ggcgaagaag aaaagagcga gattgaacct 660

ctcaaagttc tcggtaccgt aaccgaggct ggaatcgcac cgagacggga cctcaagatc 720

tctgcaactt gtattcgagt cccggtctct gacgggcata tggccagtct caccttttcg 780

ctcggcgtat cggcctcagc gggggagatt gttgagcgat taaaggcttt caaaccaaga 840

tctaaaggct tagaactccc ctcgtcgccg gaactgtttc ttgcatattc gtccgatgac 900

gaccggccgc agactcgtct tgatcgggac actgaaagag gaatgggagt ttcagttgga 960

cgtctccgag aagattctgt tctggggtgg aagtgtgttg cactttcgca caatacggtt 1020

cgcggggctg ccggcggtgc cgtactgatg gcagagcttc ttcataagca gggctatatc 1080

aaaggctaa 1089

<210> 90

<211> 362

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_4 sequence from an unknown bacterial species from an environmental sample

<400> 90

Met Ser Ser Ser Arg Ile Pro Val Leu Ile Leu Gly Ala Thr Gly Met

1 5 10 15

Val Gly Gln Arg Phe Val Ser Leu Leu Ser Asp His Pro Trp Phe Gln

20 25 30

Ile Ala Gly Val Ala Ala Ser Pro Asn Ser Ala Gly Val Pro Tyr Arg

35 40 45

Glu Ala Val Arg Gly Arg Trp Leu Leu Glu Ser Asp Ile Pro Asp Ser

50 55 60

Val Gly Gly Leu Thr Val Tyr Arg Val Glu Glu Asp Ala Glu Phe Leu

65 70 75 80

Ser Lys Leu Gly Gln Val Ala Phe Cys Ala Leu Asp Leu Pro Lys Glu

85 90 95

Arg Val Gln Ala Ile Glu Cys Asp Tyr Ala Arg Arg Gly Val Ala Val

100 105 110

Ile Ser Asn Asn Ser Ala His Arg Leu Thr Ser Asp Val Pro Val Leu

115 120 125

Met Pro Glu Ile Asn Pro Asp His Ser Glu Ile Ile Thr Gln Gln Arg

130 135 140

Lys Asn Arg Gly Trp Ser Cys Gly Leu Ile Ala Val Lys Pro Asn Cys

145 150 155 160

Ser Ile Gln Ser Tyr Val Pro Val Leu Ala Ala Leu Ser Glu Leu Lys

165 170 175

Val Glu Arg Val Ser Val Thr Thr Leu Gln Ala Val Ser Gly Ala Gly

180 185 190

Lys Thr Leu Asn Ser Trp Pro Glu Met Val Asp Asn Val Ile Pro Phe

195 200 205

Ile Arg Gly Glu Glu Glu Lys Ser Glu Ile Glu Pro Leu Lys Val Leu

210 215 220

Gly Thr Val Thr Glu Ala Gly Ile Ala Pro Arg Arg Asp Leu Lys Ile

225 230 235 240

Ser Ala Thr Cys Ile Arg Val Pro Val Ser Asp Gly His Met Ala Ser

245 250 255

Leu Thr Phe Ser Leu Gly Val Ser Ala Ser Ala Gly Glu Ile Val Glu

260 265 270

Arg Leu Lys Ala Phe Lys Pro Arg Ser Lys Gly Leu Glu Leu Pro Ser

275 280 285

Ser Pro Glu Leu Phe Leu Ala Tyr Ser Ser Asp Asp Asp Arg Pro Gln

290 295 300

Thr Arg Leu Asp Arg Asp Thr Glu Arg Gly Met Gly Val Ser Val Gly

305 310 315 320

Arg Leu Arg Glu Asp Ser Val Leu Gly Trp Lys Cys Val Ala Leu Ser

325 330 335

His Asn Thr Val Arg Gly Ala Ala Gly Gly Ala Val Leu Met Ala Glu

340 345 350

Leu Leu His Lys Gln Gly Tyr Ile Lys Gly

355 360

<210> 91

<211> 1086

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_5 sequence from an unknown bacterial species from an environmental sample

<400> 91

atggcaaaca agctcaaggt aggcgtcctc ggcgcgaccg gcatggtcgg ccagaggttc 60

gtatcgctgc tcgcggacca tccgtggttc gaggtctctg cggtcgctgc aagcgcatcg 120

agcgcaggca aatcgtacgc agacgcagtc tcggggcgct ggacgctcga gacgcccgtc 180

ccgacagccg ttgcaaaaca gacggtcagc gacgcgtcgc agattcaaaa gatcgcggac 240

gcctgcgact tcgtcgtgtg cgcggtcgat atggacaaag cggcgaccgc gaagctcgaa 300

gaagactacg cgcgtgcgga gacgccggtc gtttcgaaca actcggcgca tcgttggacg 360

ccagacgtcc cgatgatgat ccccgagatc aactcgcacc acaccgacgt gatcgaagcg 420

caacgcaagc gcctcggcac gaagcgcggc ttcatcgcgg tgaaaccgaa ctgttcgatc 480

caatcgtacg tccctgcgat ccatccgctc gcggcattca agcccacgaa gatcgcggtc 540

tgcacgtatc aagcgatcag cggtgcgggc aaaacgttcc agtcatggcc cgacatgatc 600

gacaacgtca tccccttcat caaaggtgaa gaggaaaaga gcgagaagga accgctcaag 660

gtgtggggca cggtgaaggg cggagaaatt gtcgccgctc catcgccgac gatcacggcg 720

caatgcattc gcgtgcccgt cagtgacggt cacatggccg cggtgttcgt cgcgttcgag 780

cgcaagccaa cgcgcgagca aatcctctct gcgtggaaag agttcacggg taagccgcag 840

caagcgaagc tgccgagcgc gccaacgccg ttcctcaatt acttcgagga cgacacgcgc 900

ccgcagacca aactcgatcg cgacaacggt gacggccaag ccatctcgat cggtcgtctg 960

cgcgaagacg cgatgttcga ttggaagttc gtcgcgcttt cacacaacac cgtccgcggt 1020

gccgcaggtg gcgctgtact cactgccgag ttcctcaaac atgaaggttt cctggcagcg 1080

aagtag 1086

<210> 92

<211> 361

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_5 sequence from an unknown bacterial species from an environmental sample

<400> 92

Met Ala Asn Lys Leu Lys Val Gly Val Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Val Ser Leu Leu Ala Asp His Pro Trp Phe Glu Val

20 25 30

Ser Ala Val Ala Ala Ser Ala Ser Ser Ala Gly Lys Ser Tyr Ala Asp

35 40 45

Ala Val Ser Gly Arg Trp Thr Leu Glu Thr Pro Val Pro Thr Ala Val

50 55 60

Ala Lys Gln Thr Val Ser Asp Ala Ser Gln Ile Gln Lys Ile Ala Asp

65 70 75 80

Ala Cys Asp Phe Val Val Cys Ala Val Asp Met Asp Lys Ala Ala Thr

85 90 95

Ala Lys Leu Glu Glu Asp Tyr Ala Arg Ala Glu Thr Pro Val Val Ser

100 105 110

Asn Asn Ser Ala His Arg Trp Thr Pro Asp Val Pro Met Met Ile Pro

115 120 125

Glu Ile Asn Ser His His Thr Asp Val Ile Glu Ala Gln Arg Lys Arg

130 135 140

Leu Gly Thr Lys Arg Gly Phe Ile Ala Val Lys Pro Asn Cys Ser Ile

145 150 155 160

Gln Ser Tyr Val Pro Ala Ile His Pro Leu Ala Ala Phe Lys Pro Thr

165 170 175

Lys Ile Ala Val Cys Thr Tyr Gln Ala Ile Ser Gly Ala Gly Lys Thr

180 185 190

Phe Gln Ser Trp Pro Asp Met Ile Asp Asn Val Ile Pro Phe Ile Lys

195 200 205

Gly Glu Glu Glu Lys Ser Glu Lys Glu Pro Leu Lys Val Trp Gly Thr

210 215 220

Val Lys Gly Gly Glu Ile Val Ala Ala Pro Ser Pro Thr Ile Thr Ala

225 230 235 240

Gln Cys Ile Arg Val Pro Val Ser Asp Gly His Met Ala Ala Val Phe

245 250 255

Val Ala Phe Glu Arg Lys Pro Thr Arg Glu Gln Ile Leu Ser Ala Trp

260 265 270

Lys Glu Phe Thr Gly Lys Pro Gln Gln Ala Lys Leu Pro Ser Ala Pro

275 280 285

Thr Pro Phe Leu Asn Tyr Phe Glu Asp Asp Thr Arg Pro Gln Thr Lys

290 295 300

Leu Asp Arg Asp Asn Gly Asp Gly Gln Ala Ile Ser Ile Gly Arg Leu

305 310 315 320

Arg Glu Asp Ala Met Phe Asp Trp Lys Phe Val Ala Leu Ser His Asn

325 330 335

Thr Val Arg Gly Ala Ala Gly Gly Ala Val Leu Thr Ala Glu Phe Leu

340 345 350

Lys His Glu Gly Phe Leu Ala Ala Lys

355 360

<210> 93

<211> 1089

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_6 sequence from an unknown bacterial species from an environmental sample

<400> 93

atgacagctc acgatctcaa ggttgccgtg ctcggcgcga ccggcatggt cggacagcgg 60

ttcgtctctc tcctcgacgg ccatccctgg ttccaggtga cggtcgtggc cgcgagcgcc 120

cgatcggccg ggcggcccta cggcgatgcg gtcgccggcc gctggtcgct ctcgaaaccg 180

gtcccggccc ggatcgcgga acgggtcgtg cgggacgcct cgaacatcgc cgccatcgcg 240

gacgaggtgg acctcgtctt ctgcgcggtc gatctatcga aggaggagac gcgcgccctc 300

gaggacgggt atgcccggcg cgagacgccg gtcatctcga acaactccgc gcaccgcggc 360

accccggacg tcccgatgat gatcccggag gtcaacccgg agcacgccga gatcatcgcg 420

gcgcagcggc ggcgtctcgg cacccgacgc ggcttcgtgg cggtgaagcc gaactgctcg 480

ctccagtctt acctgcccgc gctccatccg ctccgcgatc tcggcctcga gaaggtcatg 540

gtggcgacct accaggccat ctcgggggcc ggcaagacct tcgcgtcctg gccggagatg 600

accgacaacg tcatcccctt catcaagggc gaggaggaga agagcgagca ggagccgctc 660

aagatctggg gtcgcgtgga cggcgaccgg atcgctccgg cccgcgagcc gatcatctcc 720

gcgcaatgca tccgcgtccc ggtgaccgat ggacacctgg cggcggtctc tctgtcgctg 780

gctcgcaagc agacgcccga ggcgatcatc cggcgctggc gcgaatacga gggcaagccg 840

cagcgcctcg gattgccgag cgccccgcgt ccgttcctgg tctatcacga cgacgactcg 900

agaccgcaga cccggctcga ccgcgacgcc ggcaatggaa tggcgatcag catgggccgg 960

ctccgccccg atccgctgtt cgactatcgc ttcgtcgcgc tgtcgcacaa cacggtgcgc 1020

ggcgccgccg gcggcggggt actcacggcc gagctgctcg tcgcggacgg ctacatcgag 1080

cgcaagtag 1089

<210> 94

<211> 362

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_6 sequence from an unknown bacterial species from an environmental sample

<400> 94

Met Thr Ala His Asp Leu Lys Val Ala Val Leu Gly Ala Thr Gly Met

1 5 10 15

Val Gly Gln Arg Phe Val Ser Leu Leu Asp Gly His Pro Trp Phe Gln

20 25 30

Val Thr Val Val Ala Ala Ser Ala Arg Ser Ala Gly Arg Pro Tyr Gly

35 40 45

Asp Ala Val Ala Gly Arg Trp Ser Leu Ser Lys Pro Val Pro Ala Arg

50 55 60

Ile Ala Glu Arg Val Val Arg Asp Ala Ser Asn Ile Ala Ala Ile Ala

65 70 75 80

Asp Glu Val Asp Leu Val Phe Cys Ala Val Asp Leu Ser Lys Glu Glu

85 90 95

Thr Arg Ala Leu Glu Asp Gly Tyr Ala Arg Arg Glu Thr Pro Val Ile

100 105 110

Ser Asn Asn Ser Ala His Arg Gly Thr Pro Asp Val Pro Met Met Ile

115 120 125

Pro Glu Val Asn Pro Glu His Ala Glu Ile Ile Ala Ala Gln Arg Arg

130 135 140

Arg Leu Gly Thr Arg Arg Gly Phe Val Ala Val Lys Pro Asn Cys Ser

145 150 155 160

Leu Gln Ser Tyr Leu Pro Ala Leu His Pro Leu Arg Asp Leu Gly Leu

165 170 175

Glu Lys Val Met Val Ala Thr Tyr Gln Ala Ile Ser Gly Ala Gly Lys

180 185 190

Thr Phe Ala Ser Trp Pro Glu Met Thr Asp Asn Val Ile Pro Phe Ile

195 200 205

Lys Gly Glu Glu Glu Lys Ser Glu Gln Glu Pro Leu Lys Ile Trp Gly

210 215 220

Arg Val Asp Gly Asp Arg Ile Ala Pro Ala Arg Glu Pro Ile Ile Ser

225 230 235 240

Ala Gln Cys Ile Arg Val Pro Val Thr Asp Gly His Leu Ala Ala Val

245 250 255

Ser Leu Ser Leu Ala Arg Lys Gln Thr Pro Glu Ala Ile Ile Arg Arg

260 265 270

Trp Arg Glu Tyr Glu Gly Lys Pro Gln Arg Leu Gly Leu Pro Ser Ala

275 280 285

Pro Arg Pro Phe Leu Val Tyr His Asp Asp Asp Ser Arg Pro Gln Thr

290 295 300

Arg Leu Asp Arg Asp Ala Gly Asn Gly Met Ala Ile Ser Met Gly Arg

305 310 315 320

Leu Arg Pro Asp Pro Leu Phe Asp Tyr Arg Phe Val Ala Leu Ser His

325 330 335

Asn Thr Val Arg Gly Ala Ala Gly Gly Gly Val Leu Thr Ala Glu Leu

340 345 350

Leu Val Ala Asp Gly Tyr Ile Glu Arg Lys

355 360

<210> 95

<211> 1059

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_7 sequence from an unknown bacterial species from an environmental sample

<400> 95

gtgaagcaac gcgtcgggat cctcggagcg acggggctcg tcgggcagcg gctcgtccgc 60

atgctcgagg gccatccgct cttcgaagtg agcgcactgg ctgcctccga ccgctccgag 120

ggacggcctt acgcggacgc gtgcccgtgg cggctcgccg accccatgcc cgaatcggtc 180

gcgggcttgc ggctggcccc ctgctcgccg ccgctcgact gcgacttcgt catcgcgagc 240

ctgccgtccg aggtcgccct cgaggccgag acggccttcg ccgcggccgg ctacccggtc 300

gtcagcaact cctcggccct ccgcatggcc gaggacgtac cgctcgtcgt ccccgaggtc 360

aaccccgacc acacggccct gctggccgag cagcgacgcc ggcgcggctg ggatcgtggc 420

ttcgtcctcg cgaacccgaa ctgctcgacg atcgcgctcg cgctcgcgct cgccccgctc 480

gaacgccgct tcggcctcga ggccgtcgtc gtgacgacga tgcaggcgat ctcgggcgcc 540

ggatacccgg gcgtctcggc cgtcgacatc gccgacaacg tcctgcccca catcgcgggc 600

gaggaggaga agctcgagac cgagccgctc aagatcttcg ggcgcttcac cggcgccggg 660

atcgagccgg cgagctttgc cgtcagcggc caatgccacc gcgtcgccgt ccaggacggc 720

cacctcgaag cggtccgcgt caagctcgcg cggcgcgcct cggtcgcgga ggtcgtcgag 780

gccctcgaaa ccttccgggg cctgccgcag gagctgcgcc tgccgacggc gcccgagcgc 840

cccgtcgtcg tccggcgtga aacggaccgg cctcagcccc gcctcgaccg cgacgccgag 900

ggcggcatgg ccacggtcgt cggccggatc gccgccgacc gcgtcctcga cttcaagctc 960

acgctcctcg gccacaacac gatccggggc gccgcgggcg gggcgctcct caacgccgaa 1020

ctgctcgacg cccagggcct gctcgggccc cgcgcgtga 1059

<210> 96

<211> 352

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_7 sequence from an unknown bacterial species from an environmental sample

<400> 96

Val Lys Gln Arg Val Gly Ile Leu Gly Ala Thr Gly Leu Val Gly Gln

1 5 10 15

Arg Leu Val Arg Met Leu Glu Gly His Pro Leu Phe Glu Val Ser Ala

20 25 30

Leu Ala Ala Ser Asp Arg Ser Glu Gly Arg Pro Tyr Ala Asp Ala Cys

35 40 45

Pro Trp Arg Leu Ala Asp Pro Met Pro Glu Ser Val Ala Gly Leu Arg

50 55 60

Leu Ala Pro Cys Ser Pro Pro Leu Asp Cys Asp Phe Val Ile Ala Ser

65 70 75 80

Leu Pro Ser Glu Val Ala Leu Glu Ala Glu Thr Ala Phe Ala Ala Ala

85 90 95

Gly Tyr Pro Val Val Ser Asn Ser Ser Ala Leu Arg Met Ala Glu Asp

100 105 110

Val Pro Leu Val Val Pro Glu Val Asn Pro Asp His Thr Ala Leu Leu

115 120 125

Ala Glu Gln Arg Arg Arg Arg Gly Trp Asp Arg Gly Phe Val Leu Ala

130 135 140

Asn Pro Asn Cys Ser Thr Ile Ala Leu Ala Leu Ala Leu Ala Pro Leu

145 150 155 160

Glu Arg Arg Phe Gly Leu Glu Ala Val Val Val Thr Thr Met Gln Ala

165 170 175

Ile Ser Gly Ala Gly Tyr Pro Gly Val Ser Ala Val Asp Ile Ala Asp

180 185 190

Asn Val Leu Pro His Ile Ala Gly Glu Glu Glu Lys Leu Glu Thr Glu

195 200 205

Pro Leu Lys Ile Phe Gly Arg Phe Thr Gly Ala Gly Ile Glu Pro Ala

210 215 220

Ser Phe Ala Val Ser Gly Gln Cys His Arg Val Ala Val Gln Asp Gly

225 230 235 240

His Leu Glu Ala Val Arg Val Lys Leu Ala Arg Arg Ala Ser Val Ala

245 250 255

Glu Val Val Glu Ala Leu Glu Thr Phe Arg Gly Leu Pro Gln Glu Leu

260 265 270

Arg Leu Pro Thr Ala Pro Glu Arg Pro Val Val Val Arg Arg Glu Thr

275 280 285

Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Glu Gly Gly Met Ala

290 295 300

Thr Val Val Gly Arg Ile Ala Ala Asp Arg Val Leu Asp Phe Lys Leu

305 310 315 320

Thr Leu Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly Gly Ala Leu

325 330 335

Leu Asn Ala Glu Leu Leu Asp Ala Gln Gly Leu Leu Gly Pro Arg Ala

340 345 350

<210> 97

<211> 1146

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_8 sequence from an unknown bacterial species from an environmental sample

<400> 97

atgaacaaga aattccgagt cggcattctc ggggcaaccg gcatggtcgg tcagcgattc 60

gtccaactgc tggagaatca tccgcttttt gaaatcacgg cgctggcggc gtctggtcgt 120

tcgcaaggaa agacttacgc cgaagcctgc acctggcgtt tgcccggcga attgccggat 180

ggcgtgaaac agatcgtcgt gcagccgccc gcgccgccac tcgactgcga tttcgttttc 240

tccagtttgc cgggcgaggt cgcggctgat gccgagctta agttcgcacg aatggatttt 300

ccagtgatca gtaattcttc atcgcatcgc atggcgccgg atgttccgtt gctgattccg 360

gaagtcaatc ctgaacacgt cgaactgatc gacgcgcagc gcattaaccg cgaatacaat 420

cgcgggttca tcgtcacaaa tcccaactgc tcagcgatcg cggttgtgtt ggcactggcg 480

ccgttgcatg caaagtttgg cgtgagcgag tgcgtcgtga ccacgatgca agccctctcg 540

ggcgccggtt atccaggtgt tgcttctctc gacgccattg acaacgtaat tccattcatc 600

ggcggcgagg acgagaaggt cgagatcgaa accaaaaagc tcctcggcgt cgtgagccag 660

ggcacaatcg cggacgctaa cctgaaagtc agcgcgcaat gtaaccgcgt gaatgtgacc 720

gacggtcaca tggcttcgat tcgggtgaaa ctggcgcagc cggcatccac cagcgaagtt 780

atcgacgtgc tcgcatcgtt caccgccgag ccccaaaagc tgaaacttca ctcagcgccg 840

gcgaaaccac tcatcgtccg cgacgaaatt gatcggccac agcctcgact tgatcgtgat 900

gcgggaaatg gaatgagcgt taccgtgggg cgactcgcga aagataacgt tttggattat 960

cgcttcgtgg cgctgggtca taacacgatt cgcggcgccg cgggggcggc gattctgaat 1020

gcagagttgc tggtggcgaa aggatactcg cgtgactacg cgccgccaag atggcggcga 1080

accgcaggca gggatgcctg cgctcccatc cattccgtaa tcactcgttt caactcgtcc 1140

aactga 1146

<210> 98

<211> 381

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_8 sequence from an unknown bacterial species from an environmental sample

<400> 98

Met Asn Lys Lys Phe Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Val Gln Leu Leu Glu Asn His Pro Leu Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Gly Arg Ser Gln Gly Lys Thr Tyr Ala Glu

35 40 45

Ala Cys Thr Trp Arg Leu Pro Gly Glu Leu Pro Asp Gly Val Lys Gln

50 55 60

Ile Val Val Gln Pro Pro Ala Pro Pro Leu Asp Cys Asp Phe Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Glu Val Ala Ala Asp Ala Glu Leu Lys Phe Ala

85 90 95

Arg Met Asp Phe Pro Val Ile Ser Asn Ser Ser Ser His Arg Met Ala

100 105 110

Pro Asp Val Pro Leu Leu Ile Pro Glu Val Asn Pro Glu His Val Glu

115 120 125

Leu Ile Asp Ala Gln Arg Ile Asn Arg Glu Tyr Asn Arg Gly Phe Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Ala Ile Ala Val Val Leu Ala Leu Ala

145 150 155 160

Pro Leu His Ala Lys Phe Gly Val Ser Glu Cys Val Val Thr Thr Met

165 170 175

Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Asp Ala

180 185 190

Ile Asp Asn Val Ile Pro Phe Ile Gly Gly Glu Asp Glu Lys Val Glu

195 200 205

Ile Glu Thr Lys Lys Leu Leu Gly Val Val Ser Gln Gly Thr Ile Ala

210 215 220

Asp Ala Asn Leu Lys Val Ser Ala Gln Cys Asn Arg Val Asn Val Thr

225 230 235 240

Asp Gly His Met Ala Ser Ile Arg Val Lys Leu Ala Gln Pro Ala Ser

245 250 255

Thr Ser Glu Val Ile Asp Val Leu Ala Ser Phe Thr Ala Glu Pro Gln

260 265 270

Lys Leu Lys Leu His Ser Ala Pro Ala Lys Pro Leu Ile Val Arg Asp

275 280 285

Glu Ile Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Asn Gly

290 295 300

Met Ser Val Thr Val Gly Arg Leu Ala Lys Asp Asn Val Leu Asp Tyr

305 310 315 320

Arg Phe Val Ala Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Val Ala Lys Gly Tyr Ser Arg Asp

340 345 350

Tyr Ala Pro Pro Arg Trp Arg Arg Thr Ala Gly Arg Asp Ala Cys Ala

355 360 365

Pro Ile His Ser Val Ile Thr Arg Phe Asn Ser Ser Asn

370 375 380

<210> 99

<211> 1128

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_9 sequence from an unknown bacterial species from an environmental sample

<400> 99

atgctcgcag ttgaaaccga aactactgaa ggtcccagct tacaaggtcg gaagaatcgc 60

gtcggaattt taggcgcgac gggaacggtt ggtcagcgct tcattcaact tctggagcac 120

catccgcaat tcgaggtgac ggcgctggcc gcctcggatc gctcgcaggg acggcgatac 180

gcggatgcct gcacctggcg tctgccgggc gcgatgccgg aatcggtacg cgcgctgatg 240

gtggaagccc cggcgccgcc gcttgactgc gatttggttt tctcaagcct gccgtcccag 300

attgcccgcg atgctgaggt tgccttcgcg agagccggct atcccgtcat cagcaactcg 360

tccgcctgcc gcatggatga cgacgtgccg cttttgatcc cggaggtaaa tgccgagcat 420

ctcgggattc tcgatcacca acggaagctg cgccgttttc cggggaacgg tttcattgtt 480

accaatccaa attgtgcggc aattgtattg gcgccggtgc tggccgcgct acatgaacgt 540

ttccaagtcg tttcggtcat tgccactacc atgcaggcga tctccggcgc tggttatccc 600

ggcgtggcct cccttgatat tgtcgataat ctgattcctt ttatcgacgg cgaagaggac 660

aagattgaag cggagactct gaagattctc ggccgactag atacggaaag aattgagccc 720

gcaagaatcc ttattagtgc gcagtgccat cgtgtcaatg tcattgatgg ccacacggta 780

gcggcgcggc tgaagctggc gcgccaacca cagcttgatg aagtgcgcga tgtcctgcgg 840

tcattcagat cgttgccgca agaactgcgc cttcactcag ctccggaaaa accaattgtg 900

gtgcatgacg aggttgaccg gccgcagccg cggctcgatc gcgatgcggg caacggcatg 960

agtattactg tcggtcgcct ggcgggcgat cgcgtcttgg actttcgctt ggtggcgctc 1020

ggtcataaca cgattcgcgg agccgcgggc gcggccatct tgaatgctga attgttattg 1080

gcgaaaggac atttttccaa gacttcggat agggcgatgg ccgcttag 1128

<210> 100

<211> 375

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_9 sequence from an unknown bacterial species from an environmental sample

<400> 100

Met Leu Ala Val Glu Thr Glu Thr Thr Glu Gly Pro Ser Leu Gln Gly

1 5 10 15

Arg Lys Asn Arg Val Gly Ile Leu Gly Ala Thr Gly Thr Val Gly Gln

20 25 30

Arg Phe Ile Gln Leu Leu Glu His His Pro Gln Phe Glu Val Thr Ala

35 40 45

Leu Ala Ala Ser Asp Arg Ser Gln Gly Arg Arg Tyr Ala Asp Ala Cys

50 55 60

Thr Trp Arg Leu Pro Gly Ala Met Pro Glu Ser Val Arg Ala Leu Met

65 70 75 80

Val Glu Ala Pro Ala Pro Pro Leu Asp Cys Asp Leu Val Phe Ser Ser

85 90 95

Leu Pro Ser Gln Ile Ala Arg Asp Ala Glu Val Ala Phe Ala Arg Ala

100 105 110

Gly Tyr Pro Val Ile Ser Asn Ser Ser Ala Cys Arg Met Asp Asp Asp

115 120 125

Val Pro Leu Leu Ile Pro Glu Val Asn Ala Glu His Leu Gly Ile Leu

130 135 140

Asp His Gln Arg Lys Leu Arg Arg Phe Pro Gly Asn Gly Phe Ile Val

145 150 155 160

Thr Asn Pro Asn Cys Ala Ala Ile Val Leu Ala Pro Val Leu Ala Ala

165 170 175

Leu His Glu Arg Phe Gln Val Val Ser Val Ile Ala Thr Thr Met Gln

180 185 190

Ala Ile Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Asp Ile Val

195 200 205

Asp Asn Leu Ile Pro Phe Ile Asp Gly Glu Glu Asp Lys Ile Glu Ala

210 215 220

Glu Thr Leu Lys Ile Leu Gly Arg Leu Asp Thr Glu Arg Ile Glu Pro

225 230 235 240

Ala Arg Ile Leu Ile Ser Ala Gln Cys His Arg Val Asn Val Ile Asp

245 250 255

Gly His Thr Val Ala Ala Arg Leu Lys Leu Ala Arg Gln Pro Gln Leu

260 265 270

Asp Glu Val Arg Asp Val Leu Arg Ser Phe Arg Ser Leu Pro Gln Glu

275 280 285

Leu Arg Leu His Ser Ala Pro Glu Lys Pro Ile Val Val His Asp Glu

290 295 300

Val Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Asn Gly Met

305 310 315 320

Ser Ile Thr Val Gly Arg Leu Ala Gly Asp Arg Val Leu Asp Phe Arg

325 330 335

Leu Val Ala Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala

340 345 350

Ile Leu Asn Ala Glu Leu Leu Leu Ala Lys Gly His Phe Ser Lys Thr

355 360 365

Ser Asp Arg Ala Met Ala Ala

370 375

<210> 101

<211> 1065

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_10 sequence from an unknown bacterial species from an environmental sample

<400> 101

atgaaaaaga aataccgtgt cggaatctta ggcgcgacgg ggacggtcgg gcagcggttc 60

attcaactgc tcgaaggcca cccgcagttc gaggtgacgg cgctcgcggc ctcggaccgc 120

tcgcaggggc ggccttacgc cgaggcgtgc gcgtggcgtc tgccgggcga gatgccggag 180

gccgtgcgct caatcgaggt ccgaacgccc gcgccgccgc tcgactgcga ccttgtcttt 240

tcgagcctcc ccggcgagat ggcgcgcgag gcggaagaat ctttcgcggg cgccggctac 300

gccgtcgtca gcaactcttc ggcgctcagg atggacgagg acgtgccgct actgataccg 360

gaggtcaacc acgagcacct cgcgctgctc gacgcgcaac gcgagcggcg cggctacgaa 420

agaggcttcg tcgtcaccaa cccgaactgc tcgaccgtcg tcgtcgcgct cgcgctcgcg 480

ccgctgcacg cgaggttcgg cgtcgaggcg gtcgcggccg tcaccatgca ggccatttcc 540

ggcgcgggct accccggcgt cgcctcgctc gacatcgccg acaacgtcct gccccacatc 600

tccggcgagg aggaaaaaat agagagcgag accggcaaga tactcggccg cctagcgggc 660

gggggcgcgt cggcgcgcgt cgagcgcgcg cagttccccg tcagcgcgca gtgccaccgc 720

gtcggcgtaa cggacggaca cacggcggcc gtccgcatca aactctcacg ccccgccgaa 780

cccggtgagc tgcgcgaggc cttcgccgcc tacacttcgc tgccgcagga gttgaaactc 840

cacaacgcgc ccgaacgccc cgtcgtcttc cgcgacgaag acgaccgccc gcagcccaaa 900

ctcgaccgcg acgccggagg cgggatgagc gtcaccgtcg gccgcctccg gcgcgaccgc 960

gtgatggact accgcttcgt cgccctcggc cacaacaccg tccgcggcgc ggccggcgcg 1020

gccatcctca acgccgaact gctcgccgcc accggacgac tgtaa 1065

<210> 102

<211> 354

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_10 sequence from an unknown bacterial species from an environmental sample

<400> 102

Met Lys Lys Lys Tyr Arg Val Gly Ile Leu Gly Ala Thr Gly Thr Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Gly His Pro Gln Phe Glu Val

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Arg Pro Tyr Ala Glu

35 40 45

Ala Cys Ala Trp Arg Leu Pro Gly Glu Met Pro Glu Ala Val Arg Ser

50 55 60

Ile Glu Val Arg Thr Pro Ala Pro Pro Leu Asp Cys Asp Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Glu Met Ala Arg Glu Ala Glu Glu Ser Phe Ala

85 90 95

Gly Ala Gly Tyr Ala Val Val Ser Asn Ser Ser Ala Leu Arg Met Asp

100 105 110

Glu Asp Val Pro Leu Leu Ile Pro Glu Val Asn His Glu His Leu Ala

115 120 125

Leu Leu Asp Ala Gln Arg Glu Arg Arg Gly Tyr Glu Arg Gly Phe Val

130 135 140

Val Thr Asn Pro Asn Cys Ser Thr Val Val Val Ala Leu Ala Leu Ala

145 150 155 160

Pro Leu His Ala Arg Phe Gly Val Glu Ala Val Ala Ala Val Thr Met

165 170 175

Gln Ala Ile Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Asp Ile

180 185 190

Ala Asp Asn Val Leu Pro His Ile Ser Gly Glu Glu Glu Lys Ile Glu

195 200 205

Ser Glu Thr Gly Lys Ile Leu Gly Arg Leu Ala Gly Gly Gly Ala Ser

210 215 220

Ala Arg Val Glu Arg Ala Gln Phe Pro Val Ser Ala Gln Cys His Arg

225 230 235 240

Val Gly Val Thr Asp Gly His Thr Ala Ala Val Arg Ile Lys Leu Ser

245 250 255

Arg Pro Ala Glu Pro Gly Glu Leu Arg Glu Ala Phe Ala Ala Tyr Thr

260 265 270

Ser Leu Pro Gln Glu Leu Lys Leu His Asn Ala Pro Glu Arg Pro Val

275 280 285

Val Phe Arg Asp Glu Asp Asp Arg Pro Gln Pro Lys Leu Asp Arg Asp

290 295 300

Ala Gly Gly Gly Met Ser Val Thr Val Gly Arg Leu Arg Arg Asp Arg

305 310 315 320

Val Met Asp Tyr Arg Phe Val Ala Leu Gly His Asn Thr Val Arg Gly

325 330 335

Ala Ala Gly Ala Ala Ile Leu Asn Ala Glu Leu Leu Ala Ala Thr Gly

340 345 350

Arg Leu

<210> 103

<211> 1071

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_11 sequence from an unknown bacterial species from an environmental sample

<400> 103

atgactaaaa agtttcgtgt cggcattctt ggagcgacgg gcgtcgttgg ccagcgcttt 60

attcaattgc tcgaaggcca tccgcaattt gagatcgcgg ctctcgcagc atctgatcga 120

tcgcaggaca aaactttcgc cgaggcgtgc aagtggcgac tgccgggcga catgcctgag 180

cacgtgagag agattgtcgt tcagccgccg gcccccccgc ttgattgcga ctttgttttt 240

tccagtttgc ctaccgacat cgcgactgac gccgaaacac aattcgcgct cgcgggctat 300

ccggtgatca gcaattcatc ttcgcatcgc atgggcgctg acattccttt gttgattccg 360

gaagtgaatt cggatcacat tgcgcttatc gacgtgcagc gaaagaaccg tggctacgag 420

cgcggcttta tcgtcactaa tccaaactgt tccgcgattg ccattgtgat ggcgctcgca 480

ccgcttcatg aaaaattcgg aatcacgtcg tgcgtcgcga ccactatgca ggcgctttcc 540

ggcgccggct atccgggcgt agcgtcgctc gacgcgacgg acaatgtgat cccgtttatt 600

ggcggtgaag aagagaagat cgaagccgag actttaaagt tgctcgggga agtgggcgac 660

ggcgtcatcg acgatgccaa aatgtccgtg agtgcccagt gtaaccgcgt gaatgttacc 720

gacggacatc tcgcgtcgat tcgcgtgaag ctttcgcaat cagcatcact agaccaaata 780

aaagaaacac tgtcttcatt tagggccgta ccgcaggagt tgaaactgca ttcggcgccg 840

gtccggccgg tcatcgttcg cgacgaagtt gatcgccccc agccgcgttt ggatcgggac 900

gcggaaaatg ggatgagcgt cactgtgggc cgaattatgc cggacaacgt gctcgacttc 960

cggtttgtgg cgctcggcca caatacgatc cgtggcgcgg ccggcgcggc gattttgaat 1020

gctgagttgc tggtggcccg aggatatttg agtcagaacc acctgcgata g 1071

<210> 104

<211> 356

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_11 sequence from an unknown bacterial species from an environmental sample

<400> 104

Met Thr Lys Lys Phe Arg Val Gly Ile Leu Gly Ala Thr Gly Val Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Gly His Pro Gln Phe Glu Ile

20 25 30

Ala Ala Leu Ala Ala Ser Asp Arg Ser Gln Asp Lys Thr Phe Ala Glu

35 40 45

Ala Cys Lys Trp Arg Leu Pro Gly Asp Met Pro Glu His Val Arg Glu

50 55 60

Ile Val Val Gln Pro Pro Ala Pro Pro Leu Asp Cys Asp Phe Val Phe

65 70 75 80

Ser Ser Leu Pro Thr Asp Ile Ala Thr Asp Ala Glu Thr Gln Phe Ala

85 90 95

Leu Ala Gly Tyr Pro Val Ile Ser Asn Ser Ser Ser His Arg Met Gly

100 105 110

Ala Asp Ile Pro Leu Leu Ile Pro Glu Val Asn Ser Asp His Ile Ala

115 120 125

Leu Ile Asp Val Gln Arg Lys Asn Arg Gly Tyr Glu Arg Gly Phe Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Ala Ile Ala Ile Val Met Ala Leu Ala

145 150 155 160

Pro Leu His Glu Lys Phe Gly Ile Thr Ser Cys Val Ala Thr Thr Met

165 170 175

Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Asp Ala

180 185 190

Thr Asp Asn Val Ile Pro Phe Ile Gly Gly Glu Glu Glu Lys Ile Glu

195 200 205

Ala Glu Thr Leu Lys Leu Leu Gly Glu Val Gly Asp Gly Val Ile Asp

210 215 220

Asp Ala Lys Met Ser Val Ser Ala Gln Cys Asn Arg Val Asn Val Thr

225 230 235 240

Asp Gly His Leu Ala Ser Ile Arg Val Lys Leu Ser Gln Ser Ala Ser

245 250 255

Leu Asp Gln Ile Lys Glu Thr Leu Ser Ser Phe Arg Ala Val Pro Gln

260 265 270

Glu Leu Lys Leu His Ser Ala Pro Val Arg Pro Val Ile Val Arg Asp

275 280 285

Glu Val Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Glu Asn Gly

290 295 300

Met Ser Val Thr Val Gly Arg Ile Met Pro Asp Asn Val Leu Asp Phe

305 310 315 320

Arg Phe Val Ala Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Val Ala Arg Gly Tyr Leu Ser Gln

340 345 350

Asn His Leu Arg

355

<210> 105

<211> 1053

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_12 sequence from an unknown bacterial species from an environmental sample

<400> 105

atgtccaaaa aattccgagt gggtatcctg ggagcaaccg gcgttgtcgg tcagcgtttt 60

attcaattac tcgaaaacca tccgcaattt gaagtcgcgg cgcttgcggc atctgatcgt 120

tcgcagggaa aaagttatgt caacgcctgc acctggcgtt tacccggcga gatgccggaa 180

gcggtaaaga acattattgt ccaacctcct tcaccgcccc tcaattgcga tttcgtcttt 240

tccagtttgc ccggggagat tgccaggacg gccgaagagg attttgcccg ggccggttac 300

ccggtgatca gcaattcgtc atcgcaccgc atgggattag acattccttt gctgatccca 360

gaagtgaatc ctgaccacct cgaattgatc gatgctcagc gcacaaatca cgaatacaat 420

cgtggcttca ttgtaacgaa tcccaattgt tccgcgatcg ccatcgtaat cgcgctggct 480

ccgctgcatg agaagtttgg ggtcagctcc tgcgtcgtga cgacaatgca ggccctctcc 540

ggcgccggct atcccggtgt accgtctctc gacgcaaccg acaacgtgat tccttttatc 600

ggcggcgaag atgagaaggt cgagtgcgaa acgcggaaga ttctcggcgt ggtgacgcaa 660

ggcgcagtcg ttgatgccga catgacgata agtgcgcagt gcaatcgcgt gaacgttacg 720

gatggacata tggcgtcgat tcgcgtgaag ctggcccgtt ccgccacgct ggaagagatt 780

cgcgaagcgt tggtttcctt taccgctgaa ccgcaaaggt tgaaactgca cacagccccg 840

gctaaaccca tcgtggtccg cgatgagatt gatcggcccc aaccgcgact cgatcgtgat 900

gcaggccgtg ggatgagtat tacggttggt cgcattatgc cggacagcgt gcttgattat 960

cgcttcatgg cgctcggtca caacacgatt cgcggcgcgg ccggcgcggc gattctaaat 1020

gccgagttgc tggtggcgcg aggatattta tga 1053

<210> 106

<211> 350

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_12 sequence from an unknown bacterial species from an environmental sample

<400> 106

Met Ser Lys Lys Phe Arg Val Gly Ile Leu Gly Ala Thr Gly Val Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Val

20 25 30

Ala Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Ser Tyr Val Asn

35 40 45

Ala Cys Thr Trp Arg Leu Pro Gly Glu Met Pro Glu Ala Val Lys Asn

50 55 60

Ile Ile Val Gln Pro Pro Ser Pro Pro Leu Asn Cys Asp Phe Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Glu Ile Ala Arg Thr Ala Glu Glu Asp Phe Ala

85 90 95

Arg Ala Gly Tyr Pro Val Ile Ser Asn Ser Ser Ser His Arg Met Gly

100 105 110

Leu Asp Ile Pro Leu Leu Ile Pro Glu Val Asn Pro Asp His Leu Glu

115 120 125

Leu Ile Asp Ala Gln Arg Thr Asn His Glu Tyr Asn Arg Gly Phe Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Ala Ile Ala Ile Val Ile Ala Leu Ala

145 150 155 160

Pro Leu His Glu Lys Phe Gly Val Ser Ser Cys Val Val Thr Thr Met

165 170 175

Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Pro Ser Leu Asp Ala

180 185 190

Thr Asp Asn Val Ile Pro Phe Ile Gly Gly Glu Asp Glu Lys Val Glu

195 200 205

Cys Glu Thr Arg Lys Ile Leu Gly Val Val Thr Gln Gly Ala Val Val

210 215 220

Asp Ala Asp Met Thr Ile Ser Ala Gln Cys Asn Arg Val Asn Val Thr

225 230 235 240

Asp Gly His Met Ala Ser Ile Arg Val Lys Leu Ala Arg Ser Ala Thr

245 250 255

Leu Glu Glu Ile Arg Glu Ala Leu Val Ser Phe Thr Ala Glu Pro Gln

260 265 270

Arg Leu Lys Leu His Thr Ala Pro Ala Lys Pro Ile Val Val Arg Asp

275 280 285

Glu Ile Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Arg Gly

290 295 300

Met Ser Ile Thr Val Gly Arg Ile Met Pro Asp Ser Val Leu Asp Tyr

305 310 315 320

Arg Phe Met Ala Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Val Ala Arg Gly Tyr Leu

340 345 350

<210> 107

<211> 1056

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_13 sequence from an unknown bacterial species from an environmental sample

<400> 107

atgtccaaca aatttcgagt cggcattctc ggtgcaaccg gcatggtcgg tcaacgattc 60

gtccaactgc tggagaatca tcccgttttt gaaatcacgg cgctggcagc gtctgatcgt 120

tcgcaaggaa agacttacgc cgcagcctgc acctggcgtt tgccgggcga gatgccggct 180

cgcgtgaaac agatcgtcgt gcagccgccc gccccgccac tcgactgcga tttcgttttc 240

tccagtttgc cgggcgagat cgcgactgat gccgagccga aattcgcgcg aatggatttt 300

ccagtgatca gtaattcttc atcgcatcgc atggcgcctg atgttcctct gctgattccg 360

gaagttaatc ctgaacacgt cgaactgatc gacgcgcagc gcatcaaccg cgaatacaat 420

cgcgggttta tcgtcaccaa cccaaattgc tcggcaattg taatcgtgat ggccctcgca 480

ccgttgcacg cgaagtttgg tgttgaatcg tgcatcgtca ccacgatgca agcgctttcc 540

ggggccggct atccgggggt ggcttcgctg gacgccaccg acaatgtgat tccgttcatc 600

agcggcgagg acgaaaaggt cgagagcgaa acgcgaaaaa ttctcggcgt tgtcagccaa 660

ggtgagatca tcgatgccga catgaaagtc agcgcccagt gcaaccgcgt gaatgtgacc 720

gacggtcacc tggcttcgat tcgggtgaaa ctggcgcggc cggcatctgc gaacgaattt 780

cgcgatgcgc tcgcatcgtt caccgccgag ccccaaaagc tgaaactgca cacggcgcct 840

gcgaacccgc tactcatccg cgatgaaatc gatcggccac agccgcgcct tgatcgtgat 900

gccgaaaatg gaatgagtgt aaccgtgggg aggattgctg aagataacgt gcttgattat 960

cgcttcgtag cgctgggtca caacacgatt cgcggcgccg ccggagcggc gattctgaat 1020

gcagagttgc tggtggcgaa aggatatctc gcgtga 1056

<210> 108

<211> 351

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_13 sequence from an unknown bacterial species from an environmental sample

<400> 108

Met Ser Asn Lys Phe Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Val Gln Leu Leu Glu Asn His Pro Val Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Tyr Ala Ala

35 40 45

Ala Cys Thr Trp Arg Leu Pro Gly Glu Met Pro Ala Arg Val Lys Gln

50 55 60

Ile Val Val Gln Pro Pro Ala Pro Pro Leu Asp Cys Asp Phe Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Glu Ile Ala Thr Asp Ala Glu Pro Lys Phe Ala

85 90 95

Arg Met Asp Phe Pro Val Ile Ser Asn Ser Ser Ser His Arg Met Ala

100 105 110

Pro Asp Val Pro Leu Leu Ile Pro Glu Val Asn Pro Glu His Val Glu

115 120 125

Leu Ile Asp Ala Gln Arg Ile Asn Arg Glu Tyr Asn Arg Gly Phe Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Ala Ile Val Ile Val Met Ala Leu Ala

145 150 155 160

Pro Leu His Ala Lys Phe Gly Val Glu Ser Cys Ile Val Thr Thr Met

165 170 175

Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Asp Ala

180 185 190

Thr Asp Asn Val Ile Pro Phe Ile Ser Gly Glu Asp Glu Lys Val Glu

195 200 205

Ser Glu Thr Arg Lys Ile Leu Gly Val Val Ser Gln Gly Glu Ile Ile

210 215 220

Asp Ala Asp Met Lys Val Ser Ala Gln Cys Asn Arg Val Asn Val Thr

225 230 235 240

Asp Gly His Leu Ala Ser Ile Arg Val Lys Leu Ala Arg Pro Ala Ser

245 250 255

Ala Asn Glu Phe Arg Asp Ala Leu Ala Ser Phe Thr Ala Glu Pro Gln

260 265 270

Lys Leu Lys Leu His Thr Ala Pro Ala Asn Pro Leu Leu Ile Arg Asp

275 280 285

Glu Ile Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Glu Asn Gly

290 295 300

Met Ser Val Thr Val Gly Arg Ile Ala Glu Asp Asn Val Leu Asp Tyr

305 310 315 320

Arg Phe Val Ala Leu Gly His Asn Thr Ile Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Val Ala Lys Gly Tyr Leu Ala

340 345 350

<210> 109

<211> 1092

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_14 sequence from an unknown bacterial species from an environmental sample

<400> 109

atgaaggccc ctaggcctga caaacgtaaa ccaagtatga ccaataaact tcgagttgga 60

attctcggcg caacgggaat ggttgggcag cgctttatcc aacttttgga aaaccatccg 120

caatttcaga ttacggcact ggcagcgtcg gatcgttcgc aaggtaaaac gttctcagaa 180

gcttgtacct ggcgtttgcc cggagagatg ccctcgtttg ttaagtcgat gccggttcac 240

gccccaaacc ccccgctgga ctgtgaattg gtgttctcca gtctgccggg tgagatcgca 300

cgcgaaagcg aacaaagttt tgtcaacgcc ggatttcccg tcatcagcaa ctcctccgct 360

tttcggatgg acgccaacgt tcctttactg attccggaag ttaatcctga gcatctgtcg 420

ttgcttgaat tgcaacaaaa agaaagcaac ggcaagcgcg gatatattgt taccaaccca 480

aattgttcga cgatcatgtt ggcgcttgca ctcgccccgt tgcatgcgcg cttcggtgtg 540

caaaacgttg tcgccaccac cttgcaggct ttatcaggcg ccggataccc gggcgttgcg 600

tcgcttgcca ttagtgacaa cgtgttgccg tttatcgaag gcgaggagca gaagatagag 660

caggaaacgt tgaagattct cggcagcgtc gacggggaaa caattcggca cgcagcgatc 720

agcgtgagcg cgcaatgcac gcgcgtgaac gtctcagacg gccacatggc cgcggtccgc 780

gttaagctga ttgagccggc gacaaaagat gaggttatag atgcactcgc ttcgtttacc 840

gcgctgccgc aaaaactgaa tctccactcc gcgccgccgc agccaatcat cgtgcgcaat 900

gagtccgacc gtccgcagcc acggcttgat cgagatgcgg gcaaaggaat gagcattacg 960

attggacgag tggaccatga ccacgtgatg gactaccgct ttttttcttt gagtcacaac 1020

acagtccgag gcgctgccgg cgcggcaatc cttaacgctg aattgcttct ggcgatgggg 1080

aaaataagat ga 1092

<210> 110

<211> 363

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_14 sequence from an unknown bacterial species from an environmental sample

<400> 110

Met Lys Ala Pro Arg Pro Asp Lys Arg Lys Pro Ser Met Thr Asn Lys

1 5 10 15

Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val Gly Gln Arg Phe

20 25 30

Ile Gln Leu Leu Glu Asn His Pro Gln Phe Gln Ile Thr Ala Leu Ala

35 40 45

Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Ser Glu Ala Cys Thr Trp

50 55 60

Arg Leu Pro Gly Glu Met Pro Ser Phe Val Lys Ser Met Pro Val His

65 70 75 80

Ala Pro Asn Pro Pro Leu Asp Cys Glu Leu Val Phe Ser Ser Leu Pro

85 90 95

Gly Glu Ile Ala Arg Glu Ser Glu Gln Ser Phe Val Asn Ala Gly Phe

100 105 110

Pro Val Ile Ser Asn Ser Ser Ala Phe Arg Met Asp Ala Asn Val Pro

115 120 125

Leu Leu Ile Pro Glu Val Asn Pro Glu His Leu Ser Leu Leu Glu Leu

130 135 140

Gln Gln Lys Glu Ser Asn Gly Lys Arg Gly Tyr Ile Val Thr Asn Pro

145 150 155 160

Asn Cys Ser Thr Ile Met Leu Ala Leu Ala Leu Ala Pro Leu His Ala

165 170 175

Arg Phe Gly Val Gln Asn Val Val Ala Thr Thr Leu Gln Ala Leu Ser

180 185 190

Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Ala Ile Ser Asp Asn Val

195 200 205

Leu Pro Phe Ile Glu Gly Glu Glu Gln Lys Ile Glu Gln Glu Thr Leu

210 215 220

Lys Ile Leu Gly Ser Val Asp Gly Glu Thr Ile Arg His Ala Ala Ile

225 230 235 240

Ser Val Ser Ala Gln Cys Thr Arg Val Asn Val Ser Asp Gly His Met

245 250 255

Ala Ala Val Arg Val Lys Leu Ile Glu Pro Ala Thr Lys Asp Glu Val

260 265 270

Ile Asp Ala Leu Ala Ser Phe Thr Ala Leu Pro Gln Lys Leu Asn Leu

275 280 285

His Ser Ala Pro Pro Gln Pro Ile Ile Val Arg Asn Glu Ser Asp Arg

290 295 300

Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Lys Gly Met Ser Ile Thr

305 310 315 320

Ile Gly Arg Val Asp His Asp His Val Met Asp Tyr Arg Phe Phe Ser

325 330 335

Leu Ser His Asn Thr Val Arg Gly Ala Ala Gly Ala Ala Ile Leu Asn

340 345 350

Ala Glu Leu Leu Leu Ala Met Gly Lys Ile Arg

355 360

<210> 111

<211> 1056

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_15 sequence from an unknown bacterial species from an environmental sample

<400> 111

atgacaagta aacttcgagt tggaattctc ggtgcgacgg gaatggttgg gcagcgcttt 60

atccaacttt tggaaaacca tccgcaattt caaattacgg cactggcggc gtcagatcgt 120

tcgcaaggca aaacatttgc agaagcgtgt acctggcgtt tgcccggaga gatgccctcg 180

tttgtgaagt cgatgccggt tcacgcgcca aagccgccgc tggactgtga attggtgttt 240

tccagtctgc cgggcgaaat cgcacgcgaa agcgaacaaa gttttgtcaa cgccggattt 300

cctgtcatca gcaactcctc cgcttttcgg atggacgcca acgttccttt actgattccg 360

gaagttaatc cggagcatct gtcgttgctt gaattgcaac aaaaagaagg caacggcaag 420

cgcggttaca ttgttaccaa cccaaactgt tcgacgatca tgttggcgct ggcactcgcc 480

ccgttgcatg cgcgcttcgg tgtgcgaaac gttgtcgcca ccaccttgca ggctttatcc 540

ggggccggat tcccgggcgt tgcgtcgctt gctattagtg acaacgtgtt gccgtttatc 600

gaaggcgagg agcagaagat agagcaggag acgttgaaga ttctcggcag cgtcgacggg 660

gaaacaattc gtcacgcagc gatcagtgtg agcgcgcaat gtacgcgcgt gaacgtttca 720

gacggccata tggccgctgt gcgcgtcaag ctggataagc cggcgacaaa agatgaggtt 780

atcgatgcgc tcgcgtcgtt taccgcgctg cctcaaaaat tgaatctcca ctcggcgccg 840

ccgcagccaa tcatcgtgcg taatgagtcc gaccgcccgc agcctcggct tgatcgagat 900

gcgggcaaag gaatgagcat tacgattgga cgactggacc atgaccacgt gatggactac 960

cgcttttttt ctttgagtca caacacagtc cgaggcgctg ccggtgcggc aatccttaac 1020

gctgaattgc ttctggcgat ggggaaaata ggatga 1056

<210> 112

<211> 351

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_15 sequence from an unknown bacterial species from an environmental sample

<400> 112

Met Thr Ser Lys Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Gln Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Ala Glu

35 40 45

Ala Cys Thr Trp Arg Leu Pro Gly Glu Met Pro Ser Phe Val Lys Ser

50 55 60

Met Pro Val His Ala Pro Lys Pro Pro Leu Asp Cys Glu Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Glu Ile Ala Arg Glu Ser Glu Gln Ser Phe Val

85 90 95

Asn Ala Gly Phe Pro Val Ile Ser Asn Ser Ser Ala Phe Arg Met Asp

100 105 110

Ala Asn Val Pro Leu Leu Ile Pro Glu Val Asn Pro Glu His Leu Ser

115 120 125

Leu Leu Glu Leu Gln Gln Lys Glu Gly Asn Gly Lys Arg Gly Tyr Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Thr Ile Met Leu Ala Leu Ala Leu Ala

145 150 155 160

Pro Leu His Ala Arg Phe Gly Val Arg Asn Val Val Ala Thr Thr Leu

165 170 175

Gln Ala Leu Ser Gly Ala Gly Phe Pro Gly Val Ala Ser Leu Ala Ile

180 185 190

Ser Asp Asn Val Leu Pro Phe Ile Glu Gly Glu Glu Gln Lys Ile Glu

195 200 205

Gln Glu Thr Leu Lys Ile Leu Gly Ser Val Asp Gly Glu Thr Ile Arg

210 215 220

His Ala Ala Ile Ser Val Ser Ala Gln Cys Thr Arg Val Asn Val Ser

225 230 235 240

Asp Gly His Met Ala Ala Val Arg Val Lys Leu Asp Lys Pro Ala Thr

245 250 255

Lys Asp Glu Val Ile Asp Ala Leu Ala Ser Phe Thr Ala Leu Pro Gln

260 265 270

Lys Leu Asn Leu His Ser Ala Pro Pro Gln Pro Ile Ile Val Arg Asn

275 280 285

Glu Ser Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Lys Gly

290 295 300

Met Ser Ile Thr Ile Gly Arg Leu Asp His Asp His Val Met Asp Tyr

305 310 315 320

Arg Phe Phe Ser Leu Ser His Asn Thr Val Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Leu Ala Met Gly Lys Ile Gly

340 345 350

<210> 113

<211> 1062

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_16 sequence from an unknown bacterial species from an environmental sample

<400> 113

atgactgaaa aacttcgcgt aggaatactt ggcgccactg gaatggtcgg tcagcgcttc 60

attcaactgc tggagaatca tccgcagttt gaaattacgg cgctcgctgc ttcggatcgt 120

tcgcagggca aaactttcga agaagcttgc acctggcgcc tggccggtga aatgcctgcg 180

ctcgttaagt caatgcaggt tcatgcgccc aagccgccgc tcgaatgtca actggtgttc 240

tccagcttgc cgggagacat tgcgcgtgat tgcgaaggaa gttttgtagc ggccggcgtg 300

cctgtgatta gcaactcgtc tgccttccgc atggaccaga atgttcccct gctgatcccg 360

gaagtaaatc ccgagcatct gtcgctgttg gatttgcaac aaagagacag caacggaaaa 420

tccggatccg gattcatagt tactaatcca aattgttcga ccatcatgtt ggcaatgtcc 480

ctcgcgccgt tgcataaacg cttcggtgta aagagtgtcg tggcgacgac tatgcaggct 540

ttgtccggcg caggatatcc gggggttgca tcactggcca tcagtgacaa cgtcctcccg 600

tacatcgacg gcgaagagga aaagatcgaa caggagactt tgaagattct cggtcgttta 660

gacggcgggc aaatacatga tgcgccaatg aatgtcagcg ctcaatgcaa tcgagtgaat 720

gtctctgacg gccacatggc agcggttcgg gtgaagctgg agaaggaggc gacgaaggag 780

gaagtcagcg atgcgctggc gtctttcaca gcactgccac aggaacttgg tctccattca 840

gctcccgaac ggccgatcat tgttcgtaat gaacctgacc ggccccagcc gcgtttggat 900

cgggacgcgg gcaacggaat gagcgtcacg attggacgcc tgcaagaaga tcgcgtgctc 960

gactatcgct tcgtttcttt aagtcacaac accataagag gtgcggccgg cgctgctatt 1020

ctcaacgcgg aactacttat cgcctcagga ttattgttat ga 1062

<210> 114

<211> 353

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_16 sequence from an unknown bacterial species from an environmental sample

<400> 114

Met Thr Glu Lys Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Glu Glu

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro Ala Leu Val Lys Ser

50 55 60

Met Gln Val His Ala Pro Lys Pro Pro Leu Glu Cys Gln Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Asp Ile Ala Arg Asp Cys Glu Gly Ser Phe Val

85 90 95

Ala Ala Gly Val Pro Val Ile Ser Asn Ser Ser Ala Phe Arg Met Asp

100 105 110

Gln Asn Val Pro Leu Leu Ile Pro Glu Val Asn Pro Glu His Leu Ser

115 120 125

Leu Leu Asp Leu Gln Gln Arg Asp Ser Asn Gly Lys Ser Gly Ser Gly

130 135 140

Phe Ile Val Thr Asn Pro Asn Cys Ser Thr Ile Met Leu Ala Met Ser

145 150 155 160

Leu Ala Pro Leu His Lys Arg Phe Gly Val Lys Ser Val Val Ala Thr

165 170 175

Thr Met Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu

180 185 190

Ala Ile Ser Asp Asn Val Leu Pro Tyr Ile Asp Gly Glu Glu Glu Lys

195 200 205

Ile Glu Gln Glu Thr Leu Lys Ile Leu Gly Arg Leu Asp Gly Gly Gln

210 215 220

Ile His Asp Ala Pro Met Asn Val Ser Ala Gln Cys Asn Arg Val Asn

225 230 235 240

Val Ser Asp Gly His Met Ala Ala Val Arg Val Lys Leu Glu Lys Glu

245 250 255

Ala Thr Lys Glu Glu Val Ser Asp Ala Leu Ala Ser Phe Thr Ala Leu

260 265 270

Pro Gln Glu Leu Gly Leu His Ser Ala Pro Glu Arg Pro Ile Ile Val

275 280 285

Arg Asn Glu Pro Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly

290 295 300

Asn Gly Met Ser Val Thr Ile Gly Arg Leu Gln Glu Asp Arg Val Leu

305 310 315 320

Asp Tyr Arg Phe Val Ser Leu Ser His Asn Thr Ile Arg Gly Ala Ala

325 330 335

Gly Ala Ala Ile Leu Asn Ala Glu Leu Leu Ile Ala Ser Gly Leu Leu

340 345 350

Leu

<210> 115

<211> 1056

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_17 sequence from an unknown bacterial species from an environmental sample

<400> 115

atgaccaaga agtttcgcgt aggaatactc ggtgcgaccg gcatggttgg gcagcgcttt 60

atccagttgt tggaggacca tccgcaattt gagataacgg cgctggccgc gtcggatcgt 120

tcgcagggca agacttttca ggaggcatgc acctggcgtt tggccgggga aatgcccacc 180

tttgtgaaat cgatgaaggt tgcggcgccg caaccgcccc ttgattgcga gttgatcttt 240

tccagtttgc ctggtgacat cgcgcgtcgg agtgaagacg cctttgcccg ggctggtttt 300

ccagtaatta gtaattcttc agcttttcgg atggaccgcg acgtaccttt actgattccg 360

gaagtaaacc acgagcatct tgctctgctt gatgtgcaac gaaagcagcg aggcgggcaa 420

cagggataca tcgtcacgaa tccaaactgc tccacaatca tgctggccct ggcgttggcg 480

ccgctacacg caaaatttgg tgtgacgagt gtgatcgcta ctaccatgca ggcgttgtca 540

ggcgcaggtt atcccggtgt tgcttcactt gccatcagcg ataacgtttt gccattcatc 600

gacggcgagg aagaaaagat cgaacaggag acattgaaga tcctgggcaa ggtcaatggc 660

gcgagcattg aagccgcgcc aatgaatgta agcgcacaat gtcaccgcgt gaatgtttca 720

gacggccaca tggcggcggt gcgagtgaaa cttcaccagc ccgcgaccat ccatgaacta 780

agtgctgcgc tcagttcttt tagcgcactg ccgcagaagc taaagcttca ttcagcgcct 840

gagcacccga ttattgtgcg cgaagaagtg gatcgtccgc agccgcggct ggatcgggat 900

gcgggaaatg gaatgagcgt caccgtcggc cgtttacaac ccgacaacgt gtttgactac 960

cggttcgtta ctcttagcca caacaccata cgcggcgcag ctggcgctgc aattctcaat 1020

gcggaattat tgatcgctag cggaaagcta acatga 1056

<210> 116

<211> 351

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_17 sequence from an unknown bacterial species from an environmental sample

<400> 116

Met Thr Lys Lys Phe Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asp His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Gln Glu

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro Thr Phe Val Lys Ser

50 55 60

Met Lys Val Ala Ala Pro Gln Pro Pro Leu Asp Cys Glu Leu Ile Phe

65 70 75 80

Ser Ser Leu Pro Gly Asp Ile Ala Arg Arg Ser Glu Asp Ala Phe Ala

85 90 95

Arg Ala Gly Phe Pro Val Ile Ser Asn Ser Ser Ala Phe Arg Met Asp

100 105 110

Arg Asp Val Pro Leu Leu Ile Pro Glu Val Asn His Glu His Leu Ala

115 120 125

Leu Leu Asp Val Gln Arg Lys Gln Arg Gly Gly Gln Gln Gly Tyr Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Thr Ile Met Leu Ala Leu Ala Leu Ala

145 150 155 160

Pro Leu His Ala Lys Phe Gly Val Thr Ser Val Ile Ala Thr Thr Met

165 170 175

Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Ala Ile

180 185 190

Ser Asp Asn Val Leu Pro Phe Ile Asp Gly Glu Glu Glu Lys Ile Glu

195 200 205

Gln Glu Thr Leu Lys Ile Leu Gly Lys Val Asn Gly Ala Ser Ile Glu

210 215 220

Ala Ala Pro Met Asn Val Ser Ala Gln Cys His Arg Val Asn Val Ser

225 230 235 240

Asp Gly His Met Ala Ala Val Arg Val Lys Leu His Gln Pro Ala Thr

245 250 255

Ile His Glu Leu Ser Ala Ala Leu Ser Ser Phe Ser Ala Leu Pro Gln

260 265 270

Lys Leu Lys Leu His Ser Ala Pro Glu His Pro Ile Ile Val Arg Glu

275 280 285

Glu Val Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Asn Gly

290 295 300

Met Ser Val Thr Val Gly Arg Leu Gln Pro Asp Asn Val Phe Asp Tyr

305 310 315 320

Arg Phe Val Thr Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Ile Ala Ser Gly Lys Leu Thr

340 345 350

<210> 117

<211> 1068

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_18 sequence from an unknown bacterial species from an environmental sample

<400> 117

atgtccaaag tccaaagtca aaaaaaactt cgagtcggaa tactgggcgc gactgggatg 60

gttggtcagc gcttcattca gttactggag aaccatccgc agtttgagat aaccgcgctt 120

gctgcttcgg atcgttcaca aggaaaaact tttcaggaag catgcacgtg gcgactcgca 180

ggcgaaatgc cggcgaatgt ccgatcgatg aaagtggccg ctccagaagc gccgctcgat 240

tgcgacctcg tcttctcgag tttgccgggc gacatcgcac gaaggagcga agcatcgttc 300

gcgcgcgcag gttttccagt aattagtaat tcttcagctt ttcgtatgga ccaggatgtg 360

ccgttgctga ttccggaagt gaatcacgag catctctctt tgattgagac acaactaaga 420

aaccacaacg ggcagcaagg ttacgtggtc acaaatccaa attgctcgac aatcatgctc 480

gcgctcgcgc tggcgccact tcatgaggcc tttggtgtaa ccagtgtaat tgcaaccacg 540

atgcaggcgc tgtcaggagc gggttatccg ggagtggcat cgctcgcgat tagcgacaac 600

gttctaccgt tcattgacgg tgaagaagaa aagatcgaac aggagacacg gaagattctg 660

gggcgattta accaggacac ggttgacaac gcgccgatga acgtcagcgc ccagtgcaat 720

cgcgttaacg tttcggatgg tcacatggcg gcggtgcgag tgaaactgca agagcctgcg 780

aagctcgaag aagtgtctga agccctcgct tcgtttaccg cgctgccgca agagttgaaa 840

ctccactccg ctcctgagca gccgatcatc ttgcgacagg aaacggatcg accgcagccg 900

cgcctcgatc gtgatgcagg aaacggaatg agtgtgacta ttggtcgcct gcagccagac 960

aacgtacttg attatcggtt cgttgcgctg agtcacaaca cgattcgtgg cgctgcgggt 1020

gctgccatcc tgaatgctga acttatgatt gcaatgaaga ggttgtaa 1068

<210> 118

<211> 355

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_18 sequence from an unknown bacterial species from an environmental sample

<400> 118

Met Ser Lys Val Gln Ser Gln Lys Lys Leu Arg Val Gly Ile Leu Gly

1 5 10 15

Ala Thr Gly Met Val Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His

20 25 30

Pro Gln Phe Glu Ile Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly

35 40 45

Lys Thr Phe Gln Glu Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro

50 55 60

Ala Asn Val Arg Ser Met Lys Val Ala Ala Pro Glu Ala Pro Leu Asp

65 70 75 80

Cys Asp Leu Val Phe Ser Ser Leu Pro Gly Asp Ile Ala Arg Arg Ser

85 90 95

Glu Ala Ser Phe Ala Arg Ala Gly Phe Pro Val Ile Ser Asn Ser Ser

100 105 110

Ala Phe Arg Met Asp Gln Asp Val Pro Leu Leu Ile Pro Glu Val Asn

115 120 125

His Glu His Leu Ser Leu Ile Glu Thr Gln Leu Arg Asn His Asn Gly

130 135 140

Gln Gln Gly Tyr Val Val Thr Asn Pro Asn Cys Ser Thr Ile Met Leu

145 150 155 160

Ala Leu Ala Leu Ala Pro Leu His Glu Ala Phe Gly Val Thr Ser Val

165 170 175

Ile Ala Thr Thr Met Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val

180 185 190

Ala Ser Leu Ala Ile Ser Asp Asn Val Leu Pro Phe Ile Asp Gly Glu

195 200 205

Glu Glu Lys Ile Glu Gln Glu Thr Arg Lys Ile Leu Gly Arg Phe Asn

210 215 220

Gln Asp Thr Val Asp Asn Ala Pro Met Asn Val Ser Ala Gln Cys Asn

225 230 235 240

Arg Val Asn Val Ser Asp Gly His Met Ala Ala Val Arg Val Lys Leu

245 250 255

Gln Glu Pro Ala Lys Leu Glu Glu Val Ser Glu Ala Leu Ala Ser Phe

260 265 270

Thr Ala Leu Pro Gln Glu Leu Lys Leu His Ser Ala Pro Glu Gln Pro

275 280 285

Ile Ile Leu Arg Gln Glu Thr Asp Arg Pro Gln Pro Arg Leu Asp Arg

290 295 300

Asp Ala Gly Asn Gly Met Ser Val Thr Ile Gly Arg Leu Gln Pro Asp

305 310 315 320

Asn Val Leu Asp Tyr Arg Phe Val Ala Leu Ser His Asn Thr Ile Arg

325 330 335

Gly Ala Ala Gly Ala Ala Ile Leu Asn Ala Glu Leu Met Ile Ala Met

340 345 350

Lys Arg Leu

355

<210> 119

<211> 1044

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_19 sequence from an unknown bacterial species from an environmental sample

<400> 119

atgactagaa aacttcgtgt gggaatactt ggcgctaccg ggatggtggg gcaacgcttt 60

attcaactgc tggagaatca tccccagttt gagatcacag cattggctgc atctgatcga 120

tcgcagggga aggcttttca ggaggcttgt acctggcgtc ttgcaggaga gatgcctgag 180

tttgtgaagt caatgccgat cgcggcgccg caacctccgc tggattgcga attagttttc 240

tcgagtttgc cgggcgacat cgcgcgtcaa agtgaaggtg cctttgccga ggcgggcttt 300

cctgtgatca gcaactcatc agcgtatcgg atggaccagc acgttccctt actaattccc 360

gaggtgaacc atcaacatct cgcgctgctc gagtcacaac agcagaaggg cttcatcgtc 420

actaacccaa actgttcgac gatcatgctg gcgttggccc tcgcgccgct acatgcgagg 480

tttcgcgtga ccagcgtcat cgccacaact ctccaggcat tatcgggcgc cggttatccc 540

ggggttccgt cgctggccat tagtgacaat gttctgccat tcatcgatgg tgaagaggaa 600

aagatcgaga aggaaacact caagattctc gggccaatcg aaaaaggaca tctcatcgac 660

gcgccgatga aggtgagcgc acagtgtcat cgggtgaatg tctctgacgg acacatggcc 720

gcagtgcgag tgaagttgga taagttgact acgattgaag aagtgagtga agcatttgct 780

tcctttagtt cgctgccaca ggaactaaaa ctgcactcag caccagagaa gccaatcgtt 840

gtgctgcaag aacccgatcg tcctcagccg cgcctcgatc gagatgccgg aaaggggatg 900

agcgttacag ttggtcgctt gcgagacgat aacgtgcttg actaccgatt cgtggcgctc 960

agccacaaca cgatcagagg cgccgcggga gctgcgattc tcaatgcgga gctgctgatt 1020

gcgagtggat acttagggaa ataa 1044

<210> 120

<211> 347

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_19 sequence from an unknown bacterial species from an environmental sample

<400> 120

Met Thr Arg Lys Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Ala Phe Gln Glu

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro Glu Phe Val Lys Ser

50 55 60

Met Pro Ile Ala Ala Pro Gln Pro Pro Leu Asp Cys Glu Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Asp Ile Ala Arg Gln Ser Glu Gly Ala Phe Ala

85 90 95

Glu Ala Gly Phe Pro Val Ile Ser Asn Ser Ser Ala Tyr Arg Met Asp

100 105 110

Gln His Val Pro Leu Leu Ile Pro Glu Val Asn His Gln His Leu Ala

115 120 125

Leu Leu Glu Ser Gln Gln Gln Lys Gly Phe Ile Val Thr Asn Pro Asn

130 135 140

Cys Ser Thr Ile Met Leu Ala Leu Ala Leu Ala Pro Leu His Ala Arg

145 150 155 160

Phe Arg Val Thr Ser Val Ile Ala Thr Thr Leu Gln Ala Leu Ser Gly

165 170 175

Ala Gly Tyr Pro Gly Val Pro Ser Leu Ala Ile Ser Asp Asn Val Leu

180 185 190

Pro Phe Ile Asp Gly Glu Glu Glu Lys Ile Glu Lys Glu Thr Leu Lys

195 200 205

Ile Leu Gly Pro Ile Glu Lys Gly His Leu Ile Asp Ala Pro Met Lys

210 215 220

Val Ser Ala Gln Cys His Arg Val Asn Val Ser Asp Gly His Met Ala

225 230 235 240

Ala Val Arg Val Lys Leu Asp Lys Leu Thr Thr Ile Glu Glu Val Ser

245 250 255

Glu Ala Phe Ala Ser Phe Ser Ser Leu Pro Gln Glu Leu Lys Leu His

260 265 270

Ser Ala Pro Glu Lys Pro Ile Val Val Leu Gln Glu Pro Asp Arg Pro

275 280 285

Gln Pro Arg Leu Asp Arg Asp Ala Gly Lys Gly Met Ser Val Thr Val

290 295 300

Gly Arg Leu Arg Asp Asp Asn Val Leu Asp Tyr Arg Phe Val Ala Leu

305 310 315 320

Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala Ile Leu Asn Ala

325 330 335

Glu Leu Leu Ile Ala Ser Gly Tyr Leu Gly Lys

340 345

<210> 121

<211> 1056

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_20 sequence from an unknown bacterial species from an environmental sample

<400> 121

atgacgagaa aacggcgagt tggaattttg ggtgcaaccg ggatggtcgg ccagcgtttt 60

attcaattgc tggagaacca tccgcaattt gaaatcaccg cgctggctgc ttcggatcgt 120

tcacaaggca aaacctttca acacgcatgc acgtggcgac tggccggcga catgcctgag 180

tttgtgaaac agatggaggt ccaggcgccg caaccgccgc tcgattgtga tctggtcttc 240

tcgagtttgc ctggcgacat cgctcgcgag agtgaaggaa ggtttgcgga tgcgggttat 300

ccggtgatca gcaactcatc tgcgtaccgg atggacgcgg acgttcctct gttaatacct 360

gaagtgaatc acgcgcatct cgatctgctc aaggttcagc gccggcaacc aaaccggcaa 420

cgcggcttca tcgtcacgaa tccaaattgt tcgacgatca tgctggcgtt ggcactcgca 480

cccttgcacg cgaattttgg cgtcacgagc gcggtggcga cgacgatgca ggctttgtcc 540

ggcgccggtt atccgggcgt cgcatcactc gccatcagcg acaacgtgtt gccgttcatc 600

gaaggtgaag aagaaaaaat cgaacaggag actttaaaga ttctgggcca actcaacggt 660

gaaaggatcg tggaggcttc catgaacgtg agcgcgcaat gtcaccgcgt gaatgtttcc 720

gacggacatc tcgctgcggt tcgcgtaaag ctaaacagac aggcgacaaa agatgagttg 780

gttgaagcgc ttgcttcgtt caagtcgctg cctcaggaat tacaacttca ctcggcgccg 840

gagcacccga tcattgttcg caatgagccc gaccgtccgc agccgcgttt ggatcgagag 900

gcgggcaacg gcatgagcgt caccatcgga cggttacagg atgacaacgt gctcgactat 960

cgctttgtcg ctctcagcca caacacaatt cgcggcgcag caggcgccgc cattctcaac 1020

gctgaactcc tcgtcgcttc gggattattg acatga 1056

<210> 122

<211> 351

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_20 sequence from an unknown bacterial species from an environmental sample

<400> 122

Met Thr Arg Lys Arg Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Gln His

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Asp Met Pro Glu Phe Val Lys Gln

50 55 60

Met Glu Val Gln Ala Pro Gln Pro Pro Leu Asp Cys Asp Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Asp Ile Ala Arg Glu Ser Glu Gly Arg Phe Ala

85 90 95

Asp Ala Gly Tyr Pro Val Ile Ser Asn Ser Ser Ala Tyr Arg Met Asp

100 105 110

Ala Asp Val Pro Leu Leu Ile Pro Glu Val Asn His Ala His Leu Asp

115 120 125

Leu Leu Lys Val Gln Arg Arg Gln Pro Asn Arg Gln Arg Gly Phe Ile

130 135 140

Val Thr Asn Pro Asn Cys Ser Thr Ile Met Leu Ala Leu Ala Leu Ala

145 150 155 160

Pro Leu His Ala Asn Phe Gly Val Thr Ser Ala Val Ala Thr Thr Met

165 170 175

Gln Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Ala Ile

180 185 190

Ser Asp Asn Val Leu Pro Phe Ile Glu Gly Glu Glu Glu Lys Ile Glu

195 200 205

Gln Glu Thr Leu Lys Ile Leu Gly Gln Leu Asn Gly Glu Arg Ile Val

210 215 220

Glu Ala Ser Met Asn Val Ser Ala Gln Cys His Arg Val Asn Val Ser

225 230 235 240

Asp Gly His Leu Ala Ala Val Arg Val Lys Leu Asn Arg Gln Ala Thr

245 250 255

Lys Asp Glu Leu Val Glu Ala Leu Ala Ser Phe Lys Ser Leu Pro Gln

260 265 270

Glu Leu Gln Leu His Ser Ala Pro Glu His Pro Ile Ile Val Arg Asn

275 280 285

Glu Pro Asp Arg Pro Gln Pro Arg Leu Asp Arg Glu Ala Gly Asn Gly

290 295 300

Met Ser Val Thr Ile Gly Arg Leu Gln Asp Asp Asn Val Leu Asp Tyr

305 310 315 320

Arg Phe Val Ala Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala

325 330 335

Ala Ile Leu Asn Ala Glu Leu Leu Val Ala Ser Gly Leu Leu Thr

340 345 350

<210> 123

<211> 1053

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_21 sequence from an unknown bacterial species from an environmental sample

<400> 123

atggctaaac gacgcgtagg aatattgggc gccaccggca tggtgggaca gcgattcatt 60

caactgctcg aggaccaccc gcagtttgaa atcacagcgc tggccgcatc agatcgatct 120

caaggcagga cttttgccga cgcttgtaca tggcgacttc ctggcgagat gccggcgttt 180

gtccgctcaa tgatcgtcca ggcgccggcg ccaccgctcg attgcgaatt ggtgttttcg 240

agtttgccgg gcgacattgc ccgctcgagc gaagggatgt ttgctgatgc cggttatccc 300

gtgattagca actcgtcagc cttccgcatg gatgcggatg ttcctttact gattcccgaa 360

gtcaacaact cgcatctgga tctgctcgcg gtccaacgaa acaagtccaa ccgtcaacgc 420

ggctttatcg ttacaaatcc caactgttca acgatcatgt tagcccttgc tctggcgccg 480

ctgcatttca agttcggcgt cagcaacgtc gttgccacta ctctacaggc gctctcaggc 540

gccggatatc ccggagtcgc gtcgctggca atcagtgaca atgttcttcc ttttattgac 600

ggggaggagg agaagatcga gaaagaaacg ctcaaaatcc tgggccacgt tcaaaaaggc 660

accattgcgg aagcctcgat gaatgtgagc gcccaatgcc atcgtgtcaa tgtcactgat 720

ggacatatgg cagcagtgcg ggtgaagttg aatcagttag ctaccactga agatgtcgcg 780

caaacgctgg catcgttccg tgcgttgccc caggaattgc atcttcattc cgcgccggag 840

catccaatcg ttgtgcgcaa tgaacctgat cggccgcagc cgcggcttga tcgagatgcg 900

ggaaacggaa tgagtgtaac gatcggtcga atccaacccg ataatgtact tgactaccgg 960

ttcgttgccc ttagccacaa tacaatccgc ggcgctgccg gcgccgccat tctcaatgcc 1020

gaacttctga tcgcgtccgg aatacttata tga 1053

<210> 124

<211> 350

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_21 sequence from an unknown bacterial species from an environmental sample

<400> 124

Met Ala Lys Arg Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val Gly

1 5 10 15

Gln Arg Phe Ile Gln Leu Leu Glu Asp His Pro Gln Phe Glu Ile Thr

20 25 30

Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Arg Thr Phe Ala Asp Ala

35 40 45

Cys Thr Trp Arg Leu Pro Gly Glu Met Pro Ala Phe Val Arg Ser Met

50 55 60

Ile Val Gln Ala Pro Ala Pro Pro Leu Asp Cys Glu Leu Val Phe Ser

65 70 75 80

Ser Leu Pro Gly Asp Ile Ala Arg Ser Ser Glu Gly Met Phe Ala Asp

85 90 95

Ala Gly Tyr Pro Val Ile Ser Asn Ser Ser Ala Phe Arg Met Asp Ala

100 105 110

Asp Val Pro Leu Leu Ile Pro Glu Val Asn Asn Ser His Leu Asp Leu

115 120 125

Leu Ala Val Gln Arg Asn Lys Ser Asn Arg Gln Arg Gly Phe Ile Val

130 135 140

Thr Asn Pro Asn Cys Ser Thr Ile Met Leu Ala Leu Ala Leu Ala Pro

145 150 155 160

Leu His Phe Lys Phe Gly Val Ser Asn Val Val Ala Thr Thr Leu Gln

165 170 175

Ala Leu Ser Gly Ala Gly Tyr Pro Gly Val Ala Ser Leu Ala Ile Ser

180 185 190

Asp Asn Val Leu Pro Phe Ile Asp Gly Glu Glu Glu Lys Ile Glu Lys

195 200 205

Glu Thr Leu Lys Ile Leu Gly His Val Gln Lys Gly Thr Ile Ala Glu

210 215 220

Ala Ser Met Asn Val Ser Ala Gln Cys His Arg Val Asn Val Thr Asp

225 230 235 240

Gly His Met Ala Ala Val Arg Val Lys Leu Asn Gln Leu Ala Thr Thr

245 250 255

Glu Asp Val Ala Gln Thr Leu Ala Ser Phe Arg Ala Leu Pro Gln Glu

260 265 270

Leu His Leu His Ser Ala Pro Glu His Pro Ile Val Val Arg Asn Glu

275 280 285

Pro Asp Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Asn Gly Met

290 295 300

Ser Val Thr Ile Gly Arg Ile Gln Pro Asp Asn Val Leu Asp Tyr Arg

305 310 315 320

Phe Val Ala Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala

325 330 335

Ile Leu Asn Ala Glu Leu Leu Ile Ala Ser Gly Ile Leu Ile

340 345 350

<210> 125

<211> 1044

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_22 sequence from an unknown bacterial species from an environmental sample

<400> 125

atgactaaaa aacttcgtgt gggaatactc ggtgccacgg gcatggttgg ccagcgcttc 60

attcaactgc tggagaatca cccgcaattc gagattacag cgctggcggc ctctgatcga 120

tcgcagggca agacttttca ggaagcctgc acgtggcgtc tcgctggaga gatgccggaa 180

ttcgtgaagt caatgccgat cgcggcgccg cagccaccgc tcgattgcga actggtattt 240

tcaagtttgc ccggtgaaat cgcacgcgag agtgaaggag cttttgccgc ggcgggattt 300

ccggtcatta gtaactcttc ggcgtatcgc atggacgctg acgttccgct gctgattccc 360

gaagtgaatc atccacacct tgcgctggtc gagttgcaac agcggaaggg cttcatcgtc 420

actaatccaa actgttccac gatcatgttg gcgctggtgc tggcgccgct ccatgcaaag 480

tttcgcgtga ccagcgtagt cgcgacaact ctgcaggcac tatcgggcgc tggttatccg 540

ggagttccct cgctggccat tagtgataat gttttgccgt tcatcgacgg tgaagaggaa 600

aagattgagc aggaaacgct gaagattctt ggctcgatag aagaaggaca tatcatcgac 660

gcccccatac aggtgagtgc gcagtgtcac cgggtaaatg tttcagacgg acacatggcg 720

gcggtgcgcg tgaagctcga tcagttaact acaatcgaag aagtcagtga aacatttgct 780

tccttcacct cgctcccgca ggaactaaaa ctgcactcgg cgccggagca gcctattatt 840

gtggtgcatg aacatgatcg tcctcagccg cgcctggatc gagatgcggg aagcgggatg 900

agcgtcaccg ttggccgagt gcgcgaggat aatgtgctcg actatcgctt cgttgcgcta 960

agccacaaca cgatcagagg cgcagccggc gccgcgattc tcaatgcgga attgctaatt 1020

gcgagtggat acctccagaa gtga 1044

<210> 126

<211> 347

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_22 sequence from an unknown bacterial species from an environmental sample

<400> 126

Met Thr Lys Lys Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Gln Glu

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro Glu Phe Val Lys Ser

50 55 60

Met Pro Ile Ala Ala Pro Gln Pro Pro Leu Asp Cys Glu Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Glu Ile Ala Arg Glu Ser Glu Gly Ala Phe Ala

85 90 95

Ala Ala Gly Phe Pro Val Ile Ser Asn Ser Ser Ala Tyr Arg Met Asp

100 105 110

Ala Asp Val Pro Leu Leu Ile Pro Glu Val Asn His Pro His Leu Ala

115 120 125

Leu Val Glu Leu Gln Gln Arg Lys Gly Phe Ile Val Thr Asn Pro Asn

130 135 140

Cys Ser Thr Ile Met Leu Ala Leu Val Leu Ala Pro Leu His Ala Lys

145 150 155 160

Phe Arg Val Thr Ser Val Val Ala Thr Thr Leu Gln Ala Leu Ser Gly

165 170 175

Ala Gly Tyr Pro Gly Val Pro Ser Leu Ala Ile Ser Asp Asn Val Leu

180 185 190

Pro Phe Ile Asp Gly Glu Glu Glu Lys Ile Glu Gln Glu Thr Leu Lys

195 200 205

Ile Leu Gly Ser Ile Glu Glu Gly His Ile Ile Asp Ala Pro Ile Gln

210 215 220

Val Ser Ala Gln Cys His Arg Val Asn Val Ser Asp Gly His Met Ala

225 230 235 240

Ala Val Arg Val Lys Leu Asp Gln Leu Thr Thr Ile Glu Glu Val Ser

245 250 255

Glu Thr Phe Ala Ser Phe Thr Ser Leu Pro Gln Glu Leu Lys Leu His

260 265 270

Ser Ala Pro Glu Gln Pro Ile Ile Val Val His Glu His Asp Arg Pro

275 280 285

Gln Pro Arg Leu Asp Arg Asp Ala Gly Ser Gly Met Ser Val Thr Val

290 295 300

Gly Arg Val Arg Glu Asp Asn Val Leu Asp Tyr Arg Phe Val Ala Leu

305 310 315 320

Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala Ile Leu Asn Ala

325 330 335

Glu Leu Leu Ile Ala Ser Gly Tyr Leu Gln Lys

340 345

<210> 127

<211> 1050

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_23 sequence from an unknown bacterial species from an environmental sample

<400> 127

atgactaaaa aacttcgtgt gggaatactt ggcgctaccg gcatggttgg ccagcgcttc 60

attcaattgc tggaaaatca tccgcaattt gagattacag cgctggcggc ttctgatcga 120

tcgcagggca agacttttca ggaagcctgc acgtggcgtc tcgctggaga gatgcctgaa 180

ttcgtaaagt cgatgccgat cgcggcgcca cagcccccgc tcgattgcga actagtattt 240

tccagtttgc ccggtgacat cgcacgcgag agtgagggag cttttgccgc ggcgggattt 300

cccgtcatca gcaactcttc ggcttatcgc atggacgctg acgttccact gctgattccc 360

gaagtgaatc atccacacct cgcgctcctc gactcacaac gtcagcagcg gaagggcttc 420

atcgtcacta atccaaactg ttccacaatc atgttggcga tggcgctggc gccgctccat 480

gcaaagtttc gcgtgaccag cgttatcgcg acgacgctcc aggcactatc gggcgccggt 540

tatccgggcg ttccctcgct ggccatcagt gataatgttt tgccgttcat cgatggtgaa 600

gaagaaaaga tcgagcagga aacgttgaag attcttggac caatgaaaga aggacgtata 660

agcgccgcct ctatgcaggt gagcgcgcag tgtcatcggg taaatgtttc agacgggcac 720

atggcggcgg tgcgcgtgaa gctcgatgag ttaactacaa tcgaagaagt ctttgaagca 780

tttgcttcct tcaccgcgct cccgcaggaa ctaaaactgc actcggcgcc ggagcagccg 840

atcattgtgg tgcatgagcc tgatcgtcct cagccgcgac ttgatcgaga tgcaggaagc 900

ggaatgagcg tcacagttgg ccgcgtgcgt gaggataacg tgctcgacta tcgcttcgtt 960

gcgctaagcc acaacacgat cagaggcgca gccggcgcag cgattctcaa tgcggaattg 1020

ttgatcgcga aaggatattt agggcattaa 1050

<210> 128

<211> 349

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_23 sequence from an unknown bacterial species from an environmental sample

<400> 128

Met Thr Lys Lys Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Gln Glu

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro Glu Phe Val Lys Ser

50 55 60

Met Pro Ile Ala Ala Pro Gln Pro Pro Leu Asp Cys Glu Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Asp Ile Ala Arg Glu Ser Glu Gly Ala Phe Ala

85 90 95

Ala Ala Gly Phe Pro Val Ile Ser Asn Ser Ser Ala Tyr Arg Met Asp

100 105 110

Ala Asp Val Pro Leu Leu Ile Pro Glu Val Asn His Pro His Leu Ala

115 120 125

Leu Leu Asp Ser Gln Arg Gln Gln Arg Lys Gly Phe Ile Val Thr Asn

130 135 140

Pro Asn Cys Ser Thr Ile Met Leu Ala Met Ala Leu Ala Pro Leu His

145 150 155 160

Ala Lys Phe Arg Val Thr Ser Val Ile Ala Thr Thr Leu Gln Ala Leu

165 170 175

Ser Gly Ala Gly Tyr Pro Gly Val Pro Ser Leu Ala Ile Ser Asp Asn

180 185 190

Val Leu Pro Phe Ile Asp Gly Glu Glu Glu Lys Ile Glu Gln Glu Thr

195 200 205

Leu Lys Ile Leu Gly Pro Met Lys Glu Gly Arg Ile Ser Ala Ala Ser

210 215 220

Met Gln Val Ser Ala Gln Cys His Arg Val Asn Val Ser Asp Gly His

225 230 235 240

Met Ala Ala Val Arg Val Lys Leu Asp Glu Leu Thr Thr Ile Glu Glu

245 250 255

Val Phe Glu Ala Phe Ala Ser Phe Thr Ala Leu Pro Gln Glu Leu Lys

260 265 270

Leu His Ser Ala Pro Glu Gln Pro Ile Ile Val Val His Glu Pro Asp

275 280 285

Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Ser Gly Met Ser Val

290 295 300

Thr Val Gly Arg Val Arg Glu Asp Asn Val Leu Asp Tyr Arg Phe Val

305 310 315 320

Ala Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala Ile Leu

325 330 335

Asn Ala Glu Leu Leu Ile Ala Lys Gly Tyr Leu Gly His

340 345

<210> 129

<211> 1050

<212> DNA

<213> Unknown

<220>

<223> Gene ID asd_24 sequence from an unknown bacterial species from an environmental sample

<400> 129

atgactaaaa aacttcgtgt gggaatactt ggcgctaccg gcatggttgg ccagcgcttc 60

attcaattgc tggaaaatca tccgcaattt gagattacag cgctggcggc ttctgatcga 120

tcgcagggca agacttttca ggaagcctgc acgtggcgtc tcgctggaga gatgcctgaa 180

ttcgtaaagt cgatgccgat cgcggcgccg cagccgccgc tcgattgcga gctagtattt 240

tccagtttgc ccggtgacat cgcacgcgag agtgagggag cttttgccgc ggcgggattt 300

cccgtcatca gcaactcttc ggcttatcgc atggacgctg aggttccact gctgattccc 360

gaagtgaatc atccacacct cgcgctcctc gactcacaac gtcagcagcg gaacggcttc 420

atcgtcacta atccaaactg ttccacaatc atgttagcga tggcgctggc gccgctccat 480

gcaaagtttc gcgtgaccag cgttatcgcg acgacgctcc aggcactatc gggcgccggt 540

tatccgggcg ttccctcgct ggccatcagt gataatgttt tgccgttcat cgatggtgaa 600

gaagaaaaga tcgagcagga aacgttgaag attcttggac caatgaaaga aggacgtata 660

agcgccgcct ctatgcaggt gagcgcgcag tgtcatcggg taaatgtttc agacgggcac 720

atggcggcgg tgcgcgtgaa gctcgatgag ttaactacaa tcgaagaagt cagtgaagca 780

tttgcttcct tcaccgcgct cccgcaggaa ctaaaactgc actcggcgcc ggagcagccg 840

atcattgtgg tgcatgagcc tgatcgtcct cagccgcgac ttgatcgaga tgcaggaagc 900

ggaatgagcg tcacagttgg ccgcgtgcgt gaggataacg tgctcgacta tcgcttcgtt 960

gcgctaagcc acaacacgat cagaggcgca gccggcgcag cgattctcaa tgcggaattg 1020

ttgatcgcga aaggatattt agggcattaa 1050

<210> 130

<211> 349

<212> PRT

<213> Unknown

<220>

<223> Gene ID asd_24 sequence from an unknown bacterial species from an environmental sample

<400> 130

Met Thr Lys Lys Leu Arg Val Gly Ile Leu Gly Ala Thr Gly Met Val

1 5 10 15

Gly Gln Arg Phe Ile Gln Leu Leu Glu Asn His Pro Gln Phe Glu Ile

20 25 30

Thr Ala Leu Ala Ala Ser Asp Arg Ser Gln Gly Lys Thr Phe Gln Glu

35 40 45

Ala Cys Thr Trp Arg Leu Ala Gly Glu Met Pro Glu Phe Val Lys Ser

50 55 60

Met Pro Ile Ala Ala Pro Gln Pro Pro Leu Asp Cys Glu Leu Val Phe

65 70 75 80

Ser Ser Leu Pro Gly Asp Ile Ala Arg Glu Ser Glu Gly Ala Phe Ala

85 90 95

Ala Ala Gly Phe Pro Val Ile Ser Asn Ser Ser Ala Tyr Arg Met Asp

100 105 110

Ala Glu Val Pro Leu Leu Ile Pro Glu Val Asn His Pro His Leu Ala

115 120 125

Leu Leu Asp Ser Gln Arg Gln Gln Arg Asn Gly Phe Ile Val Thr Asn

130 135 140

Pro Asn Cys Ser Thr Ile Met Leu Ala Met Ala Leu Ala Pro Leu His

145 150 155 160

Ala Lys Phe Arg Val Thr Ser Val Ile Ala Thr Thr Leu Gln Ala Leu

165 170 175

Ser Gly Ala Gly Tyr Pro Gly Val Pro Ser Leu Ala Ile Ser Asp Asn

180 185 190

Val Leu Pro Phe Ile Asp Gly Glu Glu Glu Lys Ile Glu Gln Glu Thr

195 200 205

Leu Lys Ile Leu Gly Pro Met Lys Glu Gly Arg Ile Ser Ala Ala Ser

210 215 220

Met Gln Val Ser Ala Gln Cys His Arg Val Asn Val Ser Asp Gly His

225 230 235 240

Met Ala Ala Val Arg Val Lys Leu Asp Glu Leu Thr Thr Ile Glu Glu

245 250 255

Val Ser Glu Ala Phe Ala Ser Phe Thr Ala Leu Pro Gln Glu Leu Lys

260 265 270

Leu His Ser Ala Pro Glu Gln Pro Ile Ile Val Val His Glu Pro Asp

275 280 285

Arg Pro Gln Pro Arg Leu Asp Arg Asp Ala Gly Ser Gly Met Ser Val

290 295 300

Thr Val Gly Arg Val Arg Glu Asp Asn Val Leu Asp Tyr Arg Phe Val

305 310 315 320

Ala Leu Ser His Asn Thr Ile Arg Gly Ala Ala Gly Ala Ala Ile Leu

325 330 335

Asn Ala Glu Leu Leu Ile Ala Lys Gly Tyr Leu Gly His

340 345

<210> 131

<211> 1353

<212> DNA

<213> Clostridiales

<400> 131

atgtccaagt acgttgaccg cgtcattgct gaagtcgaga aaaagtacgc cgacgaaccg 60

gaattcgttc aaaccgttga agaggtactc tcttcactcg gcccagtagt cgacgcacac 120

cccgagtatg aagaggttgc gctcttggag cgtatggtca ttccagaacg tgtcattgag 180

tttcgcgtcc cgtgggagga tgacaatggt aaagtacatg tgaatactgg ttaccgcgtc 240

caatttaatg gcgcgatcgg cccttataaa ggtggcttgc gcttcgcccc ttcggtcaac 300

ctttccatta tgaaatttct cggcttcgag caagcattca aagattccct gaccacgctt 360

cctatgggag gagcaaaagg cggttcagac ttcgacccaa acggaaaatc cgatcgcgaa 420

gtaatgcgct tctgccaggc gttcatgact gagttgtatc ggcatattgg tcccgatatc 480

gacgtgcctg ctggtgactt gggcgttggt gcgcgtgaaa ttggttacat gtacggacaa 540

taccggaaga tcgtcggcgg attctacaat ggcgtcctga ccggtaaagc ccggtcattc 600

ggtggaagct tggtccggcc cgaagcaact ggttacggat cggtgtatta tgtggaggct 660

gtgatgaaac atgaaaatga cacgcttgta ggtaaaactg ttgcactggc aggttttggt 720

aacgttgcat ggggtgcagc taagaagctc gcggagttgg gtgcgaaagc agtaactttg 780

tctggcccgg atggctatat ctacgacccc gagggtatca ctaccgagga aaagatcaat 840

tacatgcttg aaatgcgggc gtctggacgt aacaaggtac aggattacgc agacaagttt 900

ggagtgcaat tctttccggg tgaaaagcct tggggccaaa aagttgacat tattatgcct 960

tgtgcaactc agaatgatgt tgacctggaa caggctaaaa agatcgtggc gaacaacgtg 1020

aagtactaca tcgaagtagc caacatgcct actactaatg aagcattgcg gtttcttatg 1080

cagcaaccta acatggtagt cgcccccagc aaggctgtga acgcaggtgg agtactggta 1140

tcgggtttcg agatgtcaca aaattccgaa cgtctgtcat ggaccgccga agaagtcgat 1200

agcaaactgc atcaggtgat gactgacatt catgacggtt cagccgccgc agctgaacgc 1260

tacggacttg gttacaatct tgtcgcaggt gctaatatcg taggttttca gaagatcgcc 1320

gatgccatga tggctcaagg aatcgcttgg tag 1353

<210> 132

<211> 450

<212> PRT

<213> Clostridiales

<400> 132

Met Ser Lys Tyr Val Asp Arg Val Ile Ala Glu Val Glu Lys Lys Tyr

1 5 10 15

Ala Asp Glu Pro Glu Phe Val Gln Thr Val Glu Glu Val Leu Ser Ser

20 25 30

Leu Gly Pro Val Val Asp Ala His Pro Glu Tyr Glu Glu Val Ala Leu

35 40 45

Leu Glu Arg Met Val Ile Pro Glu Arg Val Ile Glu Phe Arg Val Pro

50 55 60

Trp Glu Asp Asp Asn Gly Lys Val His Val Asn Thr Gly Tyr Arg Val

65 70 75 80

Gln Phe Asn Gly Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe Ala

85 90 95

Pro Ser Val Asn Leu Ser Ile Met Lys Phe Leu Gly Phe Glu Gln Ala

100 105 110

Phe Lys Asp Ser Leu Thr Thr Leu Pro Met Gly Gly Ala Lys Gly Gly

115 120 125

Ser Asp Phe Asp Pro Asn Gly Lys Ser Asp Arg Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Thr Glu Leu Tyr Arg His Ile Gly Pro Asp Ile

145 150 155 160

Asp Val Pro Ala Gly Asp Leu Gly Val Gly Ala Arg Glu Ile Gly Tyr

165 170 175

Met Tyr Gly Gln Tyr Arg Lys Ile Val Gly Gly Phe Tyr Asn Gly Val

180 185 190

Leu Thr Gly Lys Ala Arg Ser Phe Gly Gly Ser Leu Val Arg Pro Glu

195 200 205

Ala Thr Gly Tyr Gly Ser Val Tyr Tyr Val Glu Ala Val Met Lys His

210 215 220

Glu Asn Asp Thr Leu Val Gly Lys Thr Val Ala Leu Ala Gly Phe Gly

225 230 235 240

Asn Val Ala Trp Gly Ala Ala Lys Lys Leu Ala Glu Leu Gly Ala Lys

245 250 255

Ala Val Thr Leu Ser Gly Pro Asp Gly Tyr Ile Tyr Asp Pro Glu Gly

260 265 270

Ile Thr Thr Glu Glu Lys Ile Asn Tyr Met Leu Glu Met Arg Ala Ser

275 280 285

Gly Arg Asn Lys Val Gln Asp Tyr Ala Asp Lys Phe Gly Val Gln Phe

290 295 300

Phe Pro Gly Glu Lys Pro Trp Gly Gln Lys Val Asp Ile Ile Met Pro

305 310 315 320

Cys Ala Thr Gln Asn Asp Val Asp Leu Glu Gln Ala Lys Lys Ile Val

325 330 335

Ala Asn Asn Val Lys Tyr Tyr Ile Glu Val Ala Asn Met Pro Thr Thr

340 345 350

Asn Glu Ala Leu Arg Phe Leu Met Gln Gln Pro Asn Met Val Val Ala

355 360 365

Pro Ser Lys Ala Val Asn Ala Gly Gly Val Leu Val Ser Gly Phe Glu

370 375 380

Met Ser Gln Asn Ser Glu Arg Leu Ser Trp Thr Ala Glu Glu Val Asp

385 390 395 400

Ser Lys Leu His Gln Val Met Thr Asp Ile His Asp Gly Ser Ala Ala

405 410 415

Ala Ala Glu Arg Tyr Gly Leu Gly Tyr Asn Leu Val Ala Gly Ala Asn

420 425 430

Ile Val Gly Phe Gln Lys Ile Ala Asp Ala Met Met Ala Gln Gly Ile

435 440 445

Ala Trp

450

<210> 133

<211> 1344

<212> DNA

<213> Escherichia coli

<400> 133

atggatcaga catattctct ggagtcattc ctcaaccatg tccaaaagcg cgacccgaat 60

caaaccgagt tcgcgcaagc cgttcgtgaa gtaatgacca cactctggcc ttttcttgaa 120

caaaatccaa aatatcgcca gatgtcatta ctggagcgtc tggttgaacc ggagcgcgtg 180

atccagtttc gcgtggtatg ggttgatgat cgcaaccaga tacaggtcaa ccgtgcatgg 240

cgtgtgcagt tcagctctgc catcggcccg tacaaaggcg gtatgcgctt ccatccgtca 300

gttaaccttt ccattctcaa attcctcggc tttgaacaaa ccttcaaaaa tgccctgact 360

actctgccga tgggcggtgg taaaggcggc agcgatttcg atccgaaagg aaaaagcgaa 420

ggtgaagtga tgcgtttttg ccaggcgctg atgactgaac tgtatcgcca cctgggcgcg 480

gataccgacg ttccggcagg tgatatcggg gttggtggtc gtgaagtcgg ctttatggcg 540

gggatgatga aaaagctctc caacaatacc gcctgcgtct tcaccggtaa gggcctttca 600

tttggcggca gtcttattcg cccggaagct accggctacg gtctggttta tttcacagaa 660

gcaatgctaa aacgccacgg tatgggtttt gaagggatgc gcgtttccgt ttctggctcc 720

ggcaacgtcg cccagtacgc tatcgaaaaa gcgatggaat ttggtgctcg tgtgatcact 780

gcgtcagact ccagcggcac tgtagttgat gaaagcggat tcacgaaaga gaaactggca 840

cgtcttatcg aaatcaaagc cagccgcgat ggtcgagtgg cagattacgc caaagaattt 900

ggtctggtct atctcgaagg ccaacagccg tggtctctac cggttgatat cgccctgcct 960

tgcgccaccc agaatgaact ggatgttgac gccgcgcatc agcttatcgc taatggcgtt 1020

aaagccgtcg ccgaaggggc aaatatgccg accaccatcg aagcgactga actgttccag 1080

caggcaggcg tactatttgc accgggtaaa gcggctaatg ctggtggcgt cgctacatcg 1140

ggcctggaaa tggcacaaaa cgctgcgcgc ctgggctgga aagccgagaa agttgacgca 1200

cgtttgcatc acatcatgct ggatatccac catgcctgtg ttgagcatgg tggtgaaggt 1260

gagcaaacca actacgtgca gggcgcgaac attgccggtt ttgtgaaggt tgccgatgcg 1320

atgctggcgc agggtgtgat ttaa 1344

<210> 134

<211> 447

<212> PRT

<213> Escherichia coli

<400> 134

Met Asp Gln Thr Tyr Ser Leu Glu Ser Phe Leu Asn His Val Gln Lys

1 5 10 15

Arg Asp Pro Asn Gln Thr Glu Phe Ala Gln Ala Val Arg Glu Val Met

20 25 30

Thr Thr Leu Trp Pro Phe Leu Glu Gln Asn Pro Lys Tyr Arg Gln Met

35 40 45

Ser Leu Leu Glu Arg Leu Val Glu Pro Glu Arg Val Ile Gln Phe Arg

50 55 60

Val Val Trp Val Asp Asp Arg Asn Gln Ile Gln Val Asn Arg Ala Trp

65 70 75 80

Arg Val Gln Phe Ser Ser Ala Ile Gly Pro Tyr Lys Gly Gly Met Arg

85 90 95

Phe His Pro Ser Val Asn Leu Ser Ile Leu Lys Phe Leu Gly Phe Glu

100 105 110

Gln Thr Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys

115 120 125

Gly Gly Ser Asp Phe Asp Pro Lys Gly Lys Ser Glu Gly Glu Val Met

130 135 140

Arg Phe Cys Gln Ala Leu Met Thr Glu Leu Tyr Arg His Leu Gly Ala

145 150 155 160

Asp Thr Asp Val Pro Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Val

165 170 175

Gly Phe Met Ala Gly Met Met Lys Lys Leu Ser Asn Asn Thr Ala Cys

180 185 190

Val Phe Thr Gly Lys Gly Leu Ser Phe Gly Gly Ser Leu Ile Arg Pro

195 200 205

Glu Ala Thr Gly Tyr Gly Leu Val Tyr Phe Thr Glu Ala Met Leu Lys

210 215 220

Arg His Gly Met Gly Phe Glu Gly Met Arg Val Ser Val Ser Gly Ser

225 230 235 240

Gly Asn Val Ala Gln Tyr Ala Ile Glu Lys Ala Met Glu Phe Gly Ala

245 250 255

Arg Val Ile Thr Ala Ser Asp Ser Ser Gly Thr Val Val Asp Glu Ser

260 265 270

Gly Phe Thr Lys Glu Lys Leu Ala Arg Leu Ile Glu Ile Lys Ala Ser

275 280 285

Arg Asp Gly Arg Val Ala Asp Tyr Ala Lys Glu Phe Gly Leu Val Tyr

290 295 300

Leu Glu Gly Gln Gln Pro Trp Ser Leu Pro Val Asp Ile Ala Leu Pro

305 310 315 320

Cys Ala Thr Gln Asn Glu Leu Asp Val Asp Ala Ala His Gln Leu Ile

325 330 335

Ala Asn Gly Val Lys Ala Val Ala Glu Gly Ala Asn Met Pro Thr Thr

340 345 350

Ile Glu Ala Thr Glu Leu Phe Gln Gln Ala Gly Val Leu Phe Ala Pro

355 360 365

Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met

370 375 380

Ala Gln Asn Ala Ala Arg Leu Gly Trp Lys Ala Glu Lys Val Asp Ala

385 390 395 400

Arg Leu His His Ile Met Leu Asp Ile His His Ala Cys Val Glu His

405 410 415

Gly Gly Glu Gly Glu Gln Thr Asn Tyr Val Gln Gly Ala Asn Ile Ala

420 425 430

Gly Phe Val Lys Val Ala Asp Ala Met Leu Ala Gln Gly Val Ile

435 440 445

<210> 135

<211> 1377

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_1 sequence from an unknown bacterial species from an environmental sample

<400> 135

ttgaaatcat taaacgaaaa ttccgtcaac ggcctggccc gcataaaaag tattactgac 60

ttcgtcgtcg atctcgaaaa acgcaatccg cacgagcccg agttcaaaca ggccgtaact 120

gaagcggtca acgatctcat tccattcatc gaagccaatc cacgttatca gaaccagatg 180

attctcgagc ggctcaccga acccgatcgc gtcattactt tccgcgtttc ctggatggac 240

gatgccggga acattcgcat caatcgcggc tatcgcgtgc agaacagcaa tgcaatcggc 300

ccgtacaaag gcggcattcg ttttcacccg agcgtgaatc tcagcatcct gaaatttctc 360

gcgtttgaac aaaccctgaa aaacagtttg actggcctgc cgatgggcgg cgcgaaaggc 420

ggctccgatt tcgatcccaa aggaaaatcc gatgcggaag tcatgcgctt ctgccaggcg 480

ctgatcaccg aactctggcg tcacatcggt tccgataccg acgtaccggc aggtgacata 540

ggcgtaggag cacgcgaagt cggctacatg ttcggccaat acaaacgtct ttccaattcg 600

ttcacaggtg cgttcaccgg caagggcatc gactacggcg gcagtctcgg ccgcaccgaa 660

gccaccggtt acggcgctgt gtatatgctt gcggaagtca tgacgtacaa caaggaagac 720

ctcgcgggca aacgcgtgct cgtttccggt tcgggcaatg tggcggtata tgcggttgaa 780

aaagccatgc agatgggagc gatcgtcacc acgctctcag actccagcgg cttcgtctac 840

gacaagaatg gcttcactta cgaaaaactc gaatacatca aacagctgaa attcatcgac 900

cgcggccgca tcgaaaaata ttgcgatcat ttcgaagccg aattccacgc cggcaggaaa 960

ccctggggaa tgcctgccga tgtcgcgctg ccctgtgcaa cgcagaacga aatcacgctc 1020

gacgatgcca aaaccctcgt cgccaacggc tgcagatatc tcgtggaagg cgccaacatg 1080

cccaccacca tcgacgcgat tgcattgtta ctcgaaaaca aagtgcatta cgtccccggc 1140

aaagcagcca acgccggcgg tgtggcagtg tcaggcctcg agatgagcca gaactcgctg 1200

cgcatcggct ggaccgcacg agaggtcgat ctcaaactgc acgacatcat gcgtcacatc 1260

catcacaagt gcgtgcagca cggcaaggaa aatggtttcg tcaattattc aaaaggcgcc 1320

aacattgccg gtttcatcaa ggtggcggat gcgatgctcg cgttgggcgt tgtttag 1377

<210> 136

<211> 458

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_1 sequence from an unknown bacterial species from an environmental sample

<400> 136

Leu Lys Ser Leu Asn Glu Asn Ser Val Asn Gly Leu Ala Arg Ile Lys

1 5 10 15

Ser Ile Thr Asp Phe Val Val Asp Leu Glu Lys Arg Asn Pro His Glu

20 25 30

Pro Glu Phe Lys Gln Ala Val Thr Glu Ala Val Asn Asp Leu Ile Pro

35 40 45

Phe Ile Glu Ala Asn Pro Arg Tyr Gln Asn Gln Met Ile Leu Glu Arg

50 55 60

Leu Thr Glu Pro Asp Arg Val Ile Thr Phe Arg Val Ser Trp Met Asp

65 70 75 80

Asp Ala Gly Asn Ile Arg Ile Asn Arg Gly Tyr Arg Val Gln Asn Ser

85 90 95

Asn Ala Ile Gly Pro Tyr Lys Gly Gly Ile Arg Phe His Pro Ser Val

100 105 110

Asn Leu Ser Ile Leu Lys Phe Leu Ala Phe Glu Gln Thr Leu Lys Asn

115 120 125

Ser Leu Thr Gly Leu Pro Met Gly Gly Ala Lys Gly Gly Ser Asp Phe

130 135 140

Asp Pro Lys Gly Lys Ser Asp Ala Glu Val Met Arg Phe Cys Gln Ala

145 150 155 160

Leu Ile Thr Glu Leu Trp Arg His Ile Gly Ser Asp Thr Asp Val Pro

165 170 175

Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Val Gly Tyr Met Phe Gly

180 185 190

Gln Tyr Lys Arg Leu Ser Asn Ser Phe Thr Gly Ala Phe Thr Gly Lys

195 200 205

Gly Ile Asp Tyr Gly Gly Ser Leu Gly Arg Thr Glu Ala Thr Gly Tyr

210 215 220

Gly Ala Val Tyr Met Leu Ala Glu Val Met Thr Tyr Asn Lys Glu Asp

225 230 235 240

Leu Ala Gly Lys Arg Val Leu Val Ser Gly Ser Gly Asn Val Ala Val

245 250 255

Tyr Ala Val Glu Lys Ala Met Gln Met Gly Ala Ile Val Thr Thr Leu

260 265 270

Ser Asp Ser Ser Gly Phe Val Tyr Asp Lys Asn Gly Phe Thr Tyr Glu

275 280 285

Lys Leu Glu Tyr Ile Lys Gln Leu Lys Phe Ile Asp Arg Gly Arg Ile

290 295 300

Glu Lys Tyr Cys Asp His Phe Glu Ala Glu Phe His Ala Gly Arg Lys

305 310 315 320

Pro Trp Gly Met Pro Ala Asp Val Ala Leu Pro Cys Ala Thr Gln Asn

325 330 335

Glu Ile Thr Leu Asp Asp Ala Lys Thr Leu Val Ala Asn Gly Cys Arg

340 345 350

Tyr Leu Val Glu Gly Ala Asn Met Pro Thr Thr Ile Asp Ala Ile Ala

355 360 365

Leu Leu Leu Glu Asn Lys Val His Tyr Val Pro Gly Lys Ala Ala Asn

370 375 380

Ala Gly Gly Val Ala Val Ser Gly Leu Glu Met Ser Gln Asn Ser Leu

385 390 395 400

Arg Ile Gly Trp Thr Ala Arg Glu Val Asp Leu Lys Leu His Asp Ile

405 410 415

Met Arg His Ile His His Lys Cys Val Gln His Gly Lys Glu Asn Gly

420 425 430

Phe Val Asn Tyr Ser Lys Gly Ala Asn Ile Ala Gly Phe Ile Lys Val

435 440 445

Ala Asp Ala Met Leu Ala Leu Gly Val Val

450 455

<210> 137

<211> 1569

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_2 sequence from an unknown bacterial species from an environmental sample

<400> 137

cagtgtcttg atgcacagga atggggccct tggtggcccc atttttgtgc ctgccgttta 60

cgggtgtggc tgtttgaagg ccagtattgc ggctgttttt gtatacactt atataaaacg 120

caccaaataa attcagaaaa acgacatatc gccctgttcc ggtgcgattg cttcttggta 180

cattcccgaa aaaccaataa gtccaaaatc ttcggtgaga tgtctaccat gatcgaatct 240

gtcgacagtt tcctcgcgcg cctgcaacag cgcgacccag gccagcctga gtttcatcag 300

gcagtggaag aagtgctgcg cacgctgtgg cccttcctgg aagccaaccc tcattacctg 360

cagtccggca tcctggagcg tatggtcgag ccggagcgtg cggtgctgtt tcgcgtgtcg 420

tgggtcgatg accagggcaa agtgcaggtc aaccgcggct accgtatcca gatgagcagc 480

gccatcggcc cgtacaaggg cggcttgcgc ttccacccct cggtcaacct cagcgtgctg 540

aaattcctgg ccttcgagca ggtcttcaag aactccctga cttcgctgcc catggggggt 600

ggcaagggcg ggtcggactt cgaccccaaa ggcaagagcg acgccgaagt gatgcgcttc 660

tgccaggcgt tcatgagcga gctgtaccgc cacatcggcg ccgactgcga cgtgccggcc 720

ggtgacatcg gcgtgggtgc ccgcgaaatc ggcttcatgt ttggccagta caagcggctt 780

gccaaccagt tcacgtcggt actgaccggc aaggggatga cctacggtgg cagcctgatt 840

cgccccgaag ccaccggcta cggttgcgtg tatttcgccg aggaaatgct caagcgccag 900

gacaagcgca tcgacggccg tcgcgtggcg gtgtcgggct cgggcaacgt tgcccagtat 960

gccgcgcgca aggtcatgga cctgggcggc aaagtgatct ccatgtcgga ctccgaaggc 1020

acgctgtatg ccgaagccgg cctgaccgat gcccagtggg aagcactgat ggcgctgaag 1080

aacgtcaagc ggggccgtat cagcgagctg gccgagcagt tcggcctgga gttccgcaag 1140

ggccagaccc cttggagcct ggcatgcgac atcgccttgc cgtgcgccac gcagaatgaa 1200

ctgggcgccg aagacgcccg taccttgctg gccaacggct gtatctgcgt ggccgaaggc 1260

gccaacatgc cgaccaccct ggaagcggtg gatatcttcc tggaagccgg catcctctat 1320

gccccgggca aagcatccaa cgccggcggt gtggccgtgt ctggcctgga gatgtcgcag 1380

aatgccatgc gcctgctgtg gactgccggc gaagtggaca gcaagctgca caacatcatg 1440

caatcgatcc accatgcttg cgtgcattat ggtgaagagg cggatggccg ggtcaattac 1500

gtgaaaggtg cgaacatcgc cggctttgtc aaagtggcgg atgcgatgtt ggctcagggc 1560

gtggtttga 1569

<210> 138

<211> 522

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_2 sequence from an unknown bacterial species from an environmental sample

<400> 138

Gln Cys Leu Asp Ala Gln Glu Trp Gly Pro Trp Trp Pro His Phe Cys

1 5 10 15

Ala Cys Arg Leu Arg Val Trp Leu Phe Glu Gly Gln Tyr Cys Gly Cys

20 25 30

Phe Cys Ile His Leu Tyr Lys Thr His Gln Ile Asn Ser Glu Lys Arg

35 40 45

His Ile Ala Leu Phe Arg Cys Asp Cys Phe Leu Val His Ser Arg Lys

50 55 60

Thr Asn Lys Ser Lys Ile Phe Gly Glu Met Ser Thr Met Ile Glu Ser

65 70 75 80

Val Asp Ser Phe Leu Ala Arg Leu Gln Gln Arg Asp Pro Gly Gln Pro

85 90 95

Glu Phe His Gln Ala Val Glu Glu Val Leu Arg Thr Leu Trp Pro Phe

100 105 110

Leu Glu Ala Asn Pro His Tyr Leu Gln Ser Gly Ile Leu Glu Arg Met

115 120 125

Val Glu Pro Glu Arg Ala Val Leu Phe Arg Val Ser Trp Val Asp Asp

130 135 140

Gln Gly Lys Val Gln Val Asn Arg Gly Tyr Arg Ile Gln Met Ser Ser

145 150 155 160

Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Ser Val Asn

165 170 175

Leu Ser Val Leu Lys Phe Leu Ala Phe Glu Gln Val Phe Lys Asn Ser

180 185 190

Leu Thr Ser Leu Pro Met Gly Gly Gly Lys Gly Gly Ser Asp Phe Asp

195 200 205

Pro Lys Gly Lys Ser Asp Ala Glu Val Met Arg Phe Cys Gln Ala Phe

210 215 220

Met Ser Glu Leu Tyr Arg His Ile Gly Ala Asp Cys Asp Val Pro Ala

225 230 235 240

Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Phe Met Phe Gly Gln

245 250 255

Tyr Lys Arg Leu Ala Asn Gln Phe Thr Ser Val Leu Thr Gly Lys Gly

260 265 270

Met Thr Tyr Gly Gly Ser Leu Ile Arg Pro Glu Ala Thr Gly Tyr Gly

275 280 285

Cys Val Tyr Phe Ala Glu Glu Met Leu Lys Arg Gln Asp Lys Arg Ile

290 295 300

Asp Gly Arg Arg Val Ala Val Ser Gly Ser Gly Asn Val Ala Gln Tyr

305 310 315 320

Ala Ala Arg Lys Val Met Asp Leu Gly Gly Lys Val Ile Ser Met Ser

325 330 335

Asp Ser Glu Gly Thr Leu Tyr Ala Glu Ala Gly Leu Thr Asp Ala Gln

340 345 350

Trp Glu Ala Leu Met Ala Leu Lys Asn Val Lys Arg Gly Arg Ile Ser

355 360 365

Glu Leu Ala Glu Gln Phe Gly Leu Glu Phe Arg Lys Gly Gln Thr Pro

370 375 380

Trp Ser Leu Ala Cys Asp Ile Ala Leu Pro Cys Ala Thr Gln Asn Glu

385 390 395 400

Leu Gly Ala Glu Asp Ala Arg Thr Leu Leu Ala Asn Gly Cys Ile Cys

405 410 415

Val Ala Glu Gly Ala Asn Met Pro Thr Thr Leu Glu Ala Val Asp Ile

420 425 430

Phe Leu Glu Ala Gly Ile Leu Tyr Ala Pro Gly Lys Ala Ser Asn Ala

435 440 445

Gly Gly Val Ala Val Ser Gly Leu Glu Met Ser Gln Asn Ala Met Arg

450 455 460

Leu Leu Trp Thr Ala Gly Glu Val Asp Ser Lys Leu His Asn Ile Met

465 470 475 480

Gln Ser Ile His His Ala Cys Val His Tyr Gly Glu Glu Ala Asp Gly

485 490 495

Arg Val Asn Tyr Val Lys Gly Ala Asn Ile Ala Gly Phe Val Lys Val

500 505 510

Ala Asp Ala Met Leu Ala Gln Gly Val Val

515 520

<210> 139

<211> 1353

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_3 sequence from an unknown bacterial species from an environmental sample

<400> 139

atgagtggtg ccgttcagaa ggaattagac gagttcatgc acgggctaag gaagcgcaac 60

cccggtgagg aggagtttca ccaggcggtg caggaggtcg tggagtcgac tctgccttac 120

gtgctcgatc atccggagta ccgcaaagcg tccattctcg agcgcatgac cgagcccgat 180

cgcgtgatca tctttcgcgt ggcctggcag gacgacgagg gccatgtgcg cgcccaccgc 240

ggctaccgcg tgcagttcaa taacgccatc ggtccctaca aggggggatt gcggttccac 300

tcttccgtca gcctttccat tttgaagttc ttgggcttcg agcagacgtt caagaacagc 360

cttacgggcc tgcccatggg aggcggcaag ggcggatcga atttcaaccc caagggaaaa 420

tccgacggcg aagtgatgcg cttttgccag gccttcatga ccgagctgta ccggcacatc 480

ggcaaggaca ccgatgttcc ggccggcgac atcggcgtgg gtagccggga aatcagctac 540

ctcttcggcc agtacaagcg gattacgaac gaatttaccg gcgtgctgac gggcaagggg 600

ctctccttcg gaggcagcct cattcggact gaagcgacag gctacggctg cgtctatttc 660

atggaagaga tgttgaaggc caagggcgac gctctcgtcg gcaagactgt gacggtgtcg 720

ggctcgggaa acgtcgcgca gttcacggcg aaaaaactga tcgagctggg cgccaaggtg 780

ctcacgctga gcgattccga cggcttcatc cacgatcgaa acgggatcga tctggaaaag 840

ctcaactgga ttctcgatct gaagaacgta cgtcggggtc gtatcgccga ttatacgcag 900

aagtggggag gggagtatca cgaagggggc cgtccctggg tcgttccctg cgatctggcc 960

tttccctgcg ccacccaaaa cgaggtcacg ggctcggacg cccgcatcct catcgccaac 1020

ggctgcatcg gggttgcgga aggcgccaac atgccttccg acctggacgc catccacgcc 1080

tttctggagg cgagaattct ctacgccccg agcaaagcgt ccaacgcggg cggtgtcgcc 1140

gtctccggct tggagatgac ccaaaactcc cagcggctat cctggtcgtc ggaagaggtg 1200

aacgagcgcc tccacgccat catgaagagc atccacgcga gctgcgtgcg ctacggcacc 1260

gaaagagacg gctacgtgaa ctacgtcaaa ggcgcgaacc tcgcggggtt cgtgaaggtc 1320

gccgatgcga tgctcgcctt cggcgtgctg tga 1353

<210> 140

<211> 450

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_3 sequence from an unknown bacterial species from an environmental sample

<400> 140

Met Ser Gly Ala Val Gln Lys Glu Leu Asp Glu Phe Met His Gly Leu

1 5 10 15

Arg Lys Arg Asn Pro Gly Glu Glu Glu Phe His Gln Ala Val Gln Glu

20 25 30

Val Val Glu Ser Thr Leu Pro Tyr Val Leu Asp His Pro Glu Tyr Arg

35 40 45

Lys Ala Ser Ile Leu Glu Arg Met Thr Glu Pro Asp Arg Val Ile Ile

50 55 60

Phe Arg Val Ala Trp Gln Asp Asp Glu Gly His Val Arg Ala His Arg

65 70 75 80

Gly Tyr Arg Val Gln Phe Asn Asn Ala Ile Gly Pro Tyr Lys Gly Gly

85 90 95

Leu Arg Phe His Ser Ser Val Ser Leu Ser Ile Leu Lys Phe Leu Gly

100 105 110

Phe Glu Gln Thr Phe Lys Asn Ser Leu Thr Gly Leu Pro Met Gly Gly

115 120 125

Gly Lys Gly Gly Ser Asn Phe Asn Pro Lys Gly Lys Ser Asp Gly Glu

130 135 140

Val Met Arg Phe Cys Gln Ala Phe Met Thr Glu Leu Tyr Arg His Ile

145 150 155 160

Gly Lys Asp Thr Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ser Arg

165 170 175

Glu Ile Ser Tyr Leu Phe Gly Gln Tyr Lys Arg Ile Thr Asn Glu Phe

180 185 190

Thr Gly Val Leu Thr Gly Lys Gly Leu Ser Phe Gly Gly Ser Leu Ile

195 200 205

Arg Thr Glu Ala Thr Gly Tyr Gly Cys Val Tyr Phe Met Glu Glu Met

210 215 220

Leu Lys Ala Lys Gly Asp Ala Leu Val Gly Lys Thr Val Thr Val Ser

225 230 235 240

Gly Ser Gly Asn Val Ala Gln Phe Thr Ala Lys Lys Leu Ile Glu Leu

245 250 255

Gly Ala Lys Val Leu Thr Leu Ser Asp Ser Asp Gly Phe Ile His Asp

260 265 270

Arg Asn Gly Ile Asp Leu Glu Lys Leu Asn Trp Ile Leu Asp Leu Lys

275 280 285

Asn Val Arg Arg Gly Arg Ile Ala Asp Tyr Thr Gln Lys Trp Gly Gly

290 295 300

Glu Tyr His Glu Gly Gly Arg Pro Trp Val Val Pro Cys Asp Leu Ala

305 310 315 320

Phe Pro Cys Ala Thr Gln Asn Glu Val Thr Gly Ser Asp Ala Arg Ile

325 330 335

Leu Ile Ala Asn Gly Cys Ile Gly Val Ala Glu Gly Ala Asn Met Pro

340 345 350

Ser Asp Leu Asp Ala Ile His Ala Phe Leu Glu Ala Arg Ile Leu Tyr

355 360 365

Ala Pro Ser Lys Ala Ser Asn Ala Gly Gly Val Ala Val Ser Gly Leu

370 375 380

Glu Met Thr Gln Asn Ser Gln Arg Leu Ser Trp Ser Ser Glu Glu Val

385 390 395 400

Asn Glu Arg Leu His Ala Ile Met Lys Ser Ile His Ala Ser Cys Val

405 410 415

Arg Tyr Gly Thr Glu Arg Asp Gly Tyr Val Asn Tyr Val Lys Gly Ala

420 425 430

Asn Leu Ala Gly Phe Val Lys Val Ala Asp Ala Met Leu Ala Phe Gly

435 440 445

Val Leu

450

<210> 141

<211> 1383

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_4 sequence from an unknown bacterial species from an environmental sample

<400> 141

atggcgacgc tcctcacgac ctcggaagct ctgacgcgtg acacagggaa agccgtcgat 60

cgattcatgg atggcctcgt cgcgcgcaac ccgaatcagc cggagttcca tcaggccgtt 120

cgcgaggtgt gcgagagcgt catgccactc gtcttggagc ggccggagta cgaggaggcc 180

ggaatcctcg agcgcctcac ggaaccggat cgcattctca cgttccgtgt ggcctggcag 240

gatgacgagg gtcgcgtccg tatcaatcgc gcgtatcgcg tgcagttcaa caacgccatt 300

ggtccttaca aaggcggcct ccgctttcat cccaccgtcg atctctcggt cctcaagttc 360

ctcggcttcg agcagatctt caagaacagt ctgaccggtc ttcccatggg tggcgcgaag 420

ggtggttctg atttcgatcc caaagggaag tcggataacg aggtcatgcg gttctgtcag 480

gcgatgatgt cggagctgtg tcacgacatc ggcgaagacg tcgatgtgcc ggccggcgac 540

attggtgtgg gcgctcgaga gattgggtat ctgtttggcg aatatcgtcg cctcatgcgt 600

cgcgtcgctg gtgtgctcac ggggaagggc ttgtcgtttg gcggcagctt gattcggacc 660

gaagccactg gctatggctg cgtctatttc gtcgagaaca tgttgaatca cattggtgat 720

tccctcgacg gcaagacctg tgtcgtttcc ggatcaggca acgtcgcgct ctacacggtc 780

gagaaggtga cggctctggg cgggaaggtc gtcacgctct cggactcgga cggcttcatc 840

tacgatcgcg acggcatcga tgcggagaag ctggagtggg tcaaggagct caaagaggtt 900

cgtcgcggtc gcatcagcga gtacgcggag cacttcggcg gccagttcca cgccgatgag 960

cggccgtggc atgtcgagtg tcaggcggcg tttccttctg ccacgcagaa cgagttggac 1020

aaagaggatg cggaagtgct ggtcgcgaac ggctgtctcg cggtcggcga aggcgcgaac 1080

atgccgagca cgagcgacgc aacgcgcgtg ttcctcgagg ctggaacgct cttcgcgccg 1140

ggcaaggccg cgaatgccgg tggggtcgcc gtctcaggtc tcgagcagag tcagaatgcg 1200

cagcgcctgt tctggacgcg tgacgaggtg gatcttcgtc ttcaaggcat catgaagacg 1260

atccacgaca agtgcgttga gcagggtcgc gtcgcgaatg gtcagatcaa ctacgtgcag 1320

ggcgccaatc gcgcaggttt cctcaaggtc gccgacgcga tgctcgcgca aggcgtattt 1380

tga 1383

<210> 142

<211> 460

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_4 sequence from an unknown bacterial species from an environmental sample

<400> 142

Met Ala Thr Leu Leu Thr Thr Ser Glu Ala Leu Thr Arg Asp Thr Gly

1 5 10 15

Lys Ala Val Asp Arg Phe Met Asp Gly Leu Val Ala Arg Asn Pro Asn

20 25 30

Gln Pro Glu Phe His Gln Ala Val Arg Glu Val Cys Glu Ser Val Met

35 40 45

Pro Leu Val Leu Glu Arg Pro Glu Tyr Glu Glu Ala Gly Ile Leu Glu

50 55 60

Arg Leu Thr Glu Pro Asp Arg Ile Leu Thr Phe Arg Val Ala Trp Gln

65 70 75 80

Asp Asp Glu Gly Arg Val Arg Ile Asn Arg Ala Tyr Arg Val Gln Phe

85 90 95

Asn Asn Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Thr

100 105 110

Val Asp Leu Ser Val Leu Lys Phe Leu Gly Phe Glu Gln Ile Phe Lys

115 120 125

Asn Ser Leu Thr Gly Leu Pro Met Gly Gly Ala Lys Gly Gly Ser Asp

130 135 140

Phe Asp Pro Lys Gly Lys Ser Asp Asn Glu Val Met Arg Phe Cys Gln

145 150 155 160

Ala Met Met Ser Glu Leu Cys His Asp Ile Gly Glu Asp Val Asp Val

165 170 175

Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Tyr Leu Phe

180 185 190

Gly Glu Tyr Arg Arg Leu Met Arg Arg Val Ala Gly Val Leu Thr Gly

195 200 205

Lys Gly Leu Ser Phe Gly Gly Ser Leu Ile Arg Thr Glu Ala Thr Gly

210 215 220

Tyr Gly Cys Val Tyr Phe Val Glu Asn Met Leu Asn His Ile Gly Asp

225 230 235 240

Ser Leu Asp Gly Lys Thr Cys Val Val Ser Gly Ser Gly Asn Val Ala

245 250 255

Leu Tyr Thr Val Glu Lys Val Thr Ala Leu Gly Gly Lys Val Val Thr

260 265 270

Leu Ser Asp Ser Asp Gly Phe Ile Tyr Asp Arg Asp Gly Ile Asp Ala

275 280 285

Glu Lys Leu Glu Trp Val Lys Glu Leu Lys Glu Val Arg Arg Gly Arg

290 295 300

Ile Ser Glu Tyr Ala Glu His Phe Gly Gly Gln Phe His Ala Asp Glu

305 310 315 320

Arg Pro Trp His Val Glu Cys Gln Ala Ala Phe Pro Ser Ala Thr Gln

325 330 335

Asn Glu Leu Asp Lys Glu Asp Ala Glu Val Leu Val Ala Asn Gly Cys

340 345 350

Leu Ala Val Gly Glu Gly Ala Asn Met Pro Ser Thr Ser Asp Ala Thr

355 360 365

Arg Val Phe Leu Glu Ala Gly Thr Leu Phe Ala Pro Gly Lys Ala Ala

370 375 380

Asn Ala Gly Gly Val Ala Val Ser Gly Leu Glu Gln Ser Gln Asn Ala

385 390 395 400

Gln Arg Leu Phe Trp Thr Arg Asp Glu Val Asp Leu Arg Leu Gln Gly

405 410 415

Ile Met Lys Thr Ile His Asp Lys Cys Val Glu Gln Gly Arg Val Ala

420 425 430

Asn Gly Gln Ile Asn Tyr Val Gln Gly Ala Asn Arg Ala Gly Phe Leu

435 440 445

Lys Val Ala Asp Ala Met Leu Ala Gln Gly Val Phe

450 455 460

<210> 143

<211> 1386

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_5 sequence from an unknown bacterial species from an environmental sample

<400> 143

atgtccaaca agattcgcgc gttgaacccg acgctgagta cacctatggc tttggtccag 60

gaagtcctcg gggtcgtcag caggcgcaac ccgagcgaac cggagtttct tcaggcggtc 120

accgaagtgc ttgaatcgat cgcgccggtc gtccagcgtc gcaaggatta ccgcgatgca 180

aagattctcg aacgcatcgt tgagccggag cgtatgatcc agttccgtgt tccgtggatc 240

gacgataagg gccaaatcca ggttaatcgc ggtttccgtg tgcagatgaa tagcgcgctc 300

ggcccttaca aaggcgggtt acggtttcat ccgaccgtaa atgccagcat cctcaagttc 360

cttgctttcg aacaggtatt taaaaactcg ctcaccactt tgccgatggg cggaggcaaa 420

ggcggcgccg atttcgaccc gaaaggaaaa tccgataccg aagtgatgca cttctgccaa 480

tcgttcatga ccgagttgtt tcgacacgtc ggtcccgata cggacgtgcc ggccggcgat 540

atcggagttg gcggccggga aatcggttat ctgtttggtc aatacaaacg tctggccaat 600

gagttcactg gcgttctcac cggcaaatcg ctcagttggg gcgggtcgct catccgcccg 660

caagcgaccg gttatggcgc cgtttacttt gccgaagaga tgctcaagac gcgaaagcaa 720

ggtctcgaag gcagggtttg taccgtctcc ggttcgggca acgccgcgca atacacagtt 780

tcaaaattga accaggtggg cgccaaagtc gtgaccatgt ccgattcagg cgggttcatt 840

tatgacaagg atggcatcac cgacgaaaag ctgagctgga tcatggattt gaagaacgtg 900

cggcgtcgcc gcatcaagga gtacgccgat cagtttcaag gaacaactta tacggaaggc 960

cagcggccct ggagcgtacc gtgtgaatgc gcgtttccgt gtgccacgca aaatgaaatc 1020

agtggtgaag atgcgaaagc gttgatcgac aacggctgct ttctggtttc ggaagccgca 1080

aatatgccga ccgcgccagc gggagtggat ctcttcctcg ctaataaggt cctttatggt 1140

cccggcaaag ccgccaatgc cggcggggtg gcggtttccg gcttggagat ggcgcaaaat 1200

tcaatgcgtc tgccgtggcc gcgcgctgaa gtggatcaac ggcttcgcca aatcatggcc 1260

acgatccaca gaaacgcgtg ggagaccgcg gccgagtacg atcaacccgg caatcttgtc 1320

atcggcgcga atatcgccgg tttcgttaaa gtcgccgacg ccatgctcga ccagggtgtg 1380

gtctaa 1386

<210> 144

<211> 461

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_5 sequence from an unknown bacterial species from an environmental sample

<400> 144

Met Ser Asn Lys Ile Arg Ala Leu Asn Pro Thr Leu Ser Thr Pro Met

1 5 10 15

Ala Leu Val Gln Glu Val Leu Gly Val Val Ser Arg Arg Asn Pro Ser

20 25 30

Glu Pro Glu Phe Leu Gln Ala Val Thr Glu Val Leu Glu Ser Ile Ala

35 40 45

Pro Val Val Gln Arg Arg Lys Asp Tyr Arg Asp Ala Lys Ile Leu Glu

50 55 60

Arg Ile Val Glu Pro Glu Arg Met Ile Gln Phe Arg Val Pro Trp Ile

65 70 75 80

Asp Asp Lys Gly Gln Ile Gln Val Asn Arg Gly Phe Arg Val Gln Met

85 90 95

Asn Ser Ala Leu Gly Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Thr

100 105 110

Val Asn Ala Ser Ile Leu Lys Phe Leu Ala Phe Glu Gln Val Phe Lys

115 120 125

Asn Ser Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly Ala Asp

130 135 140

Phe Asp Pro Lys Gly Lys Ser Asp Thr Glu Val Met His Phe Cys Gln

145 150 155 160

Ser Phe Met Thr Glu Leu Phe Arg His Val Gly Pro Asp Thr Asp Val

165 170 175

Pro Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Ile Gly Tyr Leu Phe

180 185 190

Gly Gln Tyr Lys Arg Leu Ala Asn Glu Phe Thr Gly Val Leu Thr Gly

195 200 205

Lys Ser Leu Ser Trp Gly Gly Ser Leu Ile Arg Pro Gln Ala Thr Gly

210 215 220

Tyr Gly Ala Val Tyr Phe Ala Glu Glu Met Leu Lys Thr Arg Lys Gln

225 230 235 240

Gly Leu Glu Gly Arg Val Cys Thr Val Ser Gly Ser Gly Asn Ala Ala

245 250 255

Gln Tyr Thr Val Ser Lys Leu Asn Gln Val Gly Ala Lys Val Val Thr

260 265 270

Met Ser Asp Ser Gly Gly Phe Ile Tyr Asp Lys Asp Gly Ile Thr Asp

275 280 285

Glu Lys Leu Ser Trp Ile Met Asp Leu Lys Asn Val Arg Arg Arg Arg

290 295 300

Ile Lys Glu Tyr Ala Asp Gln Phe Gln Gly Thr Thr Tyr Thr Glu Gly

305 310 315 320

Gln Arg Pro Trp Ser Val Pro Cys Glu Cys Ala Phe Pro Cys Ala Thr

325 330 335

Gln Asn Glu Ile Ser Gly Glu Asp Ala Lys Ala Leu Ile Asp Asn Gly

340 345 350

Cys Phe Leu Val Ser Glu Ala Ala Asn Met Pro Thr Ala Pro Ala Gly

355 360 365

Val Asp Leu Phe Leu Ala Asn Lys Val Leu Tyr Gly Pro Gly Lys Ala

370 375 380

Ala Asn Ala Gly Gly Val Ala Val Ser Gly Leu Glu Met Ala Gln Asn

385 390 395 400

Ser Met Arg Leu Pro Trp Pro Arg Ala Glu Val Asp Gln Arg Leu Arg

405 410 415

Gln Ile Met Ala Thr Ile His Arg Asn Ala Trp Glu Thr Ala Ala Glu

420 425 430

Tyr Asp Gln Pro Gly Asn Leu Val Ile Gly Ala Asn Ile Ala Gly Phe

435 440 445

Val Lys Val Ala Asp Ala Met Leu Asp Gln Gly Val Val

450 455 460

<210> 145

<211> 1392

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_6 sequence from an unknown bacterial species from an environmental sample

<400> 145

atgctcgtca gccggtccac gcgcgcccct ctctccacct acgtcaccga catcctcgcc 60

ctcgtcaaag ccaagaatcc cgcggaaccg gagttccacc aggcggtcga agaggttctc 120

gaaagcctcg acctggtcgt gcagcggcgg ccggatctcg cgaaagcaaa gattctcgag 180

cggatcgtcg agcccgaacg cgtgatcatg ttccgcgtcc cgtggcagga cgatcgcggc 240

gaggttcata tcaatcgcgg atatcgggtc cagatgaacg gcgcgcttgg tccctataag 300

ggcggtctgc gcttccatca ttcggtgacg ctcggggtgc tgaagttcct cgcgttcgag 360

caggtgttca agaactcact cacgacgctg tcgatgggcg gcggcaaggg tggttccgac 420

ttccatccgc acggccgttc tgacgatgaa gtgatgcgtt tctgtcagag ctttatgacg 480

gagctgatgc gtcacatcgg ccctgacacc gacgtgcccg cgggtgacat cggcgtcggc 540

ggccgcgaga tcgggtacct gttcggccag tatcgccgtc tgcgcaatga attcacgggc 600

gtgctcactg gcaaaggttt gaactggggt ggctcgctga tccgccccga ggcgaccggt 660

tacggcgctg tctacttcac cgccgagatg ctggccaccc gcaacgaaac gctggaaggg 720

aaggtctgtc tcgtctcggg cagcggcaac gtcgcccagt acacgatcga gaagctgctc 780

gatctcggag ccagagcggt aacggtctcc gactccgacg gctacatcta cgacgaagcc 840

ggcttcgacc gcgagaagct cgcgtatctg atggagctga agaacgtccg ccgcggccgc 900

gtgcgcgaat acgccgatcg gttcaagggc gccgtgtacc aggagataaa ggccgccaac 960

gacttcaacc cgctctggat gcaccgtgcg cactgcgcgt tcccaagcgc gacgcagaac 1020

gagatcaatg agaaggatgc cggacatctg gtcgcgagcg gcgtgctcgc cgtcgccgag 1080

ggcgccaaca tgccgtgcac cattgcagcg acaaaggtgt tgatcgacgg cggcgtactc 1140

tacgctcccg gcaaggccgc caatgcgggc ggcgtcgcga cgtcaggctt ggagatggcg 1200

cagaacagcg cgcggatcgc ctggtcgcgc gagcgcgtcg acaccgagct gcaccggatc 1260

atgaaagcga tccacgccgc gtgccgcgaa accgccgacg aatacggcgt ccccggcaac 1320

tacgtccacg gcgccaatat cgcggggttc accaaggtcg cggacgcgat gctcgaccag 1380

ggactgattt ga 1392

<210> 146

<211> 463

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_6 sequence from an unknown bacterial species from an environmental sample

<400> 146

Met Leu Val Ser Arg Ser Thr Arg Ala Pro Leu Ser Thr Tyr Val Thr

1 5 10 15

Asp Ile Leu Ala Leu Val Lys Ala Lys Asn Pro Ala Glu Pro Glu Phe

20 25 30

His Gln Ala Val Glu Glu Val Leu Glu Ser Leu Asp Leu Val Val Gln

35 40 45

Arg Arg Pro Asp Leu Ala Lys Ala Lys Ile Leu Glu Arg Ile Val Glu

50 55 60

Pro Glu Arg Val Ile Met Phe Arg Val Pro Trp Gln Asp Asp Arg Gly

65 70 75 80

Glu Val His Ile Asn Arg Gly Tyr Arg Val Gln Met Asn Gly Ala Leu

85 90 95

Gly Pro Tyr Lys Gly Gly Leu Arg Phe His His Ser Val Thr Leu Gly

100 105 110

Val Leu Lys Phe Leu Ala Phe Glu Gln Val Phe Lys Asn Ser Leu Thr

115 120 125

Thr Leu Ser Met Gly Gly Gly Lys Gly Gly Ser Asp Phe His Pro His

130 135 140

Gly Arg Ser Asp Asp Glu Val Met Arg Phe Cys Gln Ser Phe Met Thr

145 150 155 160

Glu Leu Met Arg His Ile Gly Pro Asp Thr Asp Val Pro Ala Gly Asp

165 170 175

Ile Gly Val Gly Gly Arg Glu Ile Gly Tyr Leu Phe Gly Gln Tyr Arg

180 185 190

Arg Leu Arg Asn Glu Phe Thr Gly Val Leu Thr Gly Lys Gly Leu Asn

195 200 205

Trp Gly Gly Ser Leu Ile Arg Pro Glu Ala Thr Gly Tyr Gly Ala Val

210 215 220

Tyr Phe Thr Ala Glu Met Leu Ala Thr Arg Asn Glu Thr Leu Glu Gly

225 230 235 240

Lys Val Cys Leu Val Ser Gly Ser Gly Asn Val Ala Gln Tyr Thr Ile

245 250 255

Glu Lys Leu Leu Asp Leu Gly Ala Arg Ala Val Thr Val Ser Asp Ser

260 265 270

Asp Gly Tyr Ile Tyr Asp Glu Ala Gly Phe Asp Arg Glu Lys Leu Ala

275 280 285

Tyr Leu Met Glu Leu Lys Asn Val Arg Arg Gly Arg Val Arg Glu Tyr

290 295 300

Ala Asp Arg Phe Lys Gly Ala Val Tyr Gln Glu Ile Lys Ala Ala Asn

305 310 315 320

Asp Phe Asn Pro Leu Trp Met His Arg Ala His Cys Ala Phe Pro Ser

325 330 335

Ala Thr Gln Asn Glu Ile Asn Glu Lys Asp Ala Gly His Leu Val Ala

340 345 350

Ser Gly Val Leu Ala Val Ala Glu Gly Ala Asn Met Pro Cys Thr Ile

355 360 365

Ala Ala Thr Lys Val Leu Ile Asp Gly Gly Val Leu Tyr Ala Pro Gly

370 375 380

Lys Ala Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ala

385 390 395 400

Gln Asn Ser Ala Arg Ile Ala Trp Ser Arg Glu Arg Val Asp Thr Glu

405 410 415

Leu His Arg Ile Met Lys Ala Ile His Ala Ala Cys Arg Glu Thr Ala

420 425 430

Asp Glu Tyr Gly Val Pro Gly Asn Tyr Val His Gly Ala Asn Ile Ala

435 440 445

Gly Phe Thr Lys Val Ala Asp Ala Met Leu Asp Gln Gly Leu Ile

450 455 460

<210> 147

<211> 1470

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_7 sequence from an unknown bacterial species from an environmental sample

<400> 147

atggagggcg agaacatcat cgacgccgag ccgcaaagcc gcctaacatt cgcgccgcgc 60

agggcacgga gtcagcacgg gggcgcagga ggcaagccgg agaatgctat gagtcaatac 120

gtgtctgatc tgatggcgga ggtaaaagcg aagaaccccg ccgagcctga gttccaccag 180

gcggtcttcg aggtcgcgga gtcgcttacc agtgtgcttg aagcgcatcc gcagttccgc 240

gaagcgaaga tcctcgagcg gatgatcgag cccgagcgcg tgatcatgtt ccgcgtgcca 300

tggcgcgacg ataacggcac gttgcacgtg aaccgcggct tccgcgtgca gatgaacagc 360

gcgatcggcc cgtacaaagg cggcctgcgc tttcatccca cggtcaacct cggcatcctg 420

aaattcctcg ccttcgagca ggtcttcaaa aatgccctaa cgacactgcc gatgggcggc 480

ggaaagggcg gtgcggactt cgatccgaag ggcaagagcg acatggaagt aatgcgcttt 540

tgccaggcgt tcatgtccga gctggcgcgg catatcgggc cggacaccga cgtgccggcg 600

ggcgacatcg gcgtcggcgc gcgggagatc ggtttcctct tcggcatgta caagaagctg 660

aagaacgagt tcaccggcgt gatgactggg aaaggcctca cgtggggtgg ctccgtcatc 720

cgcccggagg caacgggcta tggcgcggtc tacttcgcgg ccgaaatgct caagacgcgc 780

aaagaggaac tgcgcggcaa gacctgcctc gtctccggga gcgggaacgt tgcgcaatac 840

acggtggaga agttgatctc gttaggcgca aagccggtca cgctgtcgga ttcggctggc 900

tacatctacg acgagagcgg catcacgcgc gagaagctcg cgttcgttat ggagctcaag 960

aacgtgcgcc gcggccgcat ctcagaatac gcggagaagt tcactggcgc cgtctacacg 1020

ccgctcgacg gcacgtccga gcacaacccg ctctgggacc acaaggcgga gtgcgccttc 1080

cccagcgcga cccagaacga gatcagtgag cgcgacgcgg cgaacctgct gcgcaacggc 1140

gtctacgttg tctccgaagg cgcgaacatg ccgagcacga tcggcgcgat aaaccagttc 1200

ctggctgccc agattctctt cggccccggg aaagcagcca acgcgggcgg cgtcgcgacc 1260

tctgggctcg agatggcgca gaacagcatg cgcatttcgt ggacgcgcga agaggtggat 1320

aatcgcctct acaacatcat gaaaacgatc cacgaagtct gccaccgcac ggccgacaag 1380

tacggcacgc ccggcaacta cgtgaatggc gccaacatcg ccggcttcct caaggtggcg 1440

aacgcgatga tggaccaggg cctggtctga 1470

<210> 148

<211> 489

<212> PRT

<213> 未知

<220>

<223> 来自环境样品的未知细菌物种的基因ID gdh_7序列

<400> 148

Met Glu Gly Glu Asn Ile Ile Asp Ala Glu Pro Gln Ser Arg Leu Thr

1 5 10 15

Phe Ala Pro Arg Arg Ala Arg Ser Gln His Gly Gly Ala Gly Gly Lys

20 25 30

Pro Glu Asn Ala Met Ser Gln Tyr Val Ser Asp Leu Met Ala Glu Val

35 40 45

Lys Ala Lys Asn Pro Ala Glu Pro Glu Phe His Gln Ala Val Phe Glu

50 55 60

Val Ala Glu Ser Leu Thr Ser Val Leu Glu Ala His Pro Gln Phe Arg

65 70 75 80

Glu Ala Lys Ile Leu Glu Arg Met Ile Glu Pro Glu Arg Val Ile Met

85 90 95

Phe Arg Val Pro Trp Arg Asp Asp Asn Gly Thr Leu His Val Asn Arg

100 105 110

Gly Phe Arg Val Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly

115 120 125

Leu Arg Phe His Pro Thr Val Asn Leu Gly Ile Leu Lys Phe Leu Ala

130 135 140

Phe Glu Gln Val Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly

145 150 155 160

Gly Lys Gly Gly Ala Asp Phe Asp Pro Lys Gly Lys Ser Asp Met Glu

165 170 175

Val Met Arg Phe Cys Gln Ala Phe Met Ser Glu Leu Ala Arg His Ile

180 185 190

Gly Pro Asp Thr Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg

195 200 205

Glu Ile Gly Phe Leu Phe Gly Met Tyr Lys Lys Leu Lys Asn Glu Phe

210 215 220

Thr Gly Val Met Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Val Ile

225 230 235 240

Arg Pro Glu Ala Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met

245 250 255

Leu Lys Thr Arg Lys Glu Glu Leu Arg Gly Lys Thr Cys Leu Val Ser

260 265 270

Gly Ser Gly Asn Val Ala Gln Tyr Thr Val Glu Lys Leu Ile Ser Leu

275 280 285

Gly Ala Lys Pro Val Thr Leu Ser Asp Ser Ala Gly Tyr Ile Tyr Asp

290 295 300

Glu Ser Gly Ile Thr Arg Glu Lys Leu Ala Phe Val Met Glu Leu Lys

305 310 315 320

Asn Val Arg Arg Gly Arg Ile Ser Glu Tyr Ala Glu Lys Phe Thr Gly

325 330 335

Ala Val Tyr Thr Pro Leu Asp Gly Thr Ser Glu His Asn Pro Leu Trp

340 345 350

Asp His Lys Ala Glu Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile

355 360 365

Ser Glu Arg Asp Ala Ala Asn Leu Leu Arg Asn Gly Val Tyr Val Val

370 375 380

Ser Glu Gly Ala Asn Met Pro Ser Thr Ile Gly Ala Ile Asn Gln Phe

385 390 395 400

Leu Ala Ala Gln Ile Leu Phe Gly Pro Gly Lys Ala Ala Asn Ala Gly

405 410 415

Gly Val Ala Thr Ser Gly Leu Glu Met Ala Gln Asn Ser Met Arg Ile

420 425 430

Ser Trp Thr Arg Glu Glu Val Asp Asn Arg Leu Tyr Asn Ile Met Lys

435 440 445

Thr Ile His Glu Val Cys His Arg Thr Ala Asp Lys Tyr Gly Thr Pro

450 455 460

Gly Asn Tyr Val Asn Gly Ala Asn Ile Ala Gly Phe Leu Lys Val Ala

465 470 475 480

Asn Ala Met Met Asp Gln Gly Leu Val

485

<210> 149

<211> 1347

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_8 sequence from an unknown bacterial species from an environmental sample

<400> 149

atgaacgtca ggcaatacat cgggagcttc atggagcagc tggtggcccg gaaccctgcc 60

cagcccgaat tccaccaggc tgtgaaagag gtggtcgagt ctctggagcc gtgcctcggg 120

cgtcacccgg aatacgtcga gcaccgcatc ctcgagcgca tgagcgagcc tgaccgcgtc 180

atcatgttca gggtcgcttg gcaagacgat cggggccagg cccaggtgaa ccgggcgttc 240

cgggtcgagt tcaacaacgc catcggcccc tacaaggggg gcctgcggtt ccacccgacc 300

gtgaacctcg gcatcctcaa gtttctcgga ttcgagcaga tcctgaagaa cagcctcact 360

acgctgccca tgggtggcgg caagggcggc agcgatttcg atcccaaggg gaagtccgac 420

ggcgaggtga tgcgcttctg ccagagcttc atgaacgagc tgcaccacta catcggccag 480

aacatcgacg tcccggcggg cgatatcggc gtcggggggc gcgagatcgg gttcctgttc 540

ggtcagttca agcggctcac ccactcgttc gagggcgtgc tcacgggcaa gggcctgggc 600

tggggcggct cgctcgtccg tccggaagcc accggctacg gctgcgtgta tttcgcggag 660

gagcagctca aggctcgcgg agagagcttc gccggcaaga cggtggccgt ctcgggctcc 720

gggaacgtgg cccaatacgc catcgagaag gtgaacgagc tcggcggcaa agtcgtgacg 780

ctctccgatt ccgacgggac catccacgat cccgacggca tccgcgacga aaagtgggcg 840

ttcctcatgg atctcaagaa cgtgcgccgc gggcggatcc gtgagtacgc gcagcgcttc 900

aaggccaatt acaaggaggg ggtgcggccg tggggcatca agtgcgacat cgccctgccg 960

tgcgcgaccc agaacgagat cagcggcgat gaagcgcgca cgctggtgaa aaatggctgc 1020

gtttgcgtcg cagagggcgc gaacatgccg accaccctcg agggcgtgga ggtgttcctc 1080

gccgccaaga tcctctacgg tccgggcaag gccgccaacg ccggcggtgt cgcgacgtcc 1140

gggctcgaga tgtcgcagaa cagcctccgg ctgtcgtgga gccgggaaga ggtcgaccag 1200

cggctgcgcg ggatcatgaa ggagatccac aagtcgtgcg tcgacaccgc ccgggagtac 1260

gaccagccgg gcaactacgt gctgggcgcc aacatagcgg gcttcacgaa ggtggcgaac 1320

gccatgatgg accaggggct ggtctag 1347

<210> 150

<211> 448

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_8 sequence from an unknown bacterial species from an environmental sample

<400> 150

Met Asn Val Arg Gln Tyr Ile Gly Ser Phe Met Glu Gln Leu Val Ala

1 5 10 15

Arg Asn Pro Ala Gln Pro Glu Phe His Gln Ala Val Lys Glu Val Val

20 25 30

Glu Ser Leu Glu Pro Cys Leu Gly Arg His Pro Glu Tyr Val Glu His

35 40 45

Arg Ile Leu Glu Arg Met Ser Glu Pro Asp Arg Val Ile Met Phe Arg

50 55 60

Val Ala Trp Gln Asp Asp Arg Gly Gln Ala Gln Val Asn Arg Ala Phe

65 70 75 80

Arg Val Glu Phe Asn Asn Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg

85 90 95

Phe His Pro Thr Val Asn Leu Gly Ile Leu Lys Phe Leu Gly Phe Glu

100 105 110

Gln Ile Leu Lys Asn Ser Leu Thr Thr Leu Pro Met Gly Gly Gly Lys

115 120 125

Gly Gly Ser Asp Phe Asp Pro Lys Gly Lys Ser Asp Gly Glu Val Met

130 135 140

Arg Phe Cys Gln Ser Phe Met Asn Glu Leu His His Tyr Ile Gly Gln

145 150 155 160

Asn Ile Asp Val Pro Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Ile

165 170 175

Gly Phe Leu Phe Gly Gln Phe Lys Arg Leu Thr His Ser Phe Glu Gly

180 185 190

Val Leu Thr Gly Lys Gly Leu Gly Trp Gly Gly Ser Leu Val Arg Pro

195 200 205

Glu Ala Thr Gly Tyr Gly Cys Val Tyr Phe Ala Glu Glu Gln Leu Lys

210 215 220

Ala Arg Gly Glu Ser Phe Ala Gly Lys Thr Val Ala Val Ser Gly Ser

225 230 235 240

Gly Asn Val Ala Gln Tyr Ala Ile Glu Lys Val Asn Glu Leu Gly Gly

245 250 255

Lys Val Val Thr Leu Ser Asp Ser Asp Gly Thr Ile His Asp Pro Asp

260 265 270

Gly Ile Arg Asp Glu Lys Trp Ala Phe Leu Met Asp Leu Lys Asn Val

275 280 285

Arg Arg Gly Arg Ile Arg Glu Tyr Ala Gln Arg Phe Lys Ala Asn Tyr

290 295 300

Lys Glu Gly Val Arg Pro Trp Gly Ile Lys Cys Asp Ile Ala Leu Pro

305 310 315 320

Cys Ala Thr Gln Asn Glu Ile Ser Gly Asp Glu Ala Arg Thr Leu Val

325 330 335

Lys Asn Gly Cys Val Cys Val Ala Glu Gly Ala Asn Met Pro Thr Thr

340 345 350

Leu Glu Gly Val Glu Val Phe Leu Ala Ala Lys Ile Leu Tyr Gly Pro

355 360 365

Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met

370 375 380

Ser Gln Asn Ser Leu Arg Leu Ser Trp Ser Arg Glu Glu Val Asp Gln

385 390 395 400

Arg Leu Arg Gly Ile Met Lys Glu Ile His Lys Ser Cys Val Asp Thr

405 410 415

Ala Arg Glu Tyr Asp Gln Pro Gly Asn Tyr Val Leu Gly Ala Asn Ile

420 425 430

Ala Gly Phe Thr Lys Val Ala Asn Ala Met Met Asp Gln Gly Leu Val

435 440 445

<210> 151

<211> 1374

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_9 sequence from an unknown bacterial species from an environmental sample

<400> 151

atggcaaccg ccaagacatc gtccgccgtg caaaagcagg tggacgcgtt catgcagcat 60

gtgaaggtcc gcaacggcaa cgagcctgaa ttcctccagg ccgtgcacga agtggccgag 120

accgtgatcc ctttcatgga ggccaacccc aagtacaagg gcaagatgct cctggagcgc 180

atggtggagc ctgagcgcac catcctcttc cgcgtgccct gggtagacga tcgcggcaac 240

atccaagtga accgcggcta ccgcgtggag ttcaacagcg ctatcggtcc ttacaagggc 300

ggcctgcgct tccaccccac ggtgaccctc agcgtgttga agttcctggg cttcgagcaa 360

gtgttcaaga acagcctcac caccctgccc atgggcggcg gcaagggcgg tagcgatttc 420

gacccgaaag gcaagagcga taatgaagtg atgcgcttct gccagagctt catgaccgag 480

ctgtggcgcc acatcggtgc cgacacggac gtgcccgccg gcgacatcgg cgtgggcggc 540

cgcgagatcg gtttcatgtt cggccaggac aagcgcctgc gcaacgagtt cacgggcgtg 600

ttcacgggca agggccgcac gtggggcggt tcgctgatcc gtccggaggc caccggctac 660

ggctgcgtgt acttcgcgga ggagatgatg aagcgcaaca aggagagctt caagggcaag 720

acggtggcgg tgagcggcag cggcaacgtg gcccagtacg ccatcgagaa ggccacgcag 780

ctcggtgcga aagtggtgac ctgttccgac agcgacggca gcatcttcga tcccgcgggc 840

atcagcggcg acaagctcgc gttcatcatg gaactgaaga acgtgaagcg tggccgcatc 900

gaggaatacg cgaagaagtt caagggcagc acctacaaga agggcgcccg tgtgtgggac 960

gtggtatcca agtgcgacat cgccctgccc tgcgccacgc agaacgagct ggacggaaag 1020

aacgcgaagg acctgatcaa gaaaggcgtg cagtacgtgg ccgaaggcgc caacatgccc 1080

accaccccgg agggcatcga agccttccac gcggcgaagg tgtacttcgc gccgggcaag 1140

gccagcaacg ccggtggtgt ggccaccagc ggcctggaga tgagccagaa cagccagcgc 1200

ctcagctgga cgcgtgacga ggtggaccac cagctccaca agatcatgaa gaacatccat 1260

gccgcctgtg tgcagtatgg caccgaaggc aagcacgtga actacgtgaa gggcgccaac 1320

atcgccggct tcgtgaaagt ggccgacgcg atgctggacc aaggcgtggt gtag 1374

<210> 152

<211> 457

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_9 sequence from an unknown bacterial species from an environmental sample

<400> 152

Met Ala Thr Ala Lys Thr Ser Ser Ala Val Gln Lys Gln Val Asp Ala

1 5 10 15

Phe Met Gln His Val Lys Val Arg Asn Gly Asn Glu Pro Glu Phe Leu

20 25 30

Gln Ala Val His Glu Val Ala Glu Thr Val Ile Pro Phe Met Glu Ala

35 40 45

Asn Pro Lys Tyr Lys Gly Lys Met Leu Leu Glu Arg Met Val Glu Pro

50 55 60

Glu Arg Thr Ile Leu Phe Arg Val Pro Trp Val Asp Asp Arg Gly Asn

65 70 75 80

Ile Gln Val Asn Arg Gly Tyr Arg Val Glu Phe Asn Ser Ala Ile Gly

85 90 95

Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Thr Val Thr Leu Ser Val

100 105 110

Leu Lys Phe Leu Gly Phe Glu Gln Val Phe Lys Asn Ser Leu Thr Thr

115 120 125

Leu Pro Met Gly Gly Gly Lys Gly Gly Ser Asp Phe Asp Pro Lys Gly

130 135 140

Lys Ser Asp Asn Glu Val Met Arg Phe Cys Gln Ser Phe Met Thr Glu

145 150 155 160

Leu Trp Arg His Ile Gly Ala Asp Thr Asp Val Pro Ala Gly Asp Ile

165 170 175

Gly Val Gly Gly Arg Glu Ile Gly Phe Met Phe Gly Gln Asp Lys Arg

180 185 190

Leu Arg Asn Glu Phe Thr Gly Val Phe Thr Gly Lys Gly Arg Thr Trp

195 200 205

Gly Gly Ser Leu Ile Arg Pro Glu Ala Thr Gly Tyr Gly Cys Val Tyr

210 215 220

Phe Ala Glu Glu Met Met Lys Arg Asn Lys Glu Ser Phe Lys Gly Lys

225 230 235 240

Thr Val Ala Val Ser Gly Ser Gly Asn Val Ala Gln Tyr Ala Ile Glu

245 250 255

Lys Ala Thr Gln Leu Gly Ala Lys Val Val Thr Cys Ser Asp Ser Asp

260 265 270

Gly Ser Ile Phe Asp Pro Ala Gly Ile Ser Gly Asp Lys Leu Ala Phe

275 280 285

Ile Met Glu Leu Lys Asn Val Lys Arg Gly Arg Ile Glu Glu Tyr Ala

290 295 300

Lys Lys Phe Lys Gly Ser Thr Tyr Lys Lys Gly Ala Arg Val Trp Asp

305 310 315 320

Val Val Ser Lys Cys Asp Ile Ala Leu Pro Cys Ala Thr Gln Asn Glu

325 330 335

Leu Asp Gly Lys Asn Ala Lys Asp Leu Ile Lys Lys Gly Val Gln Tyr

340 345 350

Val Ala Glu Gly Ala Asn Met Pro Thr Thr Pro Glu Gly Ile Glu Ala

355 360 365

Phe His Ala Ala Lys Val Tyr Phe Ala Pro Gly Lys Ala Ser Asn Ala

370 375 380

Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ser Gln Asn Ser Gln Arg

385 390 395 400

Leu Ser Trp Thr Arg Asp Glu Val Asp His Gln Leu His Lys Ile Met

405 410 415

Lys Asn Ile His Ala Ala Cys Val Gln Tyr Gly Thr Glu Gly Lys His

420 425 430

Val Asn Tyr Val Lys Gly Ala Asn Ile Ala Gly Phe Val Lys Val Ala

435 440 445

Asp Ala Met Leu Asp Gln Gly Val Val

450 455

<210> 153

<211> 1329

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_10 sequence from an unknown bacterial species from an environmental sample

<400> 153

tctctggagt cattcctcaa ccatgtccaa aagcgcgacc cgaatcaaac cgagttcgcg 60

caagccgttc gtgaagtaat gaccacactc tggccttttc ttgaacaaaa tccaaaatat 120

cgccagatgt cattactgga gcgtctggtt gaaccggagc gcgtgatcca gtttcgcgtg 180

gtatgggttg atgatcgcaa ccagatacag gtcaaccgtg catggcgtgt gcagttcagc 240

tctgccatcg gcccgtacaa aggcggtatg cgcttccatc cgtcagttaa cctttccatt 300

ctcaaattcc tcggctttga acaaaccttc aaaaatgccc tgactactct gccgatgggc 360

ggtggtaaag gcggcagcga tttcgatcag aaaggaaaaa gcgaaggtga agtgatgcgt 420

ttttgccagg cgctgatgac tgaactgtat cgccacctgg gcgcggatac cgacgttccg 480

gcaggtgata tcggggttgg tggtcgtgaa gtcggcttta tggcggggat gatgaaaaag 540

ctctccaaca ataccgcctg cgtcttcacc ggtaagggcc tttcatttgg cggcagtctt 600

attcgcccgg aagctaccgg ctacggtctg gtttatttca cagaagcaat gctaaaacgc 660

cacggtatgg gttttgaagg gatgcgcgtt tccgtttctg gctccggcaa cgtcgcccag 720

tacgctatcg aaaaagcgat ggaatttggt gctcgtgtga tcactgcgtc agactccagc 780

ggcactgtag ttgatgaaag cggattcacg aaagagaaac tggcacgtct tatcgaaatc 840

aaagccagcc gcgatggtcg agtggcagat tacgccaaag aatttggtct ggtctatctc 900

gaaggccaac agccgtggtc tctaccggtt gatatcgccc tgccttgcgc cacccagaat 960

gaactggatg ttgacgccgc gcatcagctt atcgctaatg gcgttaaagc cgtcgccgaa 1020

ggggcaaata tgccgaccac catcgaagcg actgaactgt tccagcaggc aggcgtacta 1080

tttgcaccgg gtaaagcggc taatgctggt ggcgtcgcta catcgggcct ggaaatggca 1140

caaaacgctg cgcgcctggg ctggaaagcc gagaaagttg acgcacgttt gcatcacatc 1200

atgctggata tccaccatgc ctgtgttgag catggtggtg aaggtgagca aaccaactac 1260

gtgcagggcg cgaacattgc cggttttgtg aaggttgccg atgcgatgct ggcgcagggt 1320

gtgatttaa 1329

<210> 154

<211> 442

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_10 sequence from an unknown bacterial species from an environmental sample

<400> 154

Ser Leu Glu Ser Phe Leu Asn His Val Gln Lys Arg Asp Pro Asn Gln

1 5 10 15

Thr Glu Phe Ala Gln Ala Val Arg Glu Val Met Thr Thr Leu Trp Pro

20 25 30

Phe Leu Glu Gln Asn Pro Lys Tyr Arg Gln Met Ser Leu Leu Glu Arg

35 40 45

Leu Val Glu Pro Glu Arg Val Ile Gln Phe Arg Val Val Trp Val Asp

50 55 60

Asp Arg Asn Gln Ile Gln Val Asn Arg Ala Trp Arg Val Gln Phe Ser

65 70 75 80

Ser Ala Ile Gly Pro Tyr Lys Gly Gly Met Arg Phe His Pro Ser Val

85 90 95

Asn Leu Ser Ile Leu Lys Phe Leu Gly Phe Glu Gln Thr Phe Lys Asn

100 105 110

Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly Ser Asp Phe

115 120 125

Asp Gln Lys Gly Lys Ser Glu Gly Glu Val Met Arg Phe Cys Gln Ala

130 135 140

Leu Met Thr Glu Leu Tyr Arg His Leu Gly Ala Asp Thr Asp Val Pro

145 150 155 160

Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Val Gly Phe Met Ala Gly

165 170 175

Met Met Lys Lys Leu Ser Asn Asn Thr Ala Cys Val Phe Thr Gly Lys

180 185 190

Gly Leu Ser Phe Gly Gly Ser Leu Ile Arg Pro Glu Ala Thr Gly Tyr

195 200 205

Gly Leu Val Tyr Phe Thr Glu Ala Met Leu Lys Arg His Gly Met Gly

210 215 220

Phe Glu Gly Met Arg Val Ser Val Ser Gly Ser Gly Asn Val Ala Gln

225 230 235 240

Tyr Ala Ile Glu Lys Ala Met Glu Phe Gly Ala Arg Val Ile Thr Ala

245 250 255

Ser Asp Ser Ser Gly Thr Val Val Asp Glu Ser Gly Phe Thr Lys Glu

260 265 270

Lys Leu Ala Arg Leu Ile Glu Ile Lys Ala Ser Arg Asp Gly Arg Val

275 280 285

Ala Asp Tyr Ala Lys Glu Phe Gly Leu Val Tyr Leu Glu Gly Gln Gln

290 295 300

Pro Trp Ser Leu Pro Val Asp Ile Ala Leu Pro Cys Ala Thr Gln Asn

305 310 315 320

Glu Leu Asp Val Asp Ala Ala His Gln Leu Ile Ala Asn Gly Val Lys

325 330 335

Ala Val Ala Glu Gly Ala Asn Met Pro Thr Thr Ile Glu Ala Thr Glu

340 345 350

Leu Phe Gln Gln Ala Gly Val Leu Phe Ala Pro Gly Lys Ala Ala Asn

355 360 365

Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ala Gln Asn Ala Ala

370 375 380

Arg Leu Gly Trp Lys Ala Glu Lys Val Asp Ala Arg Leu His His Ile

385 390 395 400

Met Leu Asp Ile His His Ala Cys Val Glu His Gly Gly Glu Gly Glu

405 410 415

Gln Thr Asn Tyr Val Gln Gly Ala Asn Ile Ala Gly Phe Val Lys Val

420 425 430

Ala Asp Ala Met Leu Ala Gln Gly Val Ile

435 440

<210> 155

<211> 1341

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_11 sequence from an unknown bacterial species from an environmental sample

<400> 155

atgatcgaat ccgtcgacaa tttccttgca cgcctgcaac agcgtgaccc tggccaaccc 60

gagtttcacc aggccgtcga agaagtgctg cgcaccttgt ggccctttct ggaagccaac 120

cctcactacc tgcaagcggg cattctcgag cgcatggtcg agcctgagcg tgcagtgttg 180

ttccgggtgt cgtgggtgga cgatcacggc aaggttcagg tcaaccgcgg ttaccgtatc 240

cagatgaaca gcgccattgg cccctacaag ggcggcctgc gcttccaccc ttcggtgaac 300

ctcagtgttc tgaaattcct cgcattcgag caagtcttca agaactccct gacctcgctg 360

cccatgggcg gtggcaaggg tgggtctgac ttcgatccca agggcaagag cgacgccgaa 420

gtgatgcgct tctgccaggc cttcatgagc gagctgtacc gtcacatcgg tgccgactgc 480

gatgttccgg ccggggacat cggagtaggg gcgcgcgaga tcggctatat gttcgggcaa 540

tacaagcgtc tggccaacca gttcacctcc gtgctgaccg gcaagggcat gacctatggc 600

ggcagcctga ttcgtccgga agccacgggc tatggctgtg tgtattttgc cgaggagatg 660

ctcaagcgcc agggccagcg catcgacggc cgtcgcgtgg cgatctcggg ctcgggcaac 720

gtcgcgcaat acgccgcgcg caaggtgatg gacctggggg gcaaggtgat ctcgctgtct 780

gattccgaag gtaccttgta cgccgaagcg ggcctcaccg acgcgcagtg ggaagcggtg 840

atgaccctca agaacgtcaa gcgcggccgc atcagcgagc tggccgggca attcggcctg 900

gagttccgca agggccagac gccgtggagc ctggcctgcg acatcgcgtt gccatgcgcg 960

acgcagaacg aactggacgt cgaggatgcc aaggcactgt tggccaacgg ctgtatctgc 1020

gtcgcagaag gcgccaacat gcccacgacc ctggcggctg tggacatctt ccttgaagct 1080

ggcatcctct atgcgccggg caaggcgtcc aatgcgggtg gcgttgcggt gtcgggcctg 1140

gaaatgtcgc agaacgccat gcgcttgctg tggactgctg gcgaagtgga cagcaagctg 1200

catggcatca tgcagtcgat tcaccacgcc tgcgttcact atggtgaaga gggcgatggc 1260

cgggtcaact atgtcaaagg ggccaacatt gcgggcttcg tgaaggtggc cgatgcgatg 1320

ctggcccaag gggtcgtctg a 1341

<210> 156

<211> 446

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_11 sequence from an unknown bacterial species in an environmental sample

<400> 156

Met Ile Glu Ser Val Asp Asn Phe Leu Ala Arg Leu Gln Gln Arg Asp

1 5 10 15

Pro Gly Gln Pro Glu Phe His Gln Ala Val Glu Glu Val Leu Arg Thr

20 25 30

Leu Trp Pro Phe Leu Glu Ala Asn Pro His Tyr Leu Gln Ala Gly Ile

35 40 45

Leu Glu Arg Met Val Glu Pro Glu Arg Ala Val Leu Phe Arg Val Ser

50 55 60

Trp Val Asp Asp His Gly Lys Val Gln Val Asn Arg Gly Tyr Arg Ile

65 70 75 80

Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His

85 90 95

Pro Ser Val Asn Leu Ser Val Leu Lys Phe Leu Ala Phe Glu Gln Val

100 105 110

Phe Lys Asn Ser Leu Thr Ser Leu Pro Met Gly Gly Gly Lys Gly Gly

115 120 125

Ser Asp Phe Asp Pro Lys Gly Lys Ser Asp Ala Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Ser Glu Leu Tyr Arg His Ile Gly Ala Asp Cys

145 150 155 160

Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Tyr

165 170 175

Met Phe Gly Gln Tyr Lys Arg Leu Ala Asn Gln Phe Thr Ser Val Leu

180 185 190

Thr Gly Lys Gly Met Thr Tyr Gly Gly Ser Leu Ile Arg Pro Glu Ala

195 200 205

Thr Gly Tyr Gly Cys Val Tyr Phe Ala Glu Glu Met Leu Lys Arg Gln

210 215 220

Gly Gln Arg Ile Asp Gly Arg Arg Val Ala Ile Ser Gly Ser Gly Asn

225 230 235 240

Val Ala Gln Tyr Ala Ala Arg Lys Val Met Asp Leu Gly Gly Lys Val

245 250 255

Ile Ser Leu Ser Asp Ser Glu Gly Thr Leu Tyr Ala Glu Ala Gly Leu

260 265 270

Thr Asp Ala Gln Trp Glu Ala Val Met Thr Leu Lys Asn Val Lys Arg

275 280 285

Gly Arg Ile Ser Glu Leu Ala Gly Gln Phe Gly Leu Glu Phe Arg Lys

290 295 300

Gly Gln Thr Pro Trp Ser Leu Ala Cys Asp Ile Ala Leu Pro Cys Ala

305 310 315 320

Thr Gln Asn Glu Leu Asp Val Glu Asp Ala Lys Ala Leu Leu Ala Asn

325 330 335

Gly Cys Ile Cys Val Ala Glu Gly Ala Asn Met Pro Thr Thr Leu Ala

340 345 350

Ala Val Asp Ile Phe Leu Glu Ala Gly Ile Leu Tyr Ala Pro Gly Lys

355 360 365

Ala Ser Asn Ala Gly Gly Val Ala Val Ser Gly Leu Glu Met Ser Gln

370 375 380

Asn Ala Met Arg Leu Leu Trp Thr Ala Gly Glu Val Asp Ser Lys Leu

385 390 395 400

His Gly Ile Met Gln Ser Ile His His Ala Cys Val His Tyr Gly Glu

405 410 415

Glu Gly Asp Gly Arg Val Asn Tyr Val Lys Gly Ala Asn Ile Ala Gly

420 425 430

Phe Val Lys Val Ala Asp Ala Met Leu Ala Gln Gly Val Val

435 440 445

<210> 157

<211> 1374

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_12 sequence from an unknown bacterial species in an environmental sample

<400> 157

atggcaaccg ccaagacatc gtccgccgtg caaaagcagg tggacgcgtt catgcagcat 60

gtgaaggtcc gcaacggcaa cgagcctgaa ttcctccagg ccgtgcacga agtggccgag 120

accgtgatcc ctttcatgga ggccaacccc aagtacaagg gcaagatgct cctggagcgc 180

atggtggagc ctgagcgcac catcctcttc cgcgtgccct gggtagacga tcgcggcaac 240

atccaagtga accgcggcta ccgcgtggag ttcaacagcg ctatcggtcc ttacaagggc 300

ggcctgcgct tccaccccac ggtgaccctc agcgtgttga agttcctggg cttcgagcaa 360

gtgttcaaga acagcctcac caccctgccc atgggcggcg gcaagggcgg tagcgatttc 420

gacccgaaag gcaagagcga taatgaagtg atgcgcttct gccagagctt catgaccgag 480

ctgtggcgcc acatcggtgc cgacacggac gtgcccgccg gcgacatcgg cgtgggcggc 540

cgcgagatcg gtttcatgtt cggccaggac aagcgcctgc gcaacgagtt cacgggcgtg 600

ttcacgggca agggccgcac gtggggcggt tcgctgatcc gtccggaggc caccggctac 660

ggctgcgtgt acttcgcgga ggagatgatg aagcgcaaca aggagagctt caagggcaag 720

acggtggcgg tgagcggcag cggcaacgtg gcccagtacg ccatcgagaa ggccacgcag 780

ctcggtgcga aagtggtgac ctgttccgac agcgacggca gcatcttcga tcccgcgggc 840

atcagcggcg acaagctcgc gttcatcatg gaactgaaga acgtgaagcg tggccgcatc 900

gaggaatacg cgaagaagtt caagggcagc acctacaaga agggcgcccg tgtgtgggac 960

gtggtatcca agtgcgacat cgccctgccc tgcgccacgc agaacgagct ggacggaaag 1020

aacgcgaagg acctgatcaa gaaaggcgtg cagtacgtgg ccgaaggcgc caacatgccc 1080

accaccccgg agggcatcga agccttccac gcggcgaagg tgtacttcgc gccgggcaag 1140

gccagcaacg ccggtggtgt ggccaccagc ggcctggaga tgagccagaa cagccagcgc 1200

ctcagctgga cgcgtgacga ggtggaccac cagctccaca agatcatgaa gaacatccat 1260

gccgcctgtg tgcagtatgg caccgaaggc aagcacgtga actacgtgaa gggcgccaac 1320

atcgccggct tcgtgaaagt ggccgacgcg atgctggacc aaggcgtggt gtag 1374

<210> 158

<211> 457

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_12 sequence from an unknown bacterial species in an environmental sample

<400> 158

Met Ala Thr Ala Lys Thr Ser Ser Ala Val Gln Lys Gln Val Asp Ala

1 5 10 15

Phe Met Gln His Val Lys Val Arg Asn Gly Asn Glu Pro Glu Phe Leu

20 25 30

Gln Ala Val His Glu Val Ala Glu Thr Val Ile Pro Phe Met Glu Ala

35 40 45

Asn Pro Lys Tyr Lys Gly Lys Met Leu Leu Glu Arg Met Val Glu Pro

50 55 60

Glu Arg Thr Ile Leu Phe Arg Val Pro Trp Val Asp Asp Arg Gly Asn

65 70 75 80

Ile Gln Val Asn Arg Gly Tyr Arg Val Glu Phe Asn Ser Ala Ile Gly

85 90 95

Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Thr Val Thr Leu Ser Val

100 105 110

Leu Lys Phe Leu Gly Phe Glu Gln Val Phe Lys Asn Ser Leu Thr Thr

115 120 125

Leu Pro Met Gly Gly Gly Lys Gly Gly Ser Asp Phe Asp Pro Lys Gly

130 135 140

Lys Ser Asp Asn Glu Val Met Arg Phe Cys Gln Ser Phe Met Thr Glu

145 150 155 160

Leu Trp Arg His Ile Gly Ala Asp Thr Asp Val Pro Ala Gly Asp Ile

165 170 175

Gly Val Gly Gly Arg Glu Ile Gly Phe Met Phe Gly Gln Asp Lys Arg

180 185 190

Leu Arg Asn Glu Phe Thr Gly Val Phe Thr Gly Lys Gly Arg Thr Trp

195 200 205

Gly Gly Ser Leu Ile Arg Pro Glu Ala Thr Gly Tyr Gly Cys Val Tyr

210 215 220

Phe Ala Glu Glu Met Met Lys Arg Asn Lys Glu Ser Phe Lys Gly Lys

225 230 235 240

Thr Val Ala Val Ser Gly Ser Gly Asn Val Ala Gln Tyr Ala Ile Glu

245 250 255

Lys Ala Thr Gln Leu Gly Ala Lys Val Val Thr Cys Ser Asp Ser Asp

260 265 270

Gly Ser Ile Phe Asp Pro Ala Gly Ile Ser Gly Asp Lys Leu Ala Phe

275 280 285

Ile Met Glu Leu Lys Asn Val Lys Arg Gly Arg Ile Glu Glu Tyr Ala

290 295 300

Lys Lys Phe Lys Gly Ser Thr Tyr Lys Lys Gly Ala Arg Val Trp Asp

305 310 315 320

Val Val Ser Lys Cys Asp Ile Ala Leu Pro Cys Ala Thr Gln Asn Glu

325 330 335

Leu Asp Gly Lys Asn Ala Lys Asp Leu Ile Lys Lys Gly Val Gln Tyr

340 345 350

Val Ala Glu Gly Ala Asn Met Pro Thr Thr Pro Glu Gly Ile Glu Ala

355 360 365

Phe His Ala Ala Lys Val Tyr Phe Ala Pro Gly Lys Ala Ser Asn Ala

370 375 380

Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ser Gln Asn Ser Gln Arg

385 390 395 400

Leu Ser Trp Thr Arg Asp Glu Val Asp His Gln Leu His Lys Ile Met

405 410 415

Lys Asn Ile His Ala Ala Cys Val Gln Tyr Gly Thr Glu Gly Lys His

420 425 430

Val Asn Tyr Val Lys Gly Ala Asn Ile Ala Gly Phe Val Lys Val Ala

435 440 445

Asp Ala Met Leu Asp Gln Gly Val Val

450 455

<210> 159

<211> 1383

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_13 sequence from an unknown bacterial species in an environmental sample

<400> 159

atggcgacgc tcctcacgac ctcggaagct ctgacgcgtg acacagggaa agccgtcgat 60

cgattcatgg atggcctcgt cgcgcgcaac ccgaatcagc cggagttcca tcaggccgtt 120

cgcgaggtgt gcgagagcgt catgccactc gtcttggagc ggccggagta cgaggaggcc 180

ggaatcctcg agcgcctcac ggaaccggat cgcattctca cgttccgtgt ggcctggcag 240

gatgacgagg gtcgcgtccg tatcaatcgc gcgtatcgcg tgcagttcaa caacgccatt 300

ggtccttaca aaggcggcct ccgctttcat cccaccgtcg atctctcggt cctcaagttc 360

ctcggcttcg agcagatctt caagaacagt ctgaccggtc ttcccatggg tggcgcgaag 420

ggtggttctg atttcgatcc caaagggaag tcggataacg aggtcatgcg gttctgtcag 480

gcgatgatgt cggagctgtg tcacgacatc ggcgaagacg tcgatgtgcc ggccggcgac 540

attggtgtgg gcgctcgaga gattgggtat ctgtttggcg aatatcgtcg cctcatgcgt 600

cgcgtcgctg gtgtgctcac ggggaagggc ttgtcgtttg gcggcagctt gattcggacc 660

gaagccactg gctatggctg cgtctatttc gtcgagaaca tgttgaatca cattggtgat 720

tccctcgacg gcaagacctg tgtcgtttcc ggatcaggca acgtcgcgct ctacacggtc 780

gagaaggtga cggctctcgg cgggaaggtc gtcacgctct cggactcgga cggcttcatc 840

tacgatcgcg acggcatcga tgcggagaag ctggagtggg tcaaggagct caaagaggtt 900

cgtcgcggtc gcatcagcga gtacgcggag cacttcggcg cccagttcca cgccgatgag 960

cggccgtggc atgtcgagtg tcaggcggcg tttccttctg ccacgcagaa cgagttggac 1020

aaagaggatg cggaagtgct ggtcgcgaac ggctgtctcg cggtcggcga aggcgcgaac 1080

atgccgagca cgagcgacgc aacgcgcgtg ttcctcgagg ctggaacgct cttcgcgccg 1140

ggcaaggccg cgaatgccgg tggggtcgcc gtctcaggtc tcgagcagag tcagaatgcg 1200

cagcgcctgt tctggacgcg tgacgaggtg gatcttcgtc ttcaaggcat catgaagacg 1260

atccacgaca agtgcgttga gcagggtcgc gtcgcgaatg gtcagatcaa ctacgtgcag 1320

ggcgccaatc gcgcaggttt cctcaaggtc gccgacgcga tgctcgcgca aggcgtattt 1380

tga 1383

<210> 160

<211> 460

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_13 sequence from an unknown bacterial species in an environmental sample

<400> 160

Met Ala Thr Leu Leu Thr Thr Ser Glu Ala Leu Thr Arg Asp Thr Gly

1 5 10 15

Lys Ala Val Asp Arg Phe Met Asp Gly Leu Val Ala Arg Asn Pro Asn

20 25 30

Gln Pro Glu Phe His Gln Ala Val Arg Glu Val Cys Glu Ser Val Met

35 40 45

Pro Leu Val Leu Glu Arg Pro Glu Tyr Glu Glu Ala Gly Ile Leu Glu

50 55 60

Arg Leu Thr Glu Pro Asp Arg Ile Leu Thr Phe Arg Val Ala Trp Gln

65 70 75 80

Asp Asp Glu Gly Arg Val Arg Ile Asn Arg Ala Tyr Arg Val Gln Phe

85 90 95

Asn Asn Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Thr

100 105 110

Val Asp Leu Ser Val Leu Lys Phe Leu Gly Phe Glu Gln Ile Phe Lys

115 120 125

Asn Ser Leu Thr Gly Leu Pro Met Gly Gly Ala Lys Gly Gly Ser Asp

130 135 140

Phe Asp Pro Lys Gly Lys Ser Asp Asn Glu Val Met Arg Phe Cys Gln

145 150 155 160

Ala Met Met Ser Glu Leu Cys His Asp Ile Gly Glu Asp Val Asp Val

165 170 175

Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Tyr Leu Phe

180 185 190

Gly Glu Tyr Arg Arg Leu Met Arg Arg Val Ala Gly Val Leu Thr Gly

195 200 205

Lys Gly Leu Ser Phe Gly Gly Ser Leu Ile Arg Thr Glu Ala Thr Gly

210 215 220

Tyr Gly Cys Val Tyr Phe Val Glu Asn Met Leu Asn His Ile Gly Asp

225 230 235 240

Ser Leu Asp Gly Lys Thr Cys Val Val Ser Gly Ser Gly Asn Val Ala

245 250 255

Leu Tyr Thr Val Glu Lys Val Thr Ala Leu Gly Gly Lys Val Val Thr

260 265 270

Leu Ser Asp Ser Asp Gly Phe Ile Tyr Asp Arg Asp Gly Ile Asp Ala

275 280 285

Glu Lys Leu Glu Trp Val Lys Glu Leu Lys Glu Val Arg Arg Gly Arg

290 295 300

Ile Ser Glu Tyr Ala Glu His Phe Gly Ala Gln Phe His Ala Asp Glu

305 310 315 320

Arg Pro Trp His Val Glu Cys Gln Ala Ala Phe Pro Ser Ala Thr Gln

325 330 335

Asn Glu Leu Asp Lys Glu Asp Ala Glu Val Leu Val Ala Asn Gly Cys

340 345 350

Leu Ala Val Gly Glu Gly Ala Asn Met Pro Ser Thr Ser Asp Ala Thr

355 360 365

Arg Val Phe Leu Glu Ala Gly Thr Leu Phe Ala Pro Gly Lys Ala Ala

370 375 380

Asn Ala Gly Gly Val Ala Val Ser Gly Leu Glu Gln Ser Gln Asn Ala

385 390 395 400

Gln Arg Leu Phe Trp Thr Arg Asp Glu Val Asp Leu Arg Leu Gln Gly

405 410 415

Ile Met Lys Thr Ile His Asp Lys Cys Val Glu Gln Gly Arg Val Ala

420 425 430

Asn Gly Gln Ile Asn Tyr Val Gln Gly Ala Asn Arg Ala Gly Phe Leu

435 440 445

Lys Val Ala Asp Ala Met Leu Ala Gln Gly Val Phe

450 455 460

<210> 161

<211> 1347

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_14 sequence from an unknown bacterial species in an environmental sample

<400> 161

atgaacgcga aaacgtatct ggcgagcttc atggagcagc tcgtcactcg caatccggca 60

gagcaggagt ttcaccaggc cgtacgcgag gtcgtcgaat ctctcgagcc gtgtttggag 120

cggcaccctg aatacatcga tcattcgatc ctcgagcgca tggccgagcc cgatcgcgtc 180

atcagtttcc gcgtcgcgtg gcaggacgat cgcggccgcc cccatgtcaa tcgcggcttc 240

cgtgtggagt tcaacaatgc aatcgggccc tacaaggggg gcctccgatt tcaccccacc 300

gtcaatctca gcatcctcaa gttcctcggt ttcgaacaga tcttgaagaa cagcttgacc 360

acgctgccga tgggcggcgc caaagggggg agcaacttcg atcccaaggg caaatccgac 420

agcgaggtga tgcgattttg ccagagcttc atgaacgagc tctatcggca tatcggctcc 480

gacatcgacg tgccggccgg tgacatcggc gtcggcggac gcgagatagg gtttctcttc 540

ggtcaataca agaagctgac ccactccttc gaaggcgtgc tcaccggcaa agggctcggc 600

tggggcgggt cgctcattcg ccccgaggcc accggttacg gctgcgtgta tttcgccgaa 660

gagatgctga aaacgcgcgg ccagagcttc aagggcaaaa cggtgacggt gtcgggctcc 720

ggcaacgtcg cccaatattc ggtggagaag gtcaatcagc taggcggcag ggtggtgtcg 780

ctgtccgact cggaagggac catttacgat ccggatggca tccgcgacga caagtgggaa 840

ttcttgctga cgctaaaaaa cgtgcggcgc gggcgtctgc gcgaatatgc cgagcgcttc 900

aaggccgagt tccgcgatgg cgtgtgcccg tggagcatca aatgcgatgt cgccctcccg 960

agcgccacgc aaaacgaaat ctccgccgag gacgccaagg cactcgtcaa aaatggctgc 1020

atctgcgtgg cggaaggggc gaacatgccc actaccgccg aaggggtgga gatcttccag 1080

aaaggtaaag tcctcttcgg gccgggcaaa gccgccaacg ccgggggcgt cgctacctcg 1140

ggactcgaga tgtcgcaaaa cagcctgcgc ctctcttgga cgcgcgaaga ggtcgatcgg 1200

cgtctttacg acatcatgaa ggccattcac cacgcctgcg tcacgacggc ccacgagtac 1260

gatcgccccg gcgactacgt gctcggcgcc aacatcgcag gcttcgtcaa ggtggccaac 1320

gcgatgatcg atcaaggcct ggtctga 1347

<210> 162

<211> 448

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_14 sequence from an unknown bacterial species in an environmental sample

<400> 162

Met Asn Ala Lys Thr Tyr Leu Ala Ser Phe Met Glu Gln Leu Val Thr

1 5 10 15

Arg Asn Pro Ala Glu Gln Glu Phe His Gln Ala Val Arg Glu Val Val

20 25 30

Glu Ser Leu Glu Pro Cys Leu Glu Arg His Pro Glu Tyr Ile Asp His

35 40 45

Ser Ile Leu Glu Arg Met Ala Glu Pro Asp Arg Val Ile Ser Phe Arg

50 55 60

Val Ala Trp Gln Asp Asp Arg Gly Arg Pro His Val Asn Arg Gly Phe

65 70 75 80

Arg Val Glu Phe Asn Asn Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg

85 90 95

Phe His Pro Thr Val Asn Leu Ser Ile Leu Lys Phe Leu Gly Phe Glu

100 105 110

Gln Ile Leu Lys Asn Ser Leu Thr Thr Leu Pro Met Gly Gly Ala Lys

115 120 125

Gly Gly Ser Asn Phe Asp Pro Lys Gly Lys Ser Asp Ser Glu Val Met

130 135 140

Arg Phe Cys Gln Ser Phe Met Asn Glu Leu Tyr Arg His Ile Gly Ser

145 150 155 160

Asp Ile Asp Val Pro Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Ile

165 170 175

Gly Phe Leu Phe Gly Gln Tyr Lys Lys Leu Thr His Ser Phe Glu Gly

180 185 190

Val Leu Thr Gly Lys Gly Leu Gly Trp Gly Gly Ser Leu Ile Arg Pro

195 200 205

Glu Ala Thr Gly Tyr Gly Cys Val Tyr Phe Ala Glu Glu Met Leu Lys

210 215 220

Thr Arg Gly Gln Ser Phe Lys Gly Lys Thr Val Thr Val Ser Gly Ser

225 230 235 240

Gly Asn Val Ala Gln Tyr Ser Val Glu Lys Val Asn Gln Leu Gly Gly

245 250 255

Arg Val Val Ser Leu Ser Asp Ser Glu Gly Thr Ile Tyr Asp Pro Asp

260 265 270

Gly Ile Arg Asp Asp Lys Trp Glu Phe Leu Leu Thr Leu Lys Asn Val

275 280 285

Arg Arg Gly Arg Leu Arg Glu Tyr Ala Glu Arg Phe Lys Ala Glu Phe

290 295 300

Arg Asp Gly Val Cys Pro Trp Ser Ile Lys Cys Asp Val Ala Leu Pro

305 310 315 320

Ser Ala Thr Gln Asn Glu Ile Ser Ala Glu Asp Ala Lys Ala Leu Val

325 330 335

Lys Asn Gly Cys Ile Cys Val Ala Glu Gly Ala Asn Met Pro Thr Thr

340 345 350

Ala Glu Gly Val Glu Ile Phe Gln Lys Gly Lys Val Leu Phe Gly Pro

355 360 365

Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met

370 375 380

Ser Gln Asn Ser Leu Arg Leu Ser Trp Thr Arg Glu Glu Val Asp Arg

385 390 395 400

Arg Leu Tyr Asp Ile Met Lys Ala Ile His His Ala Cys Val Thr Thr

405 410 415

Ala His Glu Tyr Asp Arg Pro Gly Asp Tyr Val Leu Gly Ala Asn Ile

420 425 430

Ala Gly Phe Val Lys Val Ala Asn Ala Met Ile Asp Gln Gly Leu Val

435 440 445

<210> 163

<211> 1338

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_15 sequence from an unknown bacterial species in an environmental sample

<400> 163

atgaattcag tccaggaagt cctcgacatc gttcatcgaa gaaatccaca tcagcctgaa 60

ttccttcagg cggtgacgga agtcttcgag tcgatcagtc cagtgatcga acggcgcaaa 120

gattatcgcg acgccaacat tctcgagcgc atcgtcgagc cggaacggat gattcagttc 180

cgtgttccgt ggattgacga tgcgggcaag gtgcgggtga atcgcggcta tcgcgtgcaa 240

atgaacagcg cgctcggtcc gtacaagggc gggctgcgtt ttcatcccac agttaatgcc 300

agcattctga aattcctcgc atttgaacag gtgttcaaaa attcgctcac gactctgccg 360

atgggcggcg gcaaaggcgg cgccgatttc gatccgaaga acaagtcgga caacgaagtg 420

atgcattttt gccaatcttt catgaccgaa ttgttccgcc atgtcggccc cgacacagac 480

gtgccggcgg gcgacattgg agttgggggt cgtgagattg gttacttgtt cggccaatac 540

aaacggctgg ctaatgaatt taccggcgtg ctgaccggca aatcattgaa ctggggcggc 600

tcgctcatcc ggccgcaagc taccggttat ggcgcggttt atttcgcgga agagatgctg 660

aagacgcgca gcgaaggttt ggaaggaaga gtgtgcactg tctccggctc gggtaacgcc 720

gcgcaataca cggtttcgaa gttaaaccag gtcggcgcca aggttgtcac gatgtctgat 780

tccagtggtt tcatttatga caaggatggg atcaccgagg aaaaactaag ctgggtgatg 840

gaactgaaaa acgaacggcg cggtcgcatc aaagaatacg ccaatttttt caaagcgacg 900

tatgtcgacg gcaaaccgcc atggagtgtt ccatgcgaat gcgccttccc gtgcgcaacg 960

cagaacgaaa ttagcggcga agacgcgaag attctgctcg caaacggttg ctttctcgtt 1020

tccgaagcgg ccaacatgcc gaccgcgccc gcaggagttg acctttttct ggcgaacaaa 1080

atcctttacg gtcccggcaa ggccgcgaat gctggtggcg tggccgtttc gggattggag 1140

atggcgcaaa attcgatgcg cttaccctgg ccgcgcgcgg aagtcgatca acgccttcgc 1200

cagattatgg ccaccattca taagaacgca tggaacacag cggcggaata cgaccagccg 1260

ggtaaccttg ttatcggcgc caacatcgcc gggttcgtta aggtagctga cgcgatgctc 1320

gatcagggcg tggtctag 1338

<210> 164

<211> 445

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_15 sequence from an unknown bacterial species in an environmental sample

<400> 164

Met Asn Ser Val Gln Glu Val Leu Asp Ile Val His Arg Arg Asn Pro

1 5 10 15

His Gln Pro Glu Phe Leu Gln Ala Val Thr Glu Val Phe Glu Ser Ile

20 25 30

Ser Pro Val Ile Glu Arg Arg Lys Asp Tyr Arg Asp Ala Asn Ile Leu

35 40 45

Glu Arg Ile Val Glu Pro Glu Arg Met Ile Gln Phe Arg Val Pro Trp

50 55 60

Ile Asp Asp Ala Gly Lys Val Arg Val Asn Arg Gly Tyr Arg Val Gln

65 70 75 80

Met Asn Ser Ala Leu Gly Pro Tyr Lys Gly Gly Leu Arg Phe His Pro

85 90 95

Thr Val Asn Ala Ser Ile Leu Lys Phe Leu Ala Phe Glu Gln Val Phe

100 105 110

Lys Asn Ser Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly Ala

115 120 125

Asp Phe Asp Pro Lys Asn Lys Ser Asp Asn Glu Val Met His Phe Cys

130 135 140

Gln Ser Phe Met Thr Glu Leu Phe Arg His Val Gly Pro Asp Thr Asp

145 150 155 160

Val Pro Ala Gly Asp Ile Gly Val Gly Gly Arg Glu Ile Gly Tyr Leu

165 170 175

Phe Gly Gln Tyr Lys Arg Leu Ala Asn Glu Phe Thr Gly Val Leu Thr

180 185 190

Gly Lys Ser Leu Asn Trp Gly Gly Ser Leu Ile Arg Pro Gln Ala Thr

195 200 205

Gly Tyr Gly Ala Val Tyr Phe Ala Glu Glu Met Leu Lys Thr Arg Ser

210 215 220

Glu Gly Leu Glu Gly Arg Val Cys Thr Val Ser Gly Ser Gly Asn Ala

225 230 235 240

Ala Gln Tyr Thr Val Ser Lys Leu Asn Gln Val Gly Ala Lys Val Val

245 250 255

Thr Met Ser Asp Ser Ser Gly Phe Ile Tyr Asp Lys Asp Gly Ile Thr

260 265 270

Glu Glu Lys Leu Ser Trp Val Met Glu Leu Lys Asn Glu Arg Arg Gly

275 280 285

Arg Ile Lys Glu Tyr Ala Asn Phe Phe Lys Ala Thr Tyr Val Asp Gly

290 295 300

Lys Pro Pro Trp Ser Val Pro Cys Glu Cys Ala Phe Pro Cys Ala Thr

305 310 315 320

Gln Asn Glu Ile Ser Gly Glu Asp Ala Lys Ile Leu Leu Ala Asn Gly

325 330 335

Cys Phe Leu Val Ser Glu Ala Ala Asn Met Pro Thr Ala Pro Ala Gly

340 345 350

Val Asp Leu Phe Leu Ala Asn Lys Ile Leu Tyr Gly Pro Gly Lys Ala

355 360 365

Ala Asn Ala Gly Gly Val Ala Val Ser Gly Leu Glu Met Ala Gln Asn

370 375 380

Ser Met Arg Leu Pro Trp Pro Arg Ala Glu Val Asp Gln Arg Leu Arg

385 390 395 400

Gln Ile Met Ala Thr Ile His Lys Asn Ala Trp Asn Thr Ala Ala Glu

405 410 415

Tyr Asp Gln Pro Gly Asn Leu Val Ile Gly Ala Asn Ile Ala Gly Phe

420 425 430

Val Lys Val Ala Asp Ala Met Leu Asp Gln Gly Val Val

435 440 445

<210> 165

<211> 1422

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_16 sequence from an unknown bacterial species in an environmental sample

<400> 165

atgtcgtcgc aagtagcctc accgacccga aagctggttc gtcctcccgt ctcacccgcg 60

acccgtgact acattgccgc gctgctcgcg gaggtgaagg cgaagaatcc ggcggagccg 120

gagttccacc aggcggtgca cgaagtcgcc gagtcggtgg gactcgtgct cgagcgccac 180

ccggaatacc gctcggcgaa gatcctggag cggatcatcg agccggagcg cgtcatcatg 240

ttccgtgtcc cgtggctgga cgacgcgggc gaggtccagg tgaaccgcgg cttccgcatc 300

gagatgaaca gcgcgatcgg cccgtacaag ggcgggctgc gttttcacgc ttccgtcaac 360

ctcggcatcc tgaagttcct cgcgttcgag caggtcttca agaacgcgct gacgacgctg 420

ccgatggggg gcggcaaggg cggttccgac ttcgatccga aaggcaggag cgacgcggaa 480

gtcatgcgct tctgtcagag cttcatgacg gagctggcgc ggcacatcgg cgcggacacg 540

gacgtgccgg cgggcgacat cggcgtgggt ggacgcgaaa tcggttatct gttcgggcag 600

tacaagcgga tccgcaacga attcgcgggc gtgctcacgg gcaaggggct caactggggc 660

ggctcgctga tccgtccgga ggcgacgggg tacggcgctg tctacttcgc ggcggagatg 720

ctggcgaccc gcagcgacac cctggcgggc aaggtgtgtc tcgtgtcggg cagcggcaac 780

gtcgcccagt acacggtcga gaagctgctc gcgcacggcg cgaaggtggt gaccctgtcg 840

gactccgctg gtcacgtcta cgacgaagcc ggcatgacgg cggagaagct ggcctatgtg 900

atgaagctga agaacgagcg gcgcggccgg atcgcggagt acgtcgagaa gtatcgggac 960

gcggtgtata cgccggccga tgccgcgcgt ggcttcgatg cgctgtggga tcataaggcc 1020

gactgcgcgt ttccgagcgc gacgcagaac gagatcggcc ggcaggatgc gcagaatctg 1080

ctgatcaacg gcgtatacgt cgtgtcggag ggcgcaaaca tgccgtgcac gccggaagcg 1140

gtcgaactgt tcctcgaaca caatgtgctg tacggcccgg gcaaagcggc gaacgcgggc 1200

ggcgtggcgg tctccggact cgagatgtcg cagaacagca tgcgcctgcg ctggacgcgc 1260

gaggaagtcg atcaccggct gcagcagatc atgcacgaga ttcacgcgac gtgtctggcg 1320

gcggcggagc ggttcggcgc tccgagcaat tacgtgcacg gcgcgaacat cgcgggattc 1380

ctgaaggttg ccgacgcgat gctcgatcag ggtctcgtat ag 1422

<210> 166

<211> 473

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_16 sequence from an unknown bacterial species in an environmental sample

<400> 166

Met Ser Ser Gln Val Ala Ser Pro Thr Arg Lys Leu Val Arg Pro Pro

1 5 10 15

Val Ser Pro Ala Thr Arg Asp Tyr Ile Ala Ala Leu Leu Ala Glu Val

20 25 30

Lys Ala Lys Asn Pro Ala Glu Pro Glu Phe His Gln Ala Val His Glu

35 40 45

Val Ala Glu Ser Val Gly Leu Val Leu Glu Arg His Pro Glu Tyr Arg

50 55 60

Ser Ala Lys Ile Leu Glu Arg Ile Ile Glu Pro Glu Arg Val Ile Met

65 70 75 80

Phe Arg Val Pro Trp Leu Asp Asp Ala Gly Glu Val Gln Val Asn Arg

85 90 95

Gly Phe Arg Ile Glu Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly

100 105 110

Leu Arg Phe His Ala Ser Val Asn Leu Gly Ile Leu Lys Phe Leu Ala

115 120 125

Phe Glu Gln Val Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly

130 135 140

Gly Lys Gly Gly Ser Asp Phe Asp Pro Lys Gly Arg Ser Asp Ala Glu

145 150 155 160

Val Met Arg Phe Cys Gln Ser Phe Met Thr Glu Leu Ala Arg His Ile

165 170 175

Gly Ala Asp Thr Asp Val Pro Ala Gly Asp Ile Gly Val Gly Gly Arg

180 185 190

Glu Ile Gly Tyr Leu Phe Gly Gln Tyr Lys Arg Ile Arg Asn Glu Phe

195 200 205

Ala Gly Val Leu Thr Gly Lys Gly Leu Asn Trp Gly Gly Ser Leu Ile

210 215 220

Arg Pro Glu Ala Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met

225 230 235 240

Leu Ala Thr Arg Ser Asp Thr Leu Ala Gly Lys Val Cys Leu Val Ser

245 250 255

Gly Ser Gly Asn Val Ala Gln Tyr Thr Val Glu Lys Leu Leu Ala His

260 265 270

Gly Ala Lys Val Val Thr Leu Ser Asp Ser Ala Gly His Val Tyr Asp

275 280 285

Glu Ala Gly Met Thr Ala Glu Lys Leu Ala Tyr Val Met Lys Leu Lys

290 295 300

Asn Glu Arg Arg Gly Arg Ile Ala Glu Tyr Val Glu Lys Tyr Arg Asp

305 310 315 320

Ala Val Tyr Thr Pro Ala Asp Ala Ala Arg Gly Phe Asp Ala Leu Trp

325 330 335

Asp His Lys Ala Asp Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile

340 345 350

Gly Arg Gln Asp Ala Gln Asn Leu Leu Ile Asn Gly Val Tyr Val Val

355 360 365

Ser Glu Gly Ala Asn Met Pro Cys Thr Pro Glu Ala Val Glu Leu Phe

370 375 380

Leu Glu His Asn Val Leu Tyr Gly Pro Gly Lys Ala Ala Asn Ala Gly

385 390 395 400

Gly Val Ala Val Ser Gly Leu Glu Met Ser Gln Asn Ser Met Arg Leu

405 410 415

Arg Trp Thr Arg Glu Glu Val Asp His Arg Leu Gln Gln Ile Met His

420 425 430

Glu Ile His Ala Thr Cys Leu Ala Ala Ala Glu Arg Phe Gly Ala Pro

435 440 445

Ser Asn Tyr Val His Gly Ala Asn Ile Ala Gly Phe Leu Lys Val Ala

450 455 460

Asp Ala Met Leu Asp Gln Gly Leu Val

465 470

<210> 167

<211> 1389

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_17 sequence from an unknown bacterial species in an environmental sample

<400> 167

atgcaggtca gcggcggcgt ccgttcgaag ccctcggcat acgtcaacga tgtgctcgcg 60

caggtgaagg cgaagaaccc cgcggagccg gaattccacc aggcggtcga agaagtcctc 120

gaaagcatcg accttgccgt cgcgaagcgt cccgagctgc gcaaagcccg catcctcgaa 180

cgcatcgtcg agcccgagcg cgtcgtgatg ttccgcgtgg cgtggcagga cgatgccggc 240

gaggtgcaga tcaatcgcgg gtaccgcgtg cagatgaaca gcgcgatagg cccctacaaa 300

ggcggccttc ggttccatcc cagcgtgacg ctcggggtgc tgaagttcct cgcgttcgag 360

caggtcttca agaactccct caccacgctg cccatgggcg gcgggaaggg cggatccgat 420

ttcgatccga aggggcgatc ggacgccgaa gtgatgcgct tctgccaggc gttcatgacg 480

gagctcgcgc gccacatcgg gcccgacacc gacgtgccgg ccggcgacat cggtgtcggc 540

gcgcgtgaga ttggctttct gttcgggcag tacaaacggc tgcgcaacga attcaccggc 600

gtgctgaccg ggaaggcgct gaactggggc ggatcgctga tcaggccgga agccaccggc 660

tacggcgccg tgtatttcgc ggcggagatg ctggcgacgc gcaatcagac gctcgagggc 720

aagacgtgtc tcgtgtcggg cagtggcaat gtcgcgcaat acacgatcga gaagctgctg 780

gatctcggcg cgcgcgcggt gacggcgtcc gattcagacg gctatatcta tgacgaagcg 840

gggttcgatc gcgcgaagct cgcaaagctg atggcgctga aaaacgtgaa gcgcggccgg 900

ctgcgcgagt acgcggacga ggtgaagggc gtgacctaca cgccggtgaa gggcggcgcc 960

gcgcatccga tgtggtcgca tcgagccgac tgcgcgttcc cgagcgcgac gcagaacgag 1020

ctctcgggac aggatgccgc gaacctcgtc tcgaacaaca tcacggccgt ggccgagggg 1080

gcgaacatgc cctgcacgct cgacgccgtg cgcgtgttca tcgacgcgcg tgtgctctac 1140

gcgccgggga aggccgcgaa cgccggcggc gtggcgacgt cgggcctcga gatggcgcag 1200

aacagcgcgc gtctgagctg gacacgcgag gaagtggacg gccgcctgca caacatcatg 1260

aaagcgattc accgcgcgtg ccgcgacacg gcggacgcgt acggcgcgcc tggcaactac 1320

gtactcggcg cgaacatcgc gggcttcctc aaggtcgccg acgcgatgat ggatcagggg 1380

ctcgtctga 1389

<210> 168

<211> 462

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_17 sequence from an unknown bacterial species in an environmental sample

<400> 168

Met Gln Val Ser Gly Gly Val Arg Ser Lys Pro Ser Ala Tyr Val Asn

1 5 10 15

Asp Val Leu Ala Gln Val Lys Ala Lys Asn Pro Ala Glu Pro Glu Phe

20 25 30

His Gln Ala Val Glu Glu Val Leu Glu Ser Ile Asp Leu Ala Val Ala

35 40 45

Lys Arg Pro Glu Leu Arg Lys Ala Arg Ile Leu Glu Arg Ile Val Glu

50 55 60

Pro Glu Arg Val Val Met Phe Arg Val Ala Trp Gln Asp Asp Ala Gly

65 70 75 80

Glu Val Gln Ile Asn Arg Gly Tyr Arg Val Gln Met Asn Ser Ala Ile

85 90 95

Gly Pro Tyr Lys Gly Gly Leu Arg Phe His Pro Ser Val Thr Leu Gly

100 105 110

Val Leu Lys Phe Leu Ala Phe Glu Gln Val Phe Lys Asn Ser Leu Thr

115 120 125

Thr Leu Pro Met Gly Gly Gly Lys Gly Gly Ser Asp Phe Asp Pro Lys

130 135 140

Gly Arg Ser Asp Ala Glu Val Met Arg Phe Cys Gln Ala Phe Met Thr

145 150 155 160

Glu Leu Ala Arg His Ile Gly Pro Asp Thr Asp Val Pro Ala Gly Asp

165 170 175

Ile Gly Val Gly Ala Arg Glu Ile Gly Phe Leu Phe Gly Gln Tyr Lys

180 185 190

Arg Leu Arg Asn Glu Phe Thr Gly Val Leu Thr Gly Lys Ala Leu Asn

195 200 205

Trp Gly Gly Ser Leu Ile Arg Pro Glu Ala Thr Gly Tyr Gly Ala Val

210 215 220

Tyr Phe Ala Ala Glu Met Leu Ala Thr Arg Asn Gln Thr Leu Glu Gly

225 230 235 240

Lys Thr Cys Leu Val Ser Gly Ser Gly Asn Val Ala Gln Tyr Thr Ile

245 250 255

Glu Lys Leu Leu Asp Leu Gly Ala Arg Ala Val Thr Ala Ser Asp Ser

260 265 270

Asp Gly Tyr Ile Tyr Asp Glu Ala Gly Phe Asp Arg Ala Lys Leu Ala

275 280 285

Lys Leu Met Ala Leu Lys Asn Val Lys Arg Gly Arg Leu Arg Glu Tyr

290 295 300

Ala Asp Glu Val Lys Gly Val Thr Tyr Thr Pro Val Lys Gly Gly Ala

305 310 315 320

Ala His Pro Met Trp Ser His Arg Ala Asp Cys Ala Phe Pro Ser Ala

325 330 335

Thr Gln Asn Glu Leu Ser Gly Gln Asp Ala Ala Asn Leu Val Ser Asn

340 345 350

Asn Ile Thr Ala Val Ala Glu Gly Ala Asn Met Pro Cys Thr Leu Asp

355 360 365

Ala Val Arg Val Phe Ile Asp Ala Arg Val Leu Tyr Ala Pro Gly Lys

370 375 380

Ala Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ala Gln

385 390 395 400

Asn Ser Ala Arg Leu Ser Trp Thr Arg Glu Glu Val Asp Gly Arg Leu

405 410 415

His Asn Ile Met Lys Ala Ile His Arg Ala Cys Arg Asp Thr Ala Asp

420 425 430

Ala Tyr Gly Ala Pro Gly Asn Tyr Val Leu Gly Ala Asn Ile Ala Gly

435 440 445

Phe Leu Lys Val Ala Asp Ala Met Met Asp Gln Gly Leu Val

450 455 460

<210> 169

<211> 1338

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_18 sequence from an unknown bacterial species in an environmental sample

<400> 169

atggccgatg ttaaggccaa gaacccgatg gagccggagt tccatcaagc cgtccaagaa 60

gtggtcgagt ccctgtcgct tgttctcgat caacaccctg aatatctaaa ggcggggatc 120

ctggaacgga tggtcgagcc ggaacgcgtc attatgttcc gagtgccatg gcaggacgat 180

aaaggcaatc tccacgtcaa tcgcggtttt cgtgtccaga tgaatagcgc gatcggcccg 240

tataaggggg ggttgcggtt ccatccctcc gtcaacctcg gtattctcaa gtttcttgcc 300

ttcgaacagg tgtttaagaa cgcgcttacc actctgccga tgggcggcgc caagggcgga 360

tctgacttcg atccgaaagg aaagagcgac ttggaagtca cgcgcttctg ccaggctttt 420

atgtgcgaac tttttcgcca catcgggcca gacacagatg ttcccgcggg ggacatcggt 480

gtcggcggcc gcgaaatagg atttctgttc gggatgtaca agaagcttgg aaacgaattc 540

acgggcgttt taaccggaaa aggcccaact tggggtggat ccgtcatccg ccccgaggcc 600

accggatatg gagcagtcta tttcgcggcc gaaatgctcg aaacccgcaa agaaaatctt 660

aagggtaaga cctgccttgt ttccggaagc ggcaatgtgt cgcaatatac ggtcgataag 720

ctcatcgagg tcggggcgcg gcccgtcacg ctctcagact ccaatggtta tatctatgat 780

gaggccggta ttactcagga aaagctcgcc tttgtcatgg agttaaaaaa cgtccgccgg 840

ggccgaattg gcgagtacgc ggacaaattc aaaagcgcga cttattttcc gagggatccg 900

aagctcgatt acaacccgct ctggaaccac aaggcggagt gtgcgttccc gagcgcgact 960

cagaacgaga ttaacgcgaa ggacgccgcc aatctcctca agaacggtgt ctatgtcgtc 1020

tcagaaggcg caaatatgcc gaccgcgatc gaagggatca atcagttcat cgaggccaag 1080

atcctgttcg gccccggcaa ggccgcaaac gcgggcggtg tcgccacctc tgggttggaa 1140

atggcgcaga acagcatgcg tatttcctgg acgcgcgagg aagtggacgc gcggctgcag 1200

agcatcatga aagggatcca caaaaattgt tacgtgacgg cggagaagta cggtactccg 1260

ggcaactacg ttaacggtgc gaacattgcg ggtttcctga aggtggctaa cgccatgatg 1320

gatcagggac tggtgtag 1338

<210> 170

<211> 445

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_18 sequence from an unknown bacterial species in an environmental sample

<400> 170

Met Ala Asp Val Lys Ala Lys Asn Pro Met Glu Pro Glu Phe His Gln

1 5 10 15

Ala Val Gln Glu Val Val Glu Ser Leu Ser Leu Val Leu Asp Gln His

20 25 30

Pro Glu Tyr Leu Lys Ala Gly Ile Leu Glu Arg Met Val Glu Pro Glu

35 40 45

Arg Val Ile Met Phe Arg Val Pro Trp Gln Asp Asp Lys Gly Asn Leu

50 55 60

His Val Asn Arg Gly Phe Arg Val Gln Met Asn Ser Ala Ile Gly Pro

65 70 75 80

Tyr Lys Gly Gly Leu Arg Phe His Pro Ser Val Asn Leu Gly Ile Leu

85 90 95

Lys Phe Leu Ala Phe Glu Gln Val Phe Lys Asn Ala Leu Thr Thr Leu

100 105 110

Pro Met Gly Gly Ala Lys Gly Gly Ser Asp Phe Asp Pro Lys Gly Lys

115 120 125

Ser Asp Leu Glu Val Thr Arg Phe Cys Gln Ala Phe Met Cys Glu Leu

130 135 140

Phe Arg His Ile Gly Pro Asp Thr Asp Val Pro Ala Gly Asp Ile Gly

145 150 155 160

Val Gly Gly Arg Glu Ile Gly Phe Leu Phe Gly Met Tyr Lys Lys Leu

165 170 175

Gly Asn Glu Phe Thr Gly Val Leu Thr Gly Lys Gly Pro Thr Trp Gly

180 185 190

Gly Ser Val Ile Arg Pro Glu Ala Thr Gly Tyr Gly Ala Val Tyr Phe

195 200 205

Ala Ala Glu Met Leu Glu Thr Arg Lys Glu Asn Leu Lys Gly Lys Thr

210 215 220

Cys Leu Val Ser Gly Ser Gly Asn Val Ser Gln Tyr Thr Val Asp Lys

225 230 235 240

Leu Ile Glu Val Gly Ala Arg Pro Val Thr Leu Ser Asp Ser Asn Gly

245 250 255

Tyr Ile Tyr Asp Glu Ala Gly Ile Thr Gln Glu Lys Leu Ala Phe Val

260 265 270

Met Glu Leu Lys Asn Val Arg Arg Gly Arg Ile Gly Glu Tyr Ala Asp

275 280 285

Lys Phe Lys Ser Ala Thr Tyr Phe Pro Arg Asp Pro Lys Leu Asp Tyr

290 295 300

Asn Pro Leu Trp Asn His Lys Ala Glu Cys Ala Phe Pro Ser Ala Thr

305 310 315 320

Gln Asn Glu Ile Asn Ala Lys Asp Ala Ala Asn Leu Leu Lys Asn Gly

325 330 335

Val Tyr Val Val Ser Glu Gly Ala Asn Met Pro Thr Ala Ile Glu Gly

340 345 350

Ile Asn Gln Phe Ile Glu Ala Lys Ile Leu Phe Gly Pro Gly Lys Ala

355 360 365

Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ala Gln Asn

370 375 380

Ser Met Arg Ile Ser Trp Thr Arg Glu Glu Val Asp Ala Arg Leu Gln

385 390 395 400

Ser Ile Met Lys Gly Ile His Lys Asn Cys Tyr Val Thr Ala Glu Lys

405 410 415

Tyr Gly Thr Pro Gly Asn Tyr Val Asn Gly Ala Asn Ile Ala Gly Phe

420 425 430

Leu Lys Val Ala Asn Ala Met Met Asp Gln Gly Leu Val

435 440 445

<210> 171

<211> 1362

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_19 sequence from an unknown bacterial species in an environmental sample

<400> 171

atgaatgact acgtcacagc gttgatggcc gaggtaaagg ccaagaaccc atcggagccc 60

gagtttcacc aggcggtcga ggaggtggta gagtcgctcg cgctcgtcct ggaacaacat 120

ccggaatacc ggaaagcgaa aatcatcgag cgaatcattg agccggagcg ggtcatcatc 180

ttccgcgttc cctggcagga cgaccagggc gagctgcagg tgaaccgcgg gtttcgcatt 240

cagatgaaca gcgccatcgg cccgtacaag ggcggcctgc gtttccatcc ttcggtcaac 300

ctcggcatcc taaaattcct cgctttcgaa caggtgttca aaaacgcgct caccactctg 360

ccgatgggcg gcggcaaagg cggttccgac ttcgatccga aaggcaagag cgacagtgaa 420

gtgatgcgct tctgtcaggc gttcatgtgc gaactgttcc ggcacattgg cccggatacc 480

gatgttcccg cgggcgatat cggcgtcggg gcacgtgaaa tcggatactt gttcgggatg 540

tacaagaggc tcaggaacga gttcagcggt gtgataacgg gcaagggtct gacctggggt 600

gggtccgtca ttcgccctga ggcgacgggc tacggcgcgg tttatttcgc ggctgaaatg 660

ctcaagacgc gcaaagaaga gatgaagggc aaaacctgtc tcgtgtccgg gagcggcaat 720

gtttcgcagt acacggtgga caaacttatc tcgctggggg ccaaggcagt cacactctcg 780

gattcatctg gctacatcta cgacgaggcc gggatcgacc gcgacaagct tgcctttgtc 840

atggacctga agaacaaccg gcgtggccgg atttcagaat acgccgataa gttcaagggg 900

acgaccttca cggccgtgga cgaggcgctc gatcataacc cgctttggga tcacaaggcc 960

gagtgcgcct ttcccagtgc aacgcagaac gagatcaacg ggaaggacgc ggcgaacctt 1020

ctccgaaacg gcgtctatgt cgtctcggag ggggcgaata tgccgactac gattgacggc 1080

gtaaaccagt tcctcgaggc gcagatcctc ttcggtcctg gcaaagcagc aaatgccggc 1140

ggagttgcga cctccggctt ggagatggcg caaaacagca tgcggatttc ctggacccgc 1200

gaggaagtgg ataaccgtct cttcaatatc atgaagacga tccacgaagt ttgccatcgc 1260

acggccgaga agtacggcac gccgggcaac tacgtgaacg gcgcaaacat tgccggcttt 1320

cagaaagtcg ccaacgcgat gatggaccag ggactggtgt ag 1362

<210> 172

<211> 453

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_19 sequence from an unknown bacterial species from an environmental sample

<400> 172

Met Asn Asp Tyr Val Thr Ala Leu Met Ala Glu Val Lys Ala Lys Asn

1 5 10 15

Pro Ser Glu Pro Glu Phe His Gln Ala Val Glu Glu Val Val Glu Ser

20 25 30

Leu Ala Leu Val Leu Glu Gln His Pro Glu Tyr Arg Lys Ala Lys Ile

35 40 45

Ile Glu Arg Ile Ile Glu Pro Glu Arg Val Ile Ile Phe Arg Val Pro

50 55 60

Trp Gln Asp Asp Gln Gly Glu Leu Gln Val Asn Arg Gly Phe Arg Ile

65 70 75 80

Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His

85 90 95

Pro Ser Val Asn Leu Gly Ile Leu Lys Phe Leu Ala Phe Glu Gln Val

100 105 110

Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly

115 120 125

Ser Asp Phe Asp Pro Lys Gly Lys Ser Asp Ser Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Cys Glu Leu Phe Arg His Ile Gly Pro Asp Thr

145 150 155 160

Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Tyr

165 170 175

Leu Phe Gly Met Tyr Lys Arg Leu Arg Asn Glu Phe Ser Gly Val Ile

180 185 190

Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Val Ile Arg Pro Glu Ala

195 200 205

Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met Leu Lys Thr Arg

210 215 220

Lys Glu Glu Met Lys Gly Lys Thr Cys Leu Val Ser Gly Ser Gly Asn

225 230 235 240

Val Ser Gln Tyr Thr Val Asp Lys Leu Ile Ser Leu Gly Ala Lys Ala

245 250 255

Val Thr Leu Ser Asp Ser Ser Gly Tyr Ile Tyr Asp Glu Ala Gly Ile

260 265 270

Asp Arg Asp Lys Leu Ala Phe Val Met Asp Leu Lys Asn Asn Arg Arg

275 280 285

Gly Arg Ile Ser Glu Tyr Ala Asp Lys Phe Lys Gly Thr Thr Phe Thr

290 295 300

Ala Val Asp Glu Ala Leu Asp His Asn Pro Leu Trp Asp His Lys Ala

305 310 315 320

Glu Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile Asn Gly Lys Asp

325 330 335

Ala Ala Asn Leu Leu Arg Asn Gly Val Tyr Val Val Ser Glu Gly Ala

340 345 350

Asn Met Pro Thr Thr Ile Asp Gly Val Asn Gln Phe Leu Glu Ala Gln

355 360 365

Ile Leu Phe Gly Pro Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr

370 375 380

Ser Gly Leu Glu Met Ala Gln Asn Ser Met Arg Ile Ser Trp Thr Arg

385 390 395 400

Glu Glu Val Asp Asn Arg Leu Phe Asn Ile Met Lys Thr Ile His Glu

405 410 415

Val Cys His Arg Thr Ala Glu Lys Tyr Gly Thr Pro Gly Asn Tyr Val

420 425 430

Asn Gly Ala Asn Ile Ala Gly Phe Gln Lys Val Ala Asn Ala Met Met

435 440 445

Asp Gln Gly Leu Val

450

<210> 173

<211> 1362

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_20 sequence from an unknown bacterial species from an environmental sample

<400> 173

atgaatgacc atgtcgctgc gttgatggcc gaggtaaagg ccaagaaccc ctcggagccg 60

gaatttcacc aggcggtgga ggaggtggcc gagtcgctca cgctggtgct ggatcagcat 120

ccggaatatc ggaaggcaaa gatcctcgag cgaatcatcg agccggagcg cgtgatcatg 180

tttcgcgttc cctggcagga tgacgcgggg gagctgcacg tgaatcgcgg gttccgcatc 240

cagatgaaca gcgcgattgg cccatacaaa ggcggcctgc gtttccatcc ctcggtcaac 300

ctcggcatcc tgaagttcct cgccttcgag caggtcttta agaacgcgct gaccacgctg 360

ccgatgggcg gcggaaaggg tggggccgac tttgatccga aagggaaaag cgacagcgag 420

gtaatgcggt tctgccaggc cttcatgtgc gagctgttcc ggcacatcgg cccggatacg 480

gatgtgccgg cgggcgatat tggcgtcggg gcacgcgaga tcggatttct ttttgggatg 540

tacaagaggc tgaagaacga gttcaccggt gtgatgacgg gcaaaggcct cacctggggt 600

ggctcggtca ttcgtcctga ggcgacggga tacggcgcag tctatttcgc ggctgagatg 660

ctcaagacgc gcaaggaaga gatgaagggc aagacgtgtc tcgtctcggg aagcggcaac 720

gtttcgcagt acacggtgga caaacttatc tcgctgggcg caaagacggt cacgctctcg 780

gattcatccg gctacatcta tgacgaggcc gggatcgacc gggacaagct cgcctttgtc 840

atggatctga agaacaaccg ccgcgggcgg atcgcggaat atgccgacaa gttcaagggc 900

gcggtcttca cgccattgga tgaggcgctc gatcataacc ccctctggaa tcacaaggcc 960

gagtgcgcct ttcccagcgc cacgcaaaac gagatcaacg ggaaggacgc ggcgaacctt 1020

ctccggaacg gcgtctacgt catctccgag ggcgcaaaca tgccgaccac aactgacggc 1080

gtcagccggt tcctcgaggc gcaggtcctt ttcggtcccg ggaaggccgc caatgccggc 1140

ggagtcgcga cttccggatt ggaaatggcg caaaacagca tgcggatttc ctggacccgc 1200

gaggaagtgg ataaccgtct tttcaatatc atgaagacga tccacgaaaa ttgctatcgc 1260

acggccgaga aatacggcac gccgggtaac tacgtcaacg gtgcgaacat cgccggcttc 1320

ctcaaggtcg cgaacgcgat gatggaccag ggattggtgt ag 1362

<210> 174

<211> 453

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_20 sequence from an unknown bacterial species from an environmental sample

<400> 174

Met Asn Asp His Val Ala Ala Leu Met Ala Glu Val Lys Ala Lys Asn

1 5 10 15

Pro Ser Glu Pro Glu Phe His Gln Ala Val Glu Glu Val Ala Glu Ser

20 25 30

Leu Thr Leu Val Leu Asp Gln His Pro Glu Tyr Arg Lys Ala Lys Ile

35 40 45

Leu Glu Arg Ile Ile Glu Pro Glu Arg Val Ile Met Phe Arg Val Pro

50 55 60

Trp Gln Asp Asp Ala Gly Glu Leu His Val Asn Arg Gly Phe Arg Ile

65 70 75 80

Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His

85 90 95

Pro Ser Val Asn Leu Gly Ile Leu Lys Phe Leu Ala Phe Glu Gln Val

100 105 110

Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly

115 120 125

Ala Asp Phe Asp Pro Lys Gly Lys Ser Asp Ser Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Cys Glu Leu Phe Arg His Ile Gly Pro Asp Thr

145 150 155 160

Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Phe

165 170 175

Leu Phe Gly Met Tyr Lys Arg Leu Lys Asn Glu Phe Thr Gly Val Met

180 185 190

Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Val Ile Arg Pro Glu Ala

195 200 205

Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met Leu Lys Thr Arg

210 215 220

Lys Glu Glu Met Lys Gly Lys Thr Cys Leu Val Ser Gly Ser Gly Asn

225 230 235 240

Val Ser Gln Tyr Thr Val Asp Lys Leu Ile Ser Leu Gly Ala Lys Thr

245 250 255

Val Thr Leu Ser Asp Ser Ser Gly Tyr Ile Tyr Asp Glu Ala Gly Ile

260 265 270

Asp Arg Asp Lys Leu Ala Phe Val Met Asp Leu Lys Asn Asn Arg Arg

275 280 285

Gly Arg Ile Ala Glu Tyr Ala Asp Lys Phe Lys Gly Ala Val Phe Thr

290 295 300

Pro Leu Asp Glu Ala Leu Asp His Asn Pro Leu Trp Asn His Lys Ala

305 310 315 320

Glu Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile Asn Gly Lys Asp

325 330 335

Ala Ala Asn Leu Leu Arg Asn Gly Val Tyr Val Ile Ser Glu Gly Ala

340 345 350

Asn Met Pro Thr Thr Thr Asp Gly Val Ser Arg Phe Leu Glu Ala Gln

355 360 365

Val Leu Phe Gly Pro Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr

370 375 380

Ser Gly Leu Glu Met Ala Gln Asn Ser Met Arg Ile Ser Trp Thr Arg

385 390 395 400

Glu Glu Val Asp Asn Arg Leu Phe Asn Ile Met Lys Thr Ile His Glu

405 410 415

Asn Cys Tyr Arg Thr Ala Glu Lys Tyr Gly Thr Pro Gly Asn Tyr Val

420 425 430

Asn Gly Ala Asn Ile Ala Gly Phe Leu Lys Val Ala Asn Ala Met Met

435 440 445

Asp Gln Gly Leu Val

450

<210> 175

<211> 1362

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_21 sequence from an unknown bacterial species from an environmental sample

<400> 175

atgaatgacc aagttgcagc gttgatggcc gatgttaagg ccaagaatcc gggggagccg 60

gaatttcacc aggccgtcca ggaagtagtc gaatcgctca cgctcgttct ggatcagcac 120

ccggaatatc gtaaggcgaa gatcatcgag cggatcattg agccggagcg ggtcatcatc 180

ttccgcgttc cctggcagga cgatcagggc gagctgcatg tcaaccgcgg cttccgcatc 240

caaatgaaca gcgcgatcgg cccctacaaa ggcggtctgc gctttcatcc ctccgtcaac 300

ctgggcattc tgaagtttct cgctttcgag caggtgttca agaatgcgct caccacgttg 360

cccatgggcg gcggcaaagg cggggctgac ttcgatccga aaggcaagag cgacagcgaa 420

gtgatgcgtt tctgccaggc gttcatgtgt gaactcttcc ggcacatcgg cccggatacg 480

gacgtgccgg cgggcgacat cggcgtcggg gcgcgtgaga tcggattttt gttcgggatg 540

tacaagaggc tcaagaacga gttcaccggc gtgatgaccg gcaaaggtct tacttggggc 600

ggctcggtca ttcgtccgga ggcgacggga tacggggcgg tctatttcgc agctgaaatg 660

ctcaagacgc gcaaggaaga gatgaagggc aagacctgtc tcgtttccgg aagcggcaac 720

gtttcgcagt acacggtgga caaactgatc tcgctcgggg cgaaggcggt cacgctctcg 780

gattcatccg gctatatcta tgacgaagcc gggatcgacc gggagaagct cgcctttgtc 840

atggacctga agaaccaccg ccgcggccgc atctccgaat atgccgacaa gttcaaagga 900

acgaccttca ccgcagtgga cgaggcgctc gatcataacc caatttggga tcacaaggcg 960

gagtgcgcgt ttccgagcgc gacccagaac gaaatcaacg ggaaggatgc tgcaaatctc 1020

ctgaagaacg gcgtctacgt cgtctccgaa ggcgcgaaca tgccgaccac gatcgatgga 1080

gtaaataaat ttctcgaggc gaatatcttg ttcggtccgg ggaaggccgc gaatgccggc 1140

ggagtcgcca tctccggatt ggagatggcg caaaaaagca tgcgtatctc gtggactcgc 1200

gaagaagtcg acacgcgtct gttcaacatc atgcggacga tccacgaaaa ctgccatcgc 1260

acctccgaga agtacggcac cccaggtaac tacgtcaacg gcgcgaacat cgccggcttc 1320

ctgaaagtcg cgaacgccat gatggaccag ggattggtct ag 1362

<210> 176

<211> 453

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_21 sequence from an unknown bacterial species from an environmental sample

<400> 176

Met Asn Asp Gln Val Ala Ala Leu Met Ala Asp Val Lys Ala Lys Asn

1 5 10 15

Pro Gly Glu Pro Glu Phe His Gln Ala Val Gln Glu Val Val Glu Ser

20 25 30

Leu Thr Leu Val Leu Asp Gln His Pro Glu Tyr Arg Lys Ala Lys Ile

35 40 45

Ile Glu Arg Ile Ile Glu Pro Glu Arg Val Ile Ile Phe Arg Val Pro

50 55 60

Trp Gln Asp Asp Gln Gly Glu Leu His Val Asn Arg Gly Phe Arg Ile

65 70 75 80

Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His

85 90 95

Pro Ser Val Asn Leu Gly Ile Leu Lys Phe Leu Ala Phe Glu Gln Val

100 105 110

Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly

115 120 125

Ala Asp Phe Asp Pro Lys Gly Lys Ser Asp Ser Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Cys Glu Leu Phe Arg His Ile Gly Pro Asp Thr

145 150 155 160

Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Phe

165 170 175

Leu Phe Gly Met Tyr Lys Arg Leu Lys Asn Glu Phe Thr Gly Val Met

180 185 190

Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Val Ile Arg Pro Glu Ala

195 200 205

Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met Leu Lys Thr Arg

210 215 220

Lys Glu Glu Met Lys Gly Lys Thr Cys Leu Val Ser Gly Ser Gly Asn

225 230 235 240

Val Ser Gln Tyr Thr Val Asp Lys Leu Ile Ser Leu Gly Ala Lys Ala

245 250 255

Val Thr Leu Ser Asp Ser Ser Gly Tyr Ile Tyr Asp Glu Ala Gly Ile

260 265 270

Asp Arg Glu Lys Leu Ala Phe Val Met Asp Leu Lys Asn His Arg Arg

275 280 285

Gly Arg Ile Ser Glu Tyr Ala Asp Lys Phe Lys Gly Thr Thr Phe Thr

290 295 300

Ala Val Asp Glu Ala Leu Asp His Asn Pro Ile Trp Asp His Lys Ala

305 310 315 320

Glu Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile Asn Gly Lys Asp

325 330 335

Ala Ala Asn Leu Leu Lys Asn Gly Val Tyr Val Val Ser Glu Gly Ala

340 345 350

Asn Met Pro Thr Thr Ile Asp Gly Val Asn Lys Phe Leu Glu Ala Asn

355 360 365

Ile Leu Phe Gly Pro Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Ile

370 375 380

Ser Gly Leu Glu Met Ala Gln Lys Ser Met Arg Ile Ser Trp Thr Arg

385 390 395 400

Glu Glu Val Asp Thr Arg Leu Phe Asn Ile Met Arg Thr Ile His Glu

405 410 415

Asn Cys His Arg Thr Ser Glu Lys Tyr Gly Thr Pro Gly Asn Tyr Val

420 425 430

Asn Gly Ala Asn Ile Ala Gly Phe Leu Lys Val Ala Asn Ala Met Met

435 440 445

Asp Gln Gly Leu Val

450

<210> 177

<211> 1362

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_22 sequence from an unknown bacterial species from an environmental sample

<400> 177

atgaaagacg acgtttccgc attgatgtcc gaggtcaaag ccaagaaccc gggagaacca 60

gagttccatc aggctgttca ggaagtatcg gaatctctag cgctcgtgct cgatcagcat 120

cccgaatatc ggaaagccaa gatcctcgag cggatcattg agccggagcg cgtaatcatg 180

tttcgcgtgc cctggcaaga cgaccagggt gagctccacg tcaatcgcgg gttccgcatc 240

caaatgaaca gcgcgattgg gccttacaag ggcggcctcc gatttcatcc ttcggtgaac 300

ctcggcatcc tgaaattcct cgccttcgag caggtcttta agaatgcgct taccaccttg 360

cccatgggag gcgggaaggg cggggcggat ttcgatccga aagggaagag cgacagtgaa 420

gtgatgcgct tctgccaggc gttcatgtgt gagctgtttc ggcatattgg gccggacaca 480

gatgtgccgg cgggtgacat tggcgtgggc gcgcgcgaga tcgggtttct tttcgggatg 540

tttaagcggt taaagaatga gttcaccggc gtgatgacgg gcaaaggcct cacgtggggc 600

ggctcggtca ttcgaccgga agcgacggga tacggcgcgg tttactttgc cgccgagatg 660

ctcaagacgc gcaaggaaga gctagccggt aagacttgtc ttgtttcggg cagcggcaac 720

gtcgcgcaat acacggtcga taaacttatc tcgctaggcg cgaaggcggt cactctctcg 780

gattccacgg gttacattta cgatgaggct ggcatcaatc gggaaaagct cgcctttgtc 840

atggatctta agaacaaccg gcgcgggcgg atcgcggaat acgcggataa gttcaagggg 900

gcgactttca cgcccctgaa cgaaacgctc gatcacaatc cgctttggga gcacaaggcc 960

gaatgcgcct ttcccagcgc gacccagaac gagatcaacg gaaaggacgc agcgaatctt 1020

ctgcgcaacg gcgtctatgt ggtttcagaa ggcgcgaaca tgcccacgac catcgacggc 1080

gtgaatcagt tcctggaagc acaaatcctt ttcggcccgg gcaaggcggc gaatgcgggt 1140

ggcgtcgcca cttcgggact cgagatggcg caaaacagca tgcgcatttc ctggacacgc 1200

gaggaagtgg acgcgcgcct gttcaacatc atgaagacga tccacgaggt ctgccaccgg 1260

acagctgaga agtacggcac gccggggaat tatgtgaacg gcgccaacat tgcgggcttt 1320

ctcaaggtcg cgaacgcgat gatggatcag ggattagtgt ag 1362

<210> 178

<211> 453

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_22 sequence from an unknown bacterial species from an environmental sample

<400> 178

Met Lys Asp Asp Val Ser Ala Leu Met Ser Glu Val Lys Ala Lys Asn

1 5 10 15

Pro Gly Glu Pro Glu Phe His Gln Ala Val Gln Glu Val Ser Glu Ser

20 25 30

Leu Ala Leu Val Leu Asp Gln His Pro Glu Tyr Arg Lys Ala Lys Ile

35 40 45

Leu Glu Arg Ile Ile Glu Pro Glu Arg Val Ile Met Phe Arg Val Pro

50 55 60

Trp Gln Asp Asp Gln Gly Glu Leu His Val Asn Arg Gly Phe Arg Ile

65 70 75 80

Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His

85 90 95

Pro Ser Val Asn Leu Gly Ile Leu Lys Phe Leu Ala Phe Glu Gln Val

100 105 110

Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly

115 120 125

Ala Asp Phe Asp Pro Lys Gly Lys Ser Asp Ser Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Cys Glu Leu Phe Arg His Ile Gly Pro Asp Thr

145 150 155 160

Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Phe

165 170 175

Leu Phe Gly Met Phe Lys Arg Leu Lys Asn Glu Phe Thr Gly Val Met

180 185 190

Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Val Ile Arg Pro Glu Ala

195 200 205

Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met Leu Lys Thr Arg

210 215 220

Lys Glu Glu Leu Ala Gly Lys Thr Cys Leu Val Ser Gly Ser Gly Asn

225 230 235 240

Val Ala Gln Tyr Thr Val Asp Lys Leu Ile Ser Leu Gly Ala Lys Ala

245 250 255

Val Thr Leu Ser Asp Ser Thr Gly Tyr Ile Tyr Asp Glu Ala Gly Ile

260 265 270

Asn Arg Glu Lys Leu Ala Phe Val Met Asp Leu Lys Asn Asn Arg Arg

275 280 285

Gly Arg Ile Ala Glu Tyr Ala Asp Lys Phe Lys Gly Ala Thr Phe Thr

290 295 300

Pro Leu Asn Glu Thr Leu Asp His Asn Pro Leu Trp Glu His Lys Ala

305 310 315 320

Glu Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile Asn Gly Lys Asp

325 330 335

Ala Ala Asn Leu Leu Arg Asn Gly Val Tyr Val Val Ser Glu Gly Ala

340 345 350

Asn Met Pro Thr Thr Ile Asp Gly Val Asn Gln Phe Leu Glu Ala Gln

355 360 365

Ile Leu Phe Gly Pro Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr

370 375 380

Ser Gly Leu Glu Met Ala Gln Asn Ser Met Arg Ile Ser Trp Thr Arg

385 390 395 400

Glu Glu Val Asp Ala Arg Leu Phe Asn Ile Met Lys Thr Ile His Glu

405 410 415

Val Cys His Arg Thr Ala Glu Lys Tyr Gly Thr Pro Gly Asn Tyr Val

420 425 430

Asn Gly Ala Asn Ile Ala Gly Phe Leu Lys Val Ala Asn Ala Met Met

435 440 445

Asp Gln Gly Leu Val

450

<210> 179

<211> 1338

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_23 sequence from an unknown bacterial species from an environmental sample

<400> 179

atggccgagg tcaaagccaa gaacccgggc gaagccgagt ttcaccaggc agtccaggaa 60

gtcgctgaat cgctggcgct ggtgctcgat cagcacccgg aatatcggaa ggcgaaaatt 120

ctggagcgaa tcatcgagcc ggagcgcgtc atcatgtttc gggtgccgtg gcaggacgac 180

cagggcgagc tccatgtgaa ccgcgggttt cggattcaga tgaacagcgc gatcgggcct 240

tacaagggcg ggctccgttt tcatccgacg gtgaatcttg gcatcctgaa gttcctcgct 300

ttcgagcagg tctttaaaaa tgcgttgacc acgttgccga tgggcggcgg caaaggcggt 360

tcggatttcg atcccaaagg caaaagcgac agcgaagtga tgcgcttctg tcaggcgttc 420

atgtgcgagc tgttccggca catcggtccg gacacggacg tgccagcggg cgacatcggc 480

gttggtgcgc gcgagattgg atttttgttc gggatgtata agcggcttcg gaacgagttc 540

acgggcgtca tcaccggcaa aggccttacc tggggcggct cggtgattcg tcccgaagcg 600

accggttatg gggcggttta tttcgcggcg gagatgctga agacacgcaa ggaagaattg 660

aaaggcaaga cctgtttggt ttccggcagc ggcaatgtcg cccagtacac agtggacaag 720

ctgatctcgt taggcgcgaa agccgtcacg ctttcggatt ccactggcta catcttcgac 780

gaagccggga tcgatcgtga caagctcgcg ttcgtcatgg atttgaagaa caaccgccgc 840

ggccgcattt ccgaatacgc ggacaagttc aaaggggcgg tcttcacggc ggttgaggcg 900

gcggcggatc ataatccgct ttgggatcac aaagctgagt gcgcatttcc gagcgcgacg 960

cagaacgaga tcaacgcgaa ggatgccgcg aaccttttgc ggaacggcat ctacgtcgtc 1020

tcggaagggg cgaacatgcc gaccacgatc gatggcgtga accagttcct cgatgcgaac 1080

atcctgttcg gcccgggcaa ggcggccaac gcgggcggtg tggcaacttc cggcttggaa 1140

atggcgcaaa acagcatgcg catctcctgg acgcgcgaag aagtcgatgg ccggcttttc 1200

aatatcatga aaaccatcca cgaagtttgc caccgcacgg ccgagaagta cggcacgccg 1260

ggcaactacg tgaacggcgc gaacatcgcc ggcttcctca aggtggcgaa cgcgatgatg 1320

gaccaggggt tggtgtag 1338

<210> 180

<211> 445

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_23 sequence from an unknown bacterial species from an environmental sample

<400> 180

Met Ala Glu Val Lys Ala Lys Asn Pro Gly Glu Ala Glu Phe His Gln

1 5 10 15

Ala Val Gln Glu Val Ala Glu Ser Leu Ala Leu Val Leu Asp Gln His

20 25 30

Pro Glu Tyr Arg Lys Ala Lys Ile Leu Glu Arg Ile Ile Glu Pro Glu

35 40 45

Arg Val Ile Met Phe Arg Val Pro Trp Gln Asp Asp Gln Gly Glu Leu

50 55 60

His Val Asn Arg Gly Phe Arg Ile Gln Met Asn Ser Ala Ile Gly Pro

65 70 75 80

Tyr Lys Gly Gly Leu Arg Phe His Pro Thr Val Asn Leu Gly Ile Leu

85 90 95

Lys Phe Leu Ala Phe Glu Gln Val Phe Lys Asn Ala Leu Thr Thr Leu

100 105 110

Pro Met Gly Gly Gly Lys Gly Gly Ser Asp Phe Asp Pro Lys Gly Lys

115 120 125

Ser Asp Ser Glu Val Met Arg Phe Cys Gln Ala Phe Met Cys Glu Leu

130 135 140

Phe Arg His Ile Gly Pro Asp Thr Asp Val Pro Ala Gly Asp Ile Gly

145 150 155 160

Val Gly Ala Arg Glu Ile Gly Phe Leu Phe Gly Met Tyr Lys Arg Leu

165 170 175

Arg Asn Glu Phe Thr Gly Val Ile Thr Gly Lys Gly Leu Thr Trp Gly

180 185 190

Gly Ser Val Ile Arg Pro Glu Ala Thr Gly Tyr Gly Ala Val Tyr Phe

195 200 205

Ala Ala Glu Met Leu Lys Thr Arg Lys Glu Glu Leu Lys Gly Lys Thr

210 215 220

Cys Leu Val Ser Gly Ser Gly Asn Val Ala Gln Tyr Thr Val Asp Lys

225 230 235 240

Leu Ile Ser Leu Gly Ala Lys Ala Val Thr Leu Ser Asp Ser Thr Gly

245 250 255

Tyr Ile Phe Asp Glu Ala Gly Ile Asp Arg Asp Lys Leu Ala Phe Val

260 265 270

Met Asp Leu Lys Asn Asn Arg Arg Gly Arg Ile Ser Glu Tyr Ala Asp

275 280 285

Lys Phe Lys Gly Ala Val Phe Thr Ala Val Glu Ala Ala Ala Asp His

290 295 300

Asn Pro Leu Trp Asp His Lys Ala Glu Cys Ala Phe Pro Ser Ala Thr

305 310 315 320

Gln Asn Glu Ile Asn Ala Lys Asp Ala Ala Asn Leu Leu Arg Asn Gly

325 330 335

Ile Tyr Val Val Ser Glu Gly Ala Asn Met Pro Thr Thr Ile Asp Gly

340 345 350

Val Asn Gln Phe Leu Asp Ala Asn Ile Leu Phe Gly Pro Gly Lys Ala

355 360 365

Ala Asn Ala Gly Gly Val Ala Thr Ser Gly Leu Glu Met Ala Gln Asn

370 375 380

Ser Met Arg Ile Ser Trp Thr Arg Glu Glu Val Asp Gly Arg Leu Phe

385 390 395 400

Asn Ile Met Lys Thr Ile His Glu Val Cys His Arg Thr Ala Glu Lys

405 410 415

Tyr Gly Thr Pro Gly Asn Tyr Val Asn Gly Ala Asn Ile Ala Gly Phe

420 425 430

Leu Lys Val Ala Asn Ala Met Met Asp Gln Gly Leu Val

435 440 445

<210> 181

<211> 1362

<212> DNA

<213> Unknown

<220>

<223> Gene ID gdh_24 sequence from an unknown bacterial species from an environmental sample

<400> 181

atgaaatacg atgtctccgc tttgatggcc gaggttaagg ccaagaaccc gggcgaaccc 60

gagtttcacc aggccgtcca ggaagtcgtc gaatcgctgg ccctcgtcct ggagcagcat 120

cccgaatatc agaaagccaa gatcatcgag cggatcatcg aacccgagcg cgtgatcatg 180

ttccggatcc cgtggcagga cgacaaaggc gagctccatg tgaaccgcgg gttccgcatc 240

cagatgaaca gcgccatcgg gccctacaag ggcggcctcc ggtttcatcc gtcggttaat 300

ctgggcatcc tgaagttcct cgccttcgag caggttttca aaaacgcgct taccactctt 360

ccaatgggcg ggggcaaagg cggctctgac ttcgatccaa aaggaaagag cgacagcgaa 420

gtgatgcggt tttgccaggc gttcatgtgc gagctgtttc gtcacatcgg cccggatacg 480

gacgtgccgg ccggcgatat cggggtaggc gcgcgcgaga tcggctttct tttcgggatg 540

tataagagac tgaagaatga attcaccgga gtgatgacgg gcaagggtct cacctggggc 600

ggctcggtca ttcgtcctga agccaccgga tacggcgcgg tttatttcgc cgcggaaatg 660

ctcaagacgc gcaaggaaga gctcaaaggc aagacctgtc tcgtttccgg cagcggcaac 720

gtcgcgcaat acaccgtcga caaactgatc tcgttaggcg cgcaggcggt cacgctctcg 780

gattcgaccg gttacatcta cgacgaagcc gggatcgacc gcgacaagct cgcctttgtc 840

atggacctca agaacaaccg gcgcggccgg atcagcgaat acgccgacaa gttcaaaggg 900

gccgagttca ttccggcgga tgcgaagcgc gaccataacc ccctttggga tcacaaggcc 960

gagtgcgctt tcccgagcgc gacccagaac gaaattaacg agaaggacgc ggcgaacctg 1020

atcaagaacg gcgtctatgt cgtctcggaa ggcgcgaaca tgccgaccac gatcgatggc 1080

gtgaaccagt tcctgaaagc cggtatcctt ttcgggcccg gaaaagcggc caatgccgga 1140

ggggttgcga cctccggttt ggaaatggcg cagaacagca tgcgcatttc ctggacccgc 1200

gaagaagtcg acgggcgcct cttcaacatc atgaagacca tccacgaagt ctgtcatcgc 1260

accgcggaaa agtacggcac gcccggcaac tacgtgaacg gcgcgaacat cgccggcttc 1320

ctcaaagtgg ccaacgcgat gatggaccag gggctggtgt aa 1362

<210> 182

<211> 453

<212> PRT

<213> Unknown

<220>

<223> Gene ID gdh_24 sequence from an unknown bacterial species from an environmental sample

<400> 182

Met Lys Tyr Asp Val Ser Ala Leu Met Ala Glu Val Lys Ala Lys Asn

1 5 10 15

Pro Gly Glu Pro Glu Phe His Gln Ala Val Gln Glu Val Val Glu Ser

20 25 30

Leu Ala Leu Val Leu Glu Gln His Pro Glu Tyr Gln Lys Ala Lys Ile

35 40 45

Ile Glu Arg Ile Ile Glu Pro Glu Arg Val Ile Met Phe Arg Ile Pro

50 55 60

Trp Gln Asp Asp Lys Gly Glu Leu His Val Asn Arg Gly Phe Arg Ile

65 70 75 80

Gln Met Asn Ser Ala Ile Gly Pro Tyr Lys Gly Gly Leu Arg Phe His

85 90 95

Pro Ser Val Asn Leu Gly Ile Leu Lys Phe Leu Ala Phe Glu Gln Val

100 105 110

Phe Lys Asn Ala Leu Thr Thr Leu Pro Met Gly Gly Gly Lys Gly Gly

115 120 125

Ser Asp Phe Asp Pro Lys Gly Lys Ser Asp Ser Glu Val Met Arg Phe

130 135 140

Cys Gln Ala Phe Met Cys Glu Leu Phe Arg His Ile Gly Pro Asp Thr

145 150 155 160

Asp Val Pro Ala Gly Asp Ile Gly Val Gly Ala Arg Glu Ile Gly Phe

165 170 175

Leu Phe Gly Met Tyr Lys Arg Leu Lys Asn Glu Phe Thr Gly Val Met

180 185 190

Thr Gly Lys Gly Leu Thr Trp Gly Gly Ser Val Ile Arg Pro Glu Ala

195 200 205

Thr Gly Tyr Gly Ala Val Tyr Phe Ala Ala Glu Met Leu Lys Thr Arg

210 215 220

Lys Glu Glu Leu Lys Gly Lys Thr Cys Leu Val Ser Gly Ser Gly Asn

225 230 235 240

Val Ala Gln Tyr Thr Val Asp Lys Leu Ile Ser Leu Gly Ala Gln Ala

245 250 255

Val Thr Leu Ser Asp Ser Thr Gly Tyr Ile Tyr Asp Glu Ala Gly Ile

260 265 270

Asp Arg Asp Lys Leu Ala Phe Val Met Asp Leu Lys Asn Asn Arg Arg

275 280 285

Gly Arg Ile Ser Glu Tyr Ala Asp Lys Phe Lys Gly Ala Glu Phe Ile

290 295 300

Pro Ala Asp Ala Lys Arg Asp His Asn Pro Leu Trp Asp His Lys Ala

305 310 315 320

Glu Cys Ala Phe Pro Ser Ala Thr Gln Asn Glu Ile Asn Glu Lys Asp

325 330 335

Ala Ala Asn Leu Ile Lys Asn Gly Val Tyr Val Val Ser Glu Gly Ala

340 345 350

Asn Met Pro Thr Thr Ile Asp Gly Val Asn Gln Phe Leu Lys Ala Gly

355 360 365

Ile Leu Phe Gly Pro Gly Lys Ala Ala Asn Ala Gly Gly Val Ala Thr

370 375 380

Ser Gly Leu Glu Met Ala Gln Asn Ser Met Arg Ile Ser Trp Thr Arg

385 390 395 400

Glu Glu Val Asp Gly Arg Leu Phe Asn Ile Met Lys Thr Ile His Glu

405 410 415

Val Cys His Arg Thr Ala Glu Lys Tyr Gly Thr Pro Gly Asn Tyr Val

420 425 430

Asn Gly Ala Asn Ile Ala Gly Phe Leu Lys Val Ala Asn Ala Met Met

435 440 445

Asp Gln Gly Leu Val

450

<210> 183

<211> 1008

<212> DNA

<213> Enterobacter sakazakii

<400> 183

atgatcgacc tgcgctccga caccgttacg cgccctggta aggcgatgct ggaacgtatg 60

atggcggcac ctacgggtga tgacgtatat ggtgatgacc cgaccgtgaa tgcgcttcag 120

gattacgcgg cgcgtctttc tggtaaggag gcggccttat tcttacctac gggaacacaa 180

gcaaatctgg tggctttatt aagccactgc gagcgtggtg aggagtacat tgttggtcag 240

ttggctcata actacctgta cgaagctgga ggggccgccg tgttgggctc tatccagcca 300

caaccaatcg aagctgatgt ggacggtact ttacccctgg acaaggtcgc agcgaaaatc 360

aagccggatg atattcactt tgcgcgcacg cgcctgctga gtttggaaaa cactcataat 420

ggcaaagtcc tgcctcgtga ttacttacaa caggcatggg gatttacgcg cgagcgcggt 480

cttgcgttgc atgtcgatgg ggcacgtatc tttaatgcgg tagtcgcata cggctgtgaa 540

ttaaaggaga ttgctcaata ctgcgatacc ttcactattt gtttgagcaa aggattgggt 600

gctccagtag gatctttgtt ggtagggagt cacgattata ttaagcgcgc caagcgctgg 660

cgcaaaatga ctggcggagg aatgcgccaa gcggggattt tagccgctgc aggcttatat 720

gcgcttgagc acaacgttgc acgccttaaa gaggaccacg acaatgcagc ctggttggcg 780

gctgcgttgc gcgatgctgg ggctgaggta cgtcgccatg acaccaatat gttatttgtc 840

tccgtgcctc aagctcaggt tgcagccctt ggagccttta tgaaatctcg caacgtttta 900

atttccgcag ctcctgttac tcgtttggtt actcatcttg acgttaatcg tgagcagctt 960

gaaacggttg tagcctattg gcgtgaattt ctgcagcaaa cagcctaa 1008

<210> 184

<211> 335

<212> PRT

<213> Enterobacter sakazakii

<400> 184

Met Ile Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Gly Lys Ala Met

1 5 10 15

Leu Glu Arg Met Met Ala Ala Pro Thr Gly Asp Asp Val Tyr Gly Asp

20 25 30

Asp Pro Thr Val Asn Ala Leu Gln Asp Tyr Ala Ala Arg Leu Ser Gly

35 40 45

Lys Glu Ala Ala Leu Phe Leu Pro Thr Gly Thr Gln Ala Asn Leu Val

50 55 60

Ala Leu Leu Ser His Cys Glu Arg Gly Glu Glu Tyr Ile Val Gly Gln

65 70 75 80

Leu Ala His Asn Tyr Leu Tyr Glu Ala Gly Gly Ala Ala Val Leu Gly

85 90 95

Ser Ile Gln Pro Gln Pro Ile Glu Ala Asp Val Asp Gly Thr Leu Pro

100 105 110

Leu Asp Lys Val Ala Ala Lys Ile Lys Pro Asp Asp Ile His Phe Ala

115 120 125

Arg Thr Arg Leu Leu Ser Leu Glu Asn Thr His Asn Gly Lys Val Leu

130 135 140

Pro Arg Asp Tyr Leu Gln Gln Ala Trp Gly Phe Thr Arg Glu Arg Gly

145 150 155 160

Leu Ala Leu His Val Asp Gly Ala Arg Ile Phe Asn Ala Val Val Ala

165 170 175

Tyr Gly Cys Glu Leu Lys Glu Ile Ala Gln Tyr Cys Asp Thr Phe Thr

180 185 190

Ile Cys Leu Ser Lys Gly Leu Gly Ala Pro Val Gly Ser Leu Leu Val

195 200 205

Gly Ser His Asp Tyr Ile Lys Arg Ala Lys Arg Trp Arg Lys Met Thr

210 215 220

Gly Gly Gly Met Arg Gln Ala Gly Ile Leu Ala Ala Ala Gly Leu Tyr

225 230 235 240

Ala Leu Glu His Asn Val Ala Arg Leu Lys Glu Asp His Asp Asn Ala

245 250 255

Ala Trp Leu Ala Ala Ala Leu Arg Asp Ala Gly Ala Glu Val Arg Arg

260 265 270

His Asp Thr Asn Met Leu Phe Val Ser Val Pro Gln Ala Gln Val Ala

275 280 285

Ala Leu Gly Ala Phe Met Lys Ser Arg Asn Val Leu Ile Ser Ala Ala

290 295 300

Pro Val Thr Arg Leu Val Thr His Leu Asp Val Asn Arg Glu Gln Leu

305 310 315 320

Glu Thr Val Val Ala Tyr Trp Arg Glu Phe Leu Gln Gln Thr Ala

325 330 335

<210> 185

<211> 1041

<212> DNA

<213> Unknown

<220>

<223> ltaE_1 sequence from an unknown bacterial species from an environmental sample

<400> 185

atgagagaaa tcgacctgcg cagcgatacc gtcacacgcc cttccgccgc catgcgtgcg 60

gccatggccg cggccgaggt gggggacgac gtctacggtg aagacccgac cgtcaaccag 120

ctcgaggcgc tcgcagcgga gatgctcggc atggatgcgg ggttgttcgt gccatcgggc 180

acgcagggca acctgctcgg cgtgatgtcc cactgcgagc gcggcgacga atacatcgtc 240

gggcaacagg cgcataccta taagtacgag gggggcggcg ctgccgtgct gggcagcatc 300

cagccacagc cgctcgagtt ccagcgggat gggacgctcg atctcgctca agtcgccaac 360

gcgatcaaac ctgacgacac gcactttgcg cggacgcgtc tcctgtgcct cgaaaacact 420

caggatggca agccgctgcc gctggaatat ctggagcgcg cccatgcgtt cgcgcgcgaa 480

cgcaggctcg ggttgcacct ggacggcgcc cggctgttca acgccgcggt ggaccaggag 540

gtggcgccgc agcgaatcgc gcgcctcttt gacaccgtat cagtgtgtct gtcgaaaggt 600

ctcggcgcgc cggtcggatc ggtgttgtgc ggaaccgccg cgcatatgac caaggcgaga 660

cgctggcgga aagtgctcgg tggcgggatg cgacaggctg gcgtgctggc ggccgccggc 720

atctacgcgc tgcagaacaa cgtcaatcgg cttgctgagg atcatgcgaa cgcacggctc 780

ctcgcgacgc tcctctcgcg aatcgacaag gtcaccgtgg agtccgtcca gaccaacatg 840

gtcttcgcca gggtcgaccc gtcgcacgag ccgcacctgc gccagttcct cacgcatcga 900

catatccgga tccaccccgg cccccggctc aggctggtca cccatctcga cgtccaacgc 960

gacgacgtgg ttgcctttgc cgacgcggtg acggccttct acgcagggag cctgccgcca 1020

acggtcgccg cgcaggcctg a 1041

<210> 186

<211> 346

<212> PRT

<213> Unknown

<220>

<223> ltaE_1 sequence from an unknown bacterial species from an environmental sample

<400> 186

Met Arg Glu Ile Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser Ala

1 5 10 15

Ala Met Arg Ala Ala Met Ala Ala Ala Glu Val Gly Asp Asp Val Tyr

20 25 30

Gly Glu Asp Pro Thr Val Asn Gln Leu Glu Ala Leu Ala Ala Glu Met

35 40 45

Leu Gly Met Asp Ala Gly Leu Phe Val Pro Ser Gly Thr Gln Gly Asn

50 55 60

Leu Leu Gly Val Met Ser His Cys Glu Arg Gly Asp Glu Tyr Ile Val

65 70 75 80

Gly Gln Gln Ala His Thr Tyr Lys Tyr Glu Gly Gly Gly Ala Ala Val

85 90 95

Leu Gly Ser Ile Gln Pro Gln Pro Leu Glu Phe Gln Arg Asp Gly Thr

100 105 110

Leu Asp Leu Ala Gln Val Ala Asn Ala Ile Lys Pro Asp Asp Thr His

115 120 125

Phe Ala Arg Thr Arg Leu Leu Cys Leu Glu Asn Thr Gln Asp Gly Lys

130 135 140

Pro Leu Pro Leu Glu Tyr Leu Glu Arg Ala His Ala Phe Ala Arg Glu

145 150 155 160

Arg Arg Leu Gly Leu His Leu Asp Gly Ala Arg Leu Phe Asn Ala Ala

165 170 175

Val Asp Gln Glu Val Ala Pro Gln Arg Ile Ala Arg Leu Phe Asp Thr

180 185 190

Val Ser Val Cys Leu Ser Lys Gly Leu Gly Ala Pro Val Gly Ser Val

195 200 205

Leu Cys Gly Thr Ala Ala His Met Thr Lys Ala Arg Arg Trp Arg Lys

210 215 220

Val Leu Gly Gly Gly Met Arg Gln Ala Gly Val Leu Ala Ala Ala Gly

225 230 235 240

Ile Tyr Ala Leu Gln Asn Asn Val Asn Arg Leu Ala Glu Asp His Ala

245 250 255

Asn Ala Arg Leu Leu Ala Thr Leu Leu Ser Arg Ile Asp Lys Val Thr

260 265 270

Val Glu Ser Val Gln Thr Asn Met Val Phe Ala Arg Val Asp Pro Ser

275 280 285

His Glu Pro His Leu Arg Gln Phe Leu Thr His Arg His Ile Arg Ile

290 295 300

His Pro Gly Pro Arg Leu Arg Leu Val Thr His Leu Asp Val Gln Arg

305 310 315 320

Asp Asp Val Val Ala Phe Ala Asp Ala Val Thr Ala Phe Tyr Ala Gly

325 330 335

Ser Leu Pro Pro Thr Val Ala Ala Gln Ala

340 345

<210> 187

<211> 1017

<212> DNA

<213> Unknown

<220>

<223> ltaE_2 sequence from an unknown bacterial species from an environmental sample

<400> 187

gtgaagatag tcgatctgcg cagtgacacc gtgacccggc cgtcgccagg catgcgtgcc 60

gccatggctg cggcggaagt cggggacgac gtctacggcg aggaccccac ggtcaaccgt 120

ctcgaagcga tgaccgccga gatgctcggc aaggaagccg cgatcttcgt ctgcagcggc 180

acgcagagca acctgctcgc gctgatgtcc cattgcgagc gcggcgacga gtacatcgtc 240

ggacagcaag cgcacaccta caagttggaa ggcggcgggg cggcggtgct cggcagcatc 300

cagccgcagc cgctggacta cgagccggat ggatcgctcg acctcacccg tgtcgaagcc 360

gcgatcaagc ccgatgatcc acatttcgcc aagacccgcc tgctgtgtct ggagaacacc 420

caggccggca aggtgctgtc gctcgactac ctcgcgcgcg cgggccagtt cgcccgggcg 480

aacggacttc gcctgcatct cgacggcgcg cgcatcttca acgcggcggt cgatctcggc 540

gtcgcggtca tcgagatcag ccggcatttc gactcggtat cggtgtgtct gtcgaagggg 600

ctcggcgcgc cggtcggctc gatcctgtgc ggcacgcgtg agcggatcgt cagcgcgcgg 660

cgctggcgca aagtgctcgg cggcggcatg cgccaagccg gcgtgctggc ggcggcgggg 720

atctacgcgc tcgagcacaa catcgagcgg ttggccgagg atcacgagaa cgcgcgcgcg 780

ctcgtcgacg gaatggcgga gatcgacgag ctgaagacgg acggtccaca caccaacatg 840

gtgtacatcg cgctcgagcc gcggcggtcc gtggcgatgc gcaactatct ggaagagcgt 900

ggcatgcggg tgaagggcca gggaaccatg cggctggtga cgcacctcga cgtcgatcgg 960

agcgacatcc agcgattcgt ggcagcggcg aagcagttct tcgccgacgc ggcctga 1017

<210> 188

<211> 338

<212> PRT

<213> Unknown

<220>

<223> ltaE_2 sequence from an unknown bacterial species from an environmental sample

<400> 188

Val Lys Ile Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser Pro

1 5 10 15

Gly Met Arg Ala Ala Met Ala Ala Ala Glu Val Gly Asp Asp Val Tyr

20 25 30

Gly Glu Asp Pro Thr Val Asn Arg Leu Glu Ala Met Thr Ala Glu Met

35 40 45

Leu Gly Lys Glu Ala Ala Ile Phe Val Cys Ser Gly Thr Gln Ser Asn

50 55 60

Leu Leu Ala Leu Met Ser His Cys Glu Arg Gly Asp Glu Tyr Ile Val

65 70 75 80

Gly Gln Gln Ala His Thr Tyr Lys Leu Glu Gly Gly Gly Ala Ala Val

85 90 95

Leu Gly Ser Ile Gln Pro Gln Pro Leu Asp Tyr Glu Pro Asp Gly Ser

100 105 110

Leu Asp Leu Thr Arg Val Glu Ala Ala Ile Lys Pro Asp Asp Pro His

115 120 125

Phe Ala Lys Thr Arg Leu Leu Cys Leu Glu Asn Thr Gln Ala Gly Lys

130 135 140

Val Leu Ser Leu Asp Tyr Leu Ala Arg Ala Gly Gln Phe Ala Arg Ala

145 150 155 160

Asn Gly Leu Arg Leu His Leu Asp Gly Ala Arg Ile Phe Asn Ala Ala

165 170 175

Val Asp Leu Gly Val Ala Val Ile Glu Ile Ser Arg His Phe Asp Ser

180 185 190

Val Ser Val Cys Leu Ser Lys Gly Leu Gly Ala Pro Val Gly Ser Ile

195 200 205

Leu Cys Gly Thr Arg Glu Arg Ile Val Ser Ala Arg Arg Trp Arg Lys

210 215 220

Val Leu Gly Gly Gly Met Arg Gln Ala Gly Val Leu Ala Ala Ala Gly

225 230 235 240

Ile Tyr Ala Leu Glu His Asn Ile Glu Arg Leu Ala Glu Asp His Glu

245 250 255

Asn Ala Arg Ala Leu Val Asp Gly Met Ala Glu Ile Asp Glu Leu Lys

260 265 270

Thr Asp Gly Pro His Thr Asn Met Val Tyr Ile Ala Leu Glu Pro Arg

275 280 285

Arg Ser Val Ala Met Arg Asn Tyr Leu Glu Glu Arg Gly Met Arg Val

290 295 300

Lys Gly Gln Gly Thr Met Arg Leu Val Thr His Leu Asp Val Asp Arg

305 310 315 320

Ser Asp Ile Gln Arg Phe Val Ala Ala Ala Lys Gln Phe Phe Ala Asp

325 330 335

Ala Ala

<210> 189

<211> 1101

<212> DNA

<213> Unknown

<220>

<223> ltaE_3 sequence from an unknown bacterial species from an environmental sample

<400> 189

gtgtcaatta gcgtggtgga gatcggattg accggtctca ggccgccgct aattggcggc 60

atgaagacga tcgatttgcg cagcgacacg gtgacccggc cctgtgaagg gatgcggcgg 120

gcgatggcgg ccgcggaggt cggagatgat gtgtttggcg acgatccgac cgtgcgtcgt 180

ctcgaggcgc tggctgcgga aatgctgggg aaggaagcgg cggtgttcgc ggcgagcggc 240

acgcagacca atctcatctc cctcctgaca cattgcgggc gcggcgacga gtatatcgtc 300

gcccagcagg cgcacacgta tctccacgag ggcggtggcg gcgcggcgct gggcggcatc 360

cagccccagc cgctcgattg tctcccggat ggcaccctcg accttgccaa agtcgaagcg 420

cttatcaagc cggacgactc tcactacgcc cgcaccagac tgctttgcct cgagaacacc 480

atcggtggcc gtgcgctgcc ggcggactat cctgacaagg cccgcgcgtt gaccgaccgc 540

cgcggactac gcctccatct cgatggcgcc cgcatcttca atgctgcgat caagcagaac 600

cggcccgttg cggaactcgc gcggccgttt cacagcgtat cactctgcct ctcgaaggga 660

ctcggcgcgc cggtgggttc gctgctgctg gggagcgaag acttcatccg cgaggcccgc 720

cgctggcgaa aagtcgtcgg cggcggcatg cgtcaggcgg gaattctggc ggcggccggc 780

attttcgccc tgaccaaaaa cgtcgcccgg ctcgcagacg accacgagaa tgcccggcgc 840

ctcgccgacg gcctcgcagg cctggacggc ggccgctttt tcgtcgatcc tgccgctgtg 900

cagacgaaca tggtttttgt caggctcaac ggcatcgatg ccgcgacgct ggcggcacat 960

cttgcggagg cgggcatcct cattctcaaa ggaaatcccc tgagactcgt cacgcatcgc 1020

gacgtcgagg ccacggatat cgagcgtgcg atccgcgcct tcgcaacgtt tccgccaggg 1080

ccagattcag tgaatctgtg a 1101

<210> 190

<211> 366

<212> PRT

<213> Unknown

<220>

<223> ltaE_3 sequence from an unknown bacterial species from an environmental sample

<400> 190

Val Ser Ile Ser Val Val Glu Ile Gly Leu Thr Gly Leu Arg Pro Pro

1 5 10 15

Leu Ile Gly Gly Met Lys Thr Ile Asp Leu Arg Ser Asp Thr Val Thr

20 25 30

Arg Pro Cys Glu Gly Met Arg Arg Ala Met Ala Ala Ala Glu Val Gly

35 40 45

Asp Asp Val Phe Gly Asp Asp Pro Thr Val Arg Arg Leu Glu Ala Leu

50 55 60

Ala Ala Glu Met Leu Gly Lys Glu Ala Ala Val Phe Ala Ala Ser Gly

65 70 75 80

Thr Gln Thr Asn Leu Ile Ser Leu Leu Thr His Cys Gly Arg Gly Asp

85 90 95

Glu Tyr Ile Val Ala Gln Gln Ala His Thr Tyr Leu His Glu Gly Gly

100 105 110

Gly Gly Ala Ala Leu Gly Gly Ile Gln Pro Gln Pro Leu Asp Cys Leu

115 120 125

Pro Asp Gly Thr Leu Asp Leu Ala Lys Val Glu Ala Leu Ile Lys Pro

130 135 140

Asp Asp Ser His Tyr Ala Arg Thr Arg Leu Leu Cys Leu Glu Asn Thr

145 150 155 160

Ile Gly Gly Arg Ala Leu Pro Ala Asp Tyr Pro Asp Lys Ala Arg Ala

165 170 175

Leu Thr Asp Arg Arg Gly Leu Arg Leu His Leu Asp Gly Ala Arg Ile

180 185 190

Phe Asn Ala Ala Ile Lys Gln Asn Arg Pro Val Ala Glu Leu Ala Arg

195 200 205

Pro Phe His Ser Val Ser Leu Cys Leu Ser Lys Gly Leu Gly Ala Pro

210 215 220

Val Gly Ser Leu Leu Leu Gly Ser Glu Asp Phe Ile Arg Glu Ala Arg

225 230 235 240

Arg Trp Arg Lys Val Val Gly Gly Gly Met Arg Gln Ala Gly Ile Leu

245 250 255

Ala Ala Ala Gly Ile Phe Ala Leu Thr Lys Asn Val Ala Arg Leu Ala

260 265 270

Asp Asp His Glu Asn Ala Arg Arg Leu Ala Asp Gly Leu Ala Gly Leu

275 280 285

Asp Gly Gly Arg Phe Phe Val Asp Pro Ala Ala Val Gln Thr Asn Met

290 295 300

Val Phe Val Arg Leu Asn Gly Ile Asp Ala Ala Thr Leu Ala Ala His

305 310 315 320

Leu Ala Glu Ala Gly Ile Leu Ile Leu Lys Gly Asn Pro Leu Arg Leu

325 330 335

Val Thr His Arg Asp Val Glu Ala Thr Asp Ile Glu Arg Ala Ile Arg

340 345 350

Ala Phe Ala Thr Phe Pro Pro Gly Pro Asp Ser Val Asn Leu

355 360 365

<210> 191

<211> 1047

<212> DNA

<213> Unknown

<220>

<223> ltaE_4 sequence from an unknown bacterial species from an environmental sample

<400> 191

atgctagtgg atgacttccg ttcggatacg gtcacacgcc ccgatgaggg gatgcgttct 60

gccatggctg cggccgaggt tggcgacgca gtctacgatg actgcccgac caccaagcgc 120

ctggaagcca tggccgccga gcgattggga aaggaagcct cgcttttctt ccccaccggc 180

acgcaagcga acctggccgg gctcatggcc cattgcggcc gcggcgacga atatctcgtt 240

ggccagatgg ctcacaccta tcggctcgag ggcggcggcg ccgcagtcct ggggtcgatc 300

cagccgcagc ccctgcccaa cgagccagat gggacgatcg cgctggacgc catttcgaac 360

gcgatcaagc caaacgactt tcactttgca gtcacacggc tgctggccct ggaaaatacg 420

ttcaacggcc tggtgctgtc cgacgcatac ctgcaagccg caactcaact cgcccgctct 480

cgcgggcttg cgacgcatct ggacggtgcc cggttgatga acgctgcagc ggcgagcggc 540

cgggacgcgg catggatcgc ggcgcagttc gacaccgttt ccatgtgtct ctccaagggg 600

ctgggagcgc cggttggatc cgtcctcatc ggccccaggg acttcatcaa aaaggcccgc 660

cggatcagga aaatgctggg cggtggcatg cgccagaccg gcgtgctcgc cggggccgcg 720

atctatgcat tggagcacaa cgtcgcgcgg ctcagtgaag atcatcggcg cgccgctgat 780

ctcgcgactg tgctggcgcg ttttcccgag ctgggcgcgg gaccgtcacg aacgaacatg 840

gtgttcatga cgcccaaggg gctcgacgtt agcgctttcg tcgcgtttct gcgcggccgc 900

ggcatcgccg tcagcgggag atatggaacg ctgcgttggg tgacccatct ggacgttggt 960

gacgattccg tggagcgggt cgccgaggcc tgcgaggtct tctttgaagg gcaaaacgcg 1020

gccgccggat tgagagtgca agcctaa 1047

<210> 192

<211> 348

<212> PRT

<213> Unknown

<220>

<223> ltaE_4 sequence from an unknown bacterial species from an environmental sample

<400> 192

Met Leu Val Asp Asp Phe Arg Ser Asp Thr Val Thr Arg Pro Asp Glu

1 5 10 15

Gly Met Arg Ser Ala Met Ala Ala Ala Glu Val Gly Asp Ala Val Tyr

20 25 30

Asp Asp Cys Pro Thr Thr Lys Arg Leu Glu Ala Met Ala Ala Glu Arg

35 40 45

Leu Gly Lys Glu Ala Ser Leu Phe Phe Pro Thr Gly Thr Gln Ala Asn

50 55 60

Leu Ala Gly Leu Met Ala His Cys Gly Arg Gly Asp Glu Tyr Leu Val

65 70 75 80

Gly Gln Met Ala His Thr Tyr Arg Leu Glu Gly Gly Gly Ala Ala Val

85 90 95

Leu Gly Ser Ile Gln Pro Gln Pro Leu Pro Asn Glu Pro Asp Gly Thr

100 105 110

Ile Ala Leu Asp Ala Ile Ser Asn Ala Ile Lys Pro Asn Asp Phe His

115 120 125

Phe Ala Val Thr Arg Leu Leu Ala Leu Glu Asn Thr Phe Asn Gly Leu

130 135 140

Val Leu Ser Asp Ala Tyr Leu Gln Ala Ala Thr Gln Leu Ala Arg Ser

145 150 155 160

Arg Gly Leu Ala Thr His Leu Asp Gly Ala Arg Leu Met Asn Ala Ala

165 170 175

Ala Ala Ser Gly Arg Asp Ala Ala Trp Ile Ala Ala Gln Phe Asp Thr

180 185 190

Val Ser Met Cys Leu Ser Lys Gly Leu Gly Ala Pro Val Gly Ser Val

195 200 205

Leu Ile Gly Pro Arg Asp Phe Ile Lys Lys Ala Arg Arg Ile Arg Lys

210 215 220

Met Leu Gly Gly Gly Met Arg Gln Thr Gly Val Leu Ala Gly Ala Ala

225 230 235 240

Ile Tyr Ala Leu Glu His Asn Val Ala Arg Leu Ser Glu Asp His Arg

245 250 255

Arg Ala Ala Asp Leu Ala Thr Val Leu Ala Arg Phe Pro Glu Leu Gly

260 265 270

Ala Gly Pro Ser Arg Thr Asn Met Val Phe Met Thr Pro Lys Gly Leu

275 280 285

Asp Val Ser Ala Phe Val Ala Phe Leu Arg Gly Arg Gly Ile Ala Val

290 295 300

Ser Gly Arg Tyr Gly Thr Leu Arg Trp Val Thr His Leu Asp Val Gly

305 310 315 320

Asp Asp Ser Val Glu Arg Val Ala Glu Ala Cys Glu Val Phe Phe Glu

325 330 335

Gly Gln Asn Ala Ala Ala Gly Leu Arg Val Gln Ala

340 345

<210> 193

<211> 1035

<212> DNA

<213> Unknown

<220>

<223> ltaE_5 sequence from an unknown bacterial species from an environmental sample

<400> 193

atgagcccaa tccgccacga tttccgctcc gacaccgtta cgcgtccgag ccccgctatg 60

cgggaggcga tggggcgggc cgaggtcggc gacgacgtgt ttggcgggga tccgaccgtg 120

aacgcgctcg aggccgagac cgccgaactg ctaggcaaag aggccgggct attcctaccg 180

tccggcacgc aatcgaacct cgtcgccctg atggcccatt gtgggcgggg cgacgaatac 240

atcacgggcc agcaggcgca ttgctatagg tgggaagcgg gcggggctgc cgtgctcggc 300

tcaatccagc cgcagccgat cgccaatgcc gctgacggca ccctcccgct cgccgagatc 360

gaagcggcta tcaagccgga cgatccgcat tacgcgacga cccggctcct ggcactcgag 420

aacacgatcg gcggcaaggt cctgccgcag gattacgtga ttgcggctac cgcgctcgcc 480

cggaagcaca ggctggcctg ccacctcgac ggcgcccggc tgtgcaatgc cgctgtggca 540

cagaacacga gcgcggccga gctcgccgcg ccgttcgaca cggtttcgct ctgcctctcg 600

aaggggctgg gcgcgcccgt cggatcggtg ctggtcgggc cgcgcgacct gatcggcaag 660

gcgcgccgca tccggaagat ggtgggcggg ggcatgcgcc aggcgggcgt gattgcggcg 720

ggtgccctct atgcactccg ccacaacatc gctcggctcg cggacgacca cgccaacgcc 780

gcgcgtctcg cgaagggtct ggccggcctg ccggggctct cggttgaggc ctccgggacc 840

aacatcgttt tcgtcgaggt ggaccgcgcg atcgcggagg cctttgcggg ccatctcgct 900

gcggccagcg tcggggtcac tggaaccacc cgccaacgct gggtcactca cctcgatgtc 960

ggccccgccg acgtcgacgc ggcgctggtt gcggcccagg ccttcttcac ggctgcccgt 1020

caggccgccg agtag 1035

<210> 194

<211> 344

<212> PRT

<213> Unknown

<220>

<223> ltaE_5 sequence from an unknown bacterial species from an environmental sample

<400> 194

Met Ser Pro Ile Arg His Asp Phe Arg Ser Asp Thr Val Thr Arg Pro

1 5 10 15

Ser Pro Ala Met Arg Glu Ala Met Gly Arg Ala Glu Val Gly Asp Asp

20 25 30

Val Phe Gly Gly Asp Pro Thr Val Asn Ala Leu Glu Ala Glu Thr Ala

35 40 45

Glu Leu Leu Gly Lys Glu Ala Gly Leu Phe Leu Pro Ser Gly Thr Gln

50 55 60

Ser Asn Leu Val Ala Leu Met Ala His Cys Gly Arg Gly Asp Glu Tyr

65 70 75 80

Ile Thr Gly Gln Gln Ala His Cys Tyr Arg Trp Glu Ala Gly Gly Ala

85 90 95

Ala Val Leu Gly Ser Ile Gln Pro Gln Pro Ile Ala Asn Ala Ala Asp

100 105 110

Gly Thr Leu Pro Leu Ala Glu Ile Glu Ala Ala Ile Lys Pro Asp Asp

115 120 125

Pro His Tyr Ala Thr Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Gly

130 135 140

Gly Lys Val Leu Pro Gln Asp Tyr Val Ile Ala Ala Thr Ala Leu Ala

145 150 155 160

Arg Lys His Arg Leu Ala Cys His Leu Asp Gly Ala Arg Leu Cys Asn

165 170 175

Ala Ala Val Ala Gln Asn Thr Ser Ala Ala Glu Leu Ala Ala Pro Phe

180 185 190

Asp Thr Val Ser Leu Cys Leu Ser Lys Gly Leu Gly Ala Pro Val Gly

195 200 205

Ser Val Leu Val Gly Pro Arg Asp Leu Ile Gly Lys Ala Arg Arg Ile

210 215 220

Arg Lys Met Val Gly Gly Gly Met Arg Gln Ala Gly Val Ile Ala Ala

225 230 235 240

Gly Ala Leu Tyr Ala Leu Arg His Asn Ile Ala Arg Leu Ala Asp Asp

245 250 255

His Ala Asn Ala Ala Arg Leu Ala Lys Gly Leu Ala Gly Leu Pro Gly

260 265 270

Leu Ser Val Glu Ala Ser Gly Thr Asn Ile Val Phe Val Glu Val Asp

275 280 285

Arg Ala Ile Ala Glu Ala Phe Ala Gly His Leu Ala Ala Ala Ser Val

290 295 300

Gly Val Thr Gly Thr Thr Arg Gln Arg Trp Val Thr His Leu Asp Val

305 310 315 320

Gly Pro Ala Asp Val Asp Ala Ala Leu Val Ala Ala Gln Ala Phe Phe

325 330 335

Thr Ala Ala Arg Gln Ala Ala Glu

340

<210> 195

<211> 1032

<212> DNA

<213> Unknown

<220>

<223> ltaE_6 sequence from an unknown bacterial species from an environmental sample

<400> 195

atgactgaac gctggatcga cctgcgcagc gacaccgtca cccatccgaa tgacgccatg 60

cgcgccgtca tgggctccgc cagcgttggc gacgacgtct acgccgacga tcccagcgtg 120

aaccggctgc aggcgacggc cgctgagata ttcggcttcg aggccggcct gttcgcgccg 180

tccggcacgc agaccaacct gatcgcgctg atgacccatt gcgggcgcgg cgacgaatac 240

ctggtcgggc aggaagccca cacctacaag tacgaaggcg gcggcgcggc tgttctgggc 300

agcatccagc cgcagccgat tgtcaaccag gcggacggct cgattgccct agctgacatc 360

gccgccgcca tcaagcccga cgacatgcac ttcgcgcgca cgcgactgct ggcgctggag 420

aacaccattg gcgggcgcgt gctgggccgg gactacctgc ttgcggccac cggcctggcg 480

cacgagcggg gacttgccac ccacctggac ggggcgcgta tctgcaacgc cgccgtccag 540

cagggcatca gcctgatgga cgcggccacg ggtttcgaca gcgtgtcggt ctgcctgtcc 600

aagggcctgg gggcgccggt gggctcggtg ctgctcgggc cgcgcggctt catcgaggcc 660

ggcaagcgct ggcgcaagat gctgggcggc ggcatgcgcc aggccggcat cctggcagcg 720

gccggcctgt acgcgctgga gcacaacgtc gagcgcctgg ccgaggacca cgccaacgcg 780

gccgcgctgg ccgacggact gcgggtcatc gaacagctga aggtcagcac gccgcagacc 840

aacatcttct atgtcgagat tccggcggat gcctgcgatg gcttgcgcga agcgttggcg 900

cgcgcacaca tccgcgccag tatcggtccg cacacgcgct tggtcacgca tctcgacgtc 960

aaggccgagg acgtgaagac ggttgtcgac gcattcaccc gctttttcgc cggctggggg 1020

gcatccgcat ga 1032

<210> 196

<211> 343

<212> PRT

<213> Unknown

<220>

<223> ltaE_6 sequence from an unknown bacterial species from an environmental sample

<400> 196

Met Thr Glu Arg Trp Ile Asp Leu Arg Ser Asp Thr Val Thr His Pro

1 5 10 15

Asn Asp Ala Met Arg Ala Val Met Gly Ser Ala Ser Val Gly Asp Asp

20 25 30

Val Tyr Ala Asp Asp Pro Ser Val Asn Arg Leu Gln Ala Thr Ala Ala

35 40 45

Glu Ile Phe Gly Phe Glu Ala Gly Leu Phe Ala Pro Ser Gly Thr Gln

50 55 60

Thr Asn Leu Ile Ala Leu Met Thr His Cys Gly Arg Gly Asp Glu Tyr

65 70 75 80

Leu Val Gly Gln Glu Ala His Thr Tyr Lys Tyr Glu Gly Gly Gly Ala

85 90 95

Ala Val Leu Gly Ser Ile Gln Pro Gln Pro Ile Val Asn Gln Ala Asp

100 105 110

Gly Ser Ile Ala Leu Ala Asp Ile Ala Ala Ala Ile Lys Pro Asp Asp

115 120 125

Met His Phe Ala Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Gly

130 135 140

Gly Arg Val Leu Gly Arg Asp Tyr Leu Leu Ala Ala Thr Gly Leu Ala

145 150 155 160

His Glu Arg Gly Leu Ala Thr His Leu Asp Gly Ala Arg Ile Cys Asn

165 170 175

Ala Ala Val Gln Gln Gly Ile Ser Leu Met Asp Ala Ala Thr Gly Phe

180 185 190

Asp Ser Val Ser Val Cys Leu Ser Lys Gly Leu Gly Ala Pro Val Gly

195 200 205

Ser Val Leu Leu Gly Pro Arg Gly Phe Ile Glu Ala Gly Lys Arg Trp

210 215 220

Arg Lys Met Leu Gly Gly Gly Met Arg Gln Ala Gly Ile Leu Ala Ala

225 230 235 240

Ala Gly Leu Tyr Ala Leu Glu His Asn Val Glu Arg Leu Ala Glu Asp

245 250 255

His Ala Asn Ala Ala Ala Leu Ala Asp Gly Leu Arg Val Ile Glu Gln

260 265 270

Leu Lys Val Ser Thr Pro Gln Thr Asn Ile Phe Tyr Val Glu Ile Pro

275 280 285

Ala Asp Ala Cys Asp Gly Leu Arg Glu Ala Leu Ala Arg Ala His Ile

290 295 300

Arg Ala Ser Ile Gly Pro His Thr Arg Leu Val Thr His Leu Asp Val

305 310 315 320

Lys Ala Glu Asp Val Lys Thr Val Val Asp Ala Phe Thr Arg Phe Phe

325 330 335

Ala Gly Trp Gly Ala Ser Ala

340

<210> 197

<211> 1494

<212> DNA

<213> Unknown

<220>

<223> ltaE_7 sequence from an unknown bacterial species from an environmental sample

<400> 197

gtatacaagc gagagctgaa ctggctgcca tttttcggat ggggaattgc atcggcgcag 60

atgattagca tcgatcgcag caaaggtcag gacgcgttcg aacaagtcgt cgagcagggc 120

aacgatcggt tggcacgcgg ctggtggatc gtaatttttc ccgaaggtac gcgcatgcgg 180

cccggcacga tgaagcggta caagaccggt ggtgcacgtt tggcagtgcg aaccggcgct 240

gttgtcgtac cgattgcact gaactcgggc gagtattggc cgaagcattc gttaatcaag 300

acgcccggca tcattacggt gtccatcggt caaccgatcg atcctcgcga caagacggct 360

gaagaaatca gtgcgcaagt cgaatcatgg atcgaatcag aaatgcggcg gctggcgccg 420

catcgttaca gcgctccata cactccggag ccccgggtcc aaaccgtggc cgcgcaaagc 480

cttcgcacac cgattgctat gacaattgtc gatttacgct ccgatacagt cacacgtcca 540

tcgcccggta tgcgcaaagc gatgatggac gcagaagtgg gcgatgatgt gttcggcgac 600

gatcccaccg tcaaccggtt gcaggcgcgt gccgccgaga tattcggctt cgaatcggca 660

ctgctttttc cgtcaggcac gcaatctaac ctcgcagcgc tgatgagcca ttgccagcgc 720

ggcgatgagg taatcgtcgg caagctggca cacacttatc gtaacgaagc cggaggcgca 780

gccgtgctcg gctcgatcca accgcacgtc atcacgaatc gcgcggacgg ctcacttgat 840

ctcgctgaag tcgaggcggc gatcaagccc gatgacccgc atttcgctcg aacccgactg 900

cttgcgctcg aaaatacgat ctcaggcaaa gtactgtcga ggtcttatct cgaaaaggcg 960

ttgcagttgg cggaggcaaa gaaactttct gcacacctcg atggcgcgcg catcttcaat 1020

gcggcggtcg atcagaagat caaagtgaac gagctgtgcg cgggcttcga ctccgtatcg 1080

gcgtgtctat cgaagggact cggcgctccc gccggaactg tattgctcgg aagcaaggat 1140

ttgatcgagc gcgcgaagcg caaccgaaag atcctcggtg gcgcgatgcg ccaggcggga 1200

atcatcgctg cggccggcct ttacgcactg cagaacaaca tcgagcggtt gcaaagcgat 1260

catgacaatg ccaagcggct ggctgacgga ttgagagcgt tgaagctcga cgtcgaacaa 1320

catacgaaca tggtattcgt aaacgtgtca gccgaacatg ccgctgcgct cgctgcacac 1380

cttggacgaa gcggcgtgat cgtactgccg tgggcgccaa tgcgtttggt cacgcacctc 1440

gacgtcgacc gcgacggcat cgagcgagcg ctcgatgccg tcgcggaatt cgtt 1494

<210> 198

<211> 498

<212> PRT

<213> Unknown

<220>

<223> ltaE_7 sequence from an unknown bacterial species from an environmental sample

<400> 198

Val Tyr Lys Arg Glu Leu Asn Trp Leu Pro Phe Phe Gly Trp Gly Ile

1 5 10 15

Ala Ser Ala Gln Met Ile Ser Ile Asp Arg Ser Lys Gly Gln Asp Ala

20 25 30

Phe Glu Gln Val Val Glu Gln Gly Asn Asp Arg Leu Ala Arg Gly Trp

35 40 45

Trp Ile Val Ile Phe Pro Glu Gly Thr Arg Met Arg Pro Gly Thr Met

50 55 60

Lys Arg Tyr Lys Thr Gly Gly Ala Arg Leu Ala Val Arg Thr Gly Ala

65 70 75 80

Val Val Val Pro Ile Ala Leu Asn Ser Gly Glu Tyr Trp Pro Lys His

85 90 95

Ser Leu Ile Lys Thr Pro Gly Ile Ile Thr Val Ser Ile Gly Gln Pro

100 105 110

Ile Asp Pro Arg Asp Lys Thr Ala Glu Glu Ile Ser Ala Gln Val Glu

115 120 125

Ser Trp Ile Glu Ser Glu Met Arg Arg Leu Ala Pro His Arg Tyr Ser

130 135 140

Ala Pro Tyr Thr Pro Glu Pro Arg Val Gln Thr Val Ala Ala Gln Ser

145 150 155 160

Leu Arg Thr Pro Ile Ala Met Thr Ile Val Asp Leu Arg Ser Asp Thr

165 170 175

Val Thr Arg Pro Ser Pro Gly Met Arg Lys Ala Met Met Asp Ala Glu

180 185 190

Val Gly Asp Asp Val Phe Gly Asp Asp Pro Thr Val Asn Arg Leu Gln

195 200 205

Ala Arg Ala Ala Glu Ile Phe Gly Phe Glu Ser Ala Leu Leu Phe Pro

210 215 220

Ser Gly Thr Gln Ser Asn Leu Ala Ala Leu Met Ser His Cys Gln Arg

225 230 235 240

Gly Asp Glu Val Ile Val Gly Lys Leu Ala His Thr Tyr Arg Asn Glu

245 250 255

Ala Gly Gly Ala Ala Val Leu Gly Ser Ile Gln Pro His Val Ile Thr

260 265 270

Asn Arg Ala Asp Gly Ser Leu Asp Leu Ala Glu Val Glu Ala Ala Ile

275 280 285

Lys Pro Asp Asp Pro His Phe Ala Arg Thr Arg Leu Leu Ala Leu Glu

290 295 300

Asn Thr Ile Ser Gly Lys Val Leu Ser Arg Ser Tyr Leu Glu Lys Ala

305 310 315 320

Leu Gln Leu Ala Glu Ala Lys Lys Leu Ser Ala His Leu Asp Gly Ala

325 330 335

Arg Ile Phe Asn Ala Ala Val Asp Gln Lys Ile Lys Val Asn Glu Leu

340 345 350

Cys Ala Gly Phe Asp Ser Val Ser Ala Cys Leu Ser Lys Gly Leu Gly

355 360 365

Ala Pro Ala Gly Thr Val Leu Leu Gly Ser Lys Asp Leu Ile Glu Arg

370 375 380

Ala Lys Arg Asn Arg Lys Ile Leu Gly Gly Ala Met Arg Gln Ala Gly

385 390 395 400

Ile Ile Ala Ala Ala Gly Leu Tyr Ala Leu Gln Asn Asn Ile Glu Arg

405 410 415

Leu Gln Ser Asp His Asp Asn Ala Lys Arg Leu Ala Asp Gly Leu Arg

420 425 430

Ala Leu Lys Leu Asp Val Glu Gln His Thr Asn Met Val Phe Val Asn

435 440 445

Val Ser Ala Glu His Ala Ala Ala Leu Ala Ala His Leu Gly Arg Ser

450 455 460

Gly Val Ile Val Leu Pro Trp Ala Pro Met Arg Leu Val Thr His Leu

465 470 475 480

Asp Val Asp Arg Asp Gly Ile Glu Arg Ala Leu Asp Ala Val Ala Glu

485 490 495

Phe Val

<210> 199

<211> 1002

<212> DNA

<213> Unknown

<220>

<223> ltaE_8 sequence from an unknown bacterial species from an environmental sample

<400> 199

atgatcgatc tgcgcagcga taccgtcacc cggccgacgc ccggcatgct caaggccatg 60

gccgaggcgc cggtgggcga tgacgtcttc ggtgacgatc cgaccgtcaa caaattgcag 120

gcggtggtcg cggagcgcgc gggcaaggaa gccgcgctgt tcctggcgac aggcacccag 180

agcaatctca ccgccctgat ggcccattgc gagcggggcg acgaatatat cgtcggacag 240

aacgcgcata cctataagta cgaaggtggc ggcgcggcgg tgctgggtag catccagccg 300

cagccgctgg ccaacgcgcc cgatggcacc attccgctcg acctcatcgc cgcggcgatc 360

aagccggacg acacccattt tgcgatcacg cgcctgctcg cgctcgaaaa cacgatcgga 420

ggaaaggtgc tgccggcgag ttacatcgcc gaggccaccg ctttcgcgcg gagcaaggga 480

ctcggcaccc atctcgacgg cgcgcggatc tggaacgtga tggcggcgtc gaacgcctcg 540

ctcgccgagc tctgcgcgcc tttcgatacg gtctcgatgt gtttctcgaa aggcatgggc 600

gctccggtgg gctctgtgct ggccggaccg aaggcgctga tcacccgcgc ggcgcgctgg 660

cgtaaaacct tgggcggcgg gatgcgccag gcgggtgtgc tggcggcggc ctgcctctac 720

gcgctggaga atcacatcgg ccgcctgcgg accgatcacg gaaacgccgc caagctcggc 780

gcggcgctcg ggcaaatccc ggcgctgaaa ctcattcatc agtcgacgaa tatggtgtgg 840

cttaccgtgc cgccggagaa gtgcgccgcg ctcgacagct ttctcaaggc gcgcggcatc 900

ctgacgctga tggggccaac gctgcgtctt gtcacgcacg gcgacctgaa gcccggcgac 960

gtcgataccg ccatccaggc cttcaaggat ttcttcaagt ag 1002

<210> 200

<211> 333

<212> PRT

<213> Unknown

<220>

<223> ltaE_8 sequence from an unknown bacterial species from an environmental sample

<400> 200

Met Ile Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Thr Pro Gly Met

1 5 10 15

Leu Lys Ala Met Ala Glu Ala Pro Val Gly Asp Asp Val Phe Gly Asp

20 25 30

Asp Pro Thr Val Asn Lys Leu Gln Ala Val Val Ala Glu Arg Ala Gly

35 40 45

Lys Glu Ala Ala Leu Phe Leu Ala Thr Gly Thr Gln Ser Asn Leu Thr

50 55 60

Ala Leu Met Ala His Cys Glu Arg Gly Asp Glu Tyr Ile Val Gly Gln

65 70 75 80

Asn Ala His Thr Tyr Lys Tyr Glu Gly Gly Gly Ala Ala Val Leu Gly

85 90 95

Ser Ile Gln Pro Gln Pro Leu Ala Asn Ala Pro Asp Gly Thr Ile Pro

100 105 110

Leu Asp Leu Ile Ala Ala Ala Ile Lys Pro Asp Asp Thr His Phe Ala

115 120 125

Ile Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Gly Gly Lys Val Leu

130 135 140

Pro Ala Ser Tyr Ile Ala Glu Ala Thr Ala Phe Ala Arg Ser Lys Gly

145 150 155 160

Leu Gly Thr His Leu Asp Gly Ala Arg Ile Trp Asn Val Met Ala Ala

165 170 175

Ser Asn Ala Ser Leu Ala Glu Leu Cys Ala Pro Phe Asp Thr Val Ser

180 185 190

Met Cys Phe Ser Lys Gly Met Gly Ala Pro Val Gly Ser Val Leu Ala

195 200 205

Gly Pro Lys Ala Leu Ile Thr Arg Ala Ala Arg Trp Arg Lys Thr Leu

210 215 220

Gly Gly Gly Met Arg Gln Ala Gly Val Leu Ala Ala Ala Cys Leu Tyr

225 230 235 240

Ala Leu Glu Asn His Ile Gly Arg Leu Arg Thr Asp His Gly Asn Ala

245 250 255

Ala Lys Leu Gly Ala Ala Leu Gly Gln Ile Pro Ala Leu Lys Leu Ile

260 265 270

His Gln Ser Thr Asn Met Val Trp Leu Thr Val Pro Pro Glu Lys Cys

275 280 285

Ala Ala Leu Asp Ser Phe Leu Lys Ala Arg Gly Ile Leu Thr Leu Met

290 295 300

Gly Pro Thr Leu Arg Leu Val Thr His Gly Asp Leu Lys Pro Gly Asp

305 310 315 320

Val Asp Thr Ala Ile Gln Ala Phe Lys Asp Phe Phe Lys

325 330

<210> 201

<211> 1044

<212> DNA

<213> Unknown

<220>

<223> ltaE_9 sequence from an unknown bacterial species from an environmental sample

<400> 201

atgaactcga ccattgactt gcgcagcgac accgtgacgc gccccaccgc tgcgatgcgt 60

gccgcgatga tggaagcgcc gctcggcgac gacgtgttcg gcgacgatcc cacggtcaat 120

gcgctgcaag acaagatcgc cggcatgctc ggcaaggagg cggcgctgtt catggcgtcg 180

ggcacgcaga gcaatctgtc ggcgttgatg gcgcattgcc agcgcggcga cgagtacatc 240

gtgggtcagg gcgcgcacac ctatcgctac gaggccggcg gcggcgcggt gctcggcagc 300

atccagccgc agcccatcac gaatcagccc gatggctcgc tggcgctggc cgacatcgag 360

gccgcgatca agcccgacga tgcgcatttc gcgcgcacgc gcttgctgtg cctggagaac 420

acgttcggtg gccaggtgct cggcatcgac tacctgcgcc aggccaccga tctggcgcgg 480

cgccacggcc tggccacgca tctggacggc gcgcgcctgt tcaacgccgc ggtcgcgctg 540

gcgcatcagc agtcgggcgg cggcgatgcg cgcgccaagg ccaaagagat ggccgaactg 600

ttcgacagcg tgtcggtgtg cttcagcaag ggcctcggcg cgccggtggg ttcggcgctg 660

gtcggcagcc gcgagctgat cgcacgcgcg caccgcgtgc gcaagatgct gggcggcgga 720

ttgcgccagg ccggcgtgct ggccgccgct gcgctgcatg cgctcgacca tcacatcgat 780

cggcttgccg aggatcacgc gaacgcgcag cggctggccg aagggctgcg cgggcttggc 840

agtgtgttga acgcgccggc caacaccaac atggtcttcg tcgacctggc acccggcggc 900

tcacgccagg acaccgtggc ccacctgcgc gagcacggcg tgttgtgcac cggcttgtac 960

aagctgcgcc tggtgacgca cctcgacgtg agtgccgacg acgtcgatcg cgccgtgcgc 1020

atcttgcgcg agacgttgaa cacg 1044

<210> 202

<211> 348

<212> PRT

<213> Unknown

<220>

<223> ltaE_9 sequence from an unknown bacterial species from an environmental sample

<400> 202

Met Asn Ser Thr Ile Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Thr

1 5 10 15

Ala Ala Met Arg Ala Ala Met Met Glu Ala Pro Leu Gly Asp Asp Val

20 25 30

Phe Gly Asp Asp Pro Thr Val Asn Ala Leu Gln Asp Lys Ile Ala Gly

35 40 45

Met Leu Gly Lys Glu Ala Ala Leu Phe Met Ala Ser Gly Thr Gln Ser

50 55 60

Asn Leu Ser Ala Leu Met Ala His Cys Gln Arg Gly Asp Glu Tyr Ile

65 70 75 80

Val Gly Gln Gly Ala His Thr Tyr Arg Tyr Glu Ala Gly Gly Gly Ala

85 90 95

Val Leu Gly Ser Ile Gln Pro Gln Pro Ile Thr Asn Gln Pro Asp Gly

100 105 110

Ser Leu Ala Leu Ala Asp Ile Glu Ala Ala Ile Lys Pro Asp Asp Ala

115 120 125

His Phe Ala Arg Thr Arg Leu Leu Cys Leu Glu Asn Thr Phe Gly Gly

130 135 140

Gln Val Leu Gly Ile Asp Tyr Leu Arg Gln Ala Thr Asp Leu Ala Arg

145 150 155 160

Arg His Gly Leu Ala Thr His Leu Asp Gly Ala Arg Leu Phe Asn Ala

165 170 175

Ala Val Ala Leu Ala His Gln Gln Ser Gly Gly Gly Asp Ala Arg Ala

180 185 190

Lys Ala Lys Glu Met Ala Glu Leu Phe Asp Ser Val Ser Val Cys Phe

195 200 205

Ser Lys Gly Leu Gly Ala Pro Val Gly Ser Ala Leu Val Gly Ser Arg

210 215 220

Glu Leu Ile Ala Arg Ala His Arg Val Arg Lys Met Leu Gly Gly Gly

225 230 235 240

Leu Arg Gln Ala Gly Val Leu Ala Ala Ala Ala Leu His Ala Leu Asp

245 250 255

His His Ile Asp Arg Leu Ala Glu Asp His Ala Asn Ala Gln Arg Leu

260 265 270

Ala Glu Gly Leu Arg Gly Leu Gly Ser Val Leu Asn Ala Pro Ala Asn

275 280 285

Thr Asn Met Val Phe Val Asp Leu Ala Pro Gly Gly Ser Arg Gln Asp

290 295 300

Thr Val Ala His Leu Arg Glu His Gly Val Leu Cys Thr Gly Leu Tyr

305 310 315 320

Lys Leu Arg Leu Val Thr His Leu Asp Val Ser Ala Asp Asp Val Asp

325 330 335

Arg Ala Val Arg Ile Leu Arg Glu Thr Leu Asn Thr

340 345

<210> 203

<211> 1098

<212> DNA

<213> Unknown

<220>

<223> ltaE_10 sequence from an unknown bacterial species from an environmental sample

<400> 203

atgcggcacg agccggccgg gaagctgccc ctcgcctcga cctgctccaa cgaatcacca 60

atcacgaatc acgaatcacg gctctccatg ctgatcgacc tcagaagcga caccgtcacc 120

cgtcccaccg cggcgatgcg cgaggcgatg ctgcgcgcgg aggtcggtga cgacgtctac 180

ggcgaggacc cgaccgtgaa cgcgctgcag gcgcggctgg cggccgagct cggcttcgag 240

gccggcctgt tcgtgccctc gggcacgcag tcgaacctgc tggcgctgat gagccattgc 300

gcgcgcggcg acgagtacct ggtcgggatg gaggcgcaca cctacaagtt cgaaggcggc 360

ggcgccgcgg tgctgggctc gatccagccg cagccgatcc cgcacgcgcc cgacggcacg 420

ctgccgctcg atgccgtggc gcgcgcgatc aagccggtcg atccgcactt cgcccgcagc 480

cgtctgctgt gcctggagaa cacctggcac ggccgaccgc tgccgctcga ctacctcgcg 540

caggcgcgtg ccttctgtcg cgagcgcggg ctcggcctgc acctggatgg cgcgcgcctg 600

ttcaatgccg cggtcgcctg ccgggtcgag gcacgcgcca tcgcccggca cttcgacagc 660

gtctcgatct gcttctccaa gggcctgggc gcaccggtcg gctcggtact ggtcggctcg 720

cacgcgctga tcgacgaggc gcggcgctgg cgcaaggtcg ccggcggcgg ctggcgccag 780

gccggcatgc tggcggcagc ctgcctgtac gcgctcgacc atcacgtcgc gcgactggcc 840

gacgaccatg cccgcgccgc gcgcctggcc gaaggtctgc gcggcctgac tgggctcgag 900

gtcgtggcgc agcacaccaa catggtgttc atcgacgtcg caccggaacg gctggcggcg 960

ttcaagcagc aactcgaggc ggcgcgcatc cggatgtcga tcggctacct gccgagcatc 1020

cgtctggtga cgcacctgga catcgacgac gccgcggtcg agcacacgat cgcgacgctg 1080

cgtgggttct tccgctga 1098

<210> 204

<211> 365

<212> PRT

<213> Unknown

<220>

<223> ltaE_10 sequence from an unknown bacterial species from an environmental sample

<400> 204

Met Arg His Glu Pro Ala Gly Lys Leu Pro Leu Ala Ser Thr Cys Ser

1 5 10 15

Asn Glu Ser Pro Ile Thr Asn His Glu Ser Arg Leu Ser Met Leu Ile

20 25 30

Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Thr Ala Ala Met Arg Glu

35 40 45

Ala Met Leu Arg Ala Glu Val Gly Asp Asp Val Tyr Gly Glu Asp Pro

50 55 60

Thr Val Asn Ala Leu Gln Ala Arg Leu Ala Ala Glu Leu Gly Phe Glu

65 70 75 80

Ala Gly Leu Phe Val Pro Ser Gly Thr Gln Ser Asn Leu Leu Ala Leu

85 90 95

Met Ser His Cys Ala Arg Gly Asp Glu Tyr Leu Val Gly Met Glu Ala

100 105 110

His Thr Tyr Lys Phe Glu Gly Gly Gly Ala Ala Val Leu Gly Ser Ile

115 120 125

Gln Pro Gln Pro Ile Pro His Ala Pro Asp Gly Thr Leu Pro Leu Asp

130 135 140

Ala Val Ala Arg Ala Ile Lys Pro Val Asp Pro His Phe Ala Arg Ser

145 150 155 160

Arg Leu Leu Cys Leu Glu Asn Thr Trp His Gly Arg Pro Leu Pro Leu

165 170 175

Asp Tyr Leu Ala Gln Ala Arg Ala Phe Cys Arg Glu Arg Gly Leu Gly

180 185 190

Leu His Leu Asp Gly Ala Arg Leu Phe Asn Ala Ala Val Ala Cys Arg

195 200 205

Val Glu Ala Arg Ala Ile Ala Arg His Phe Asp Ser Val Ser Ile Cys

210 215 220

Phe Ser Lys Gly Leu Gly Ala Pro Val Gly Ser Val Leu Val Gly Ser

225 230 235 240

His Ala Leu Ile Asp Glu Ala Arg Arg Trp Arg Lys Val Ala Gly Gly

245 250 255

Gly Trp Arg Gln Ala Gly Met Leu Ala Ala Ala Cys Leu Tyr Ala Leu

260 265 270

Asp His His Val Ala Arg Leu Ala Asp Asp His Ala Arg Ala Ala Arg

275 280 285

Leu Ala Glu Gly Leu Arg Gly Leu Thr Gly Leu Glu Val Val Ala Gln

290 295 300

His Thr Asn Met Val Phe Ile Asp Val Ala Pro Glu Arg Leu Ala Ala

305 310 315 320

Phe Lys Gln Gln Leu Glu Ala Ala Arg Ile Arg Met Ser Ile Gly Tyr

325 330 335

Leu Pro Ser Ile Arg Leu Val Thr His Leu Asp Ile Asp Asp Ala Ala

340 345 350

Val Glu His Thr Ile Ala Thr Leu Arg Gly Phe Phe Arg

355 360 365

<210> 205

<211> 1002

<212> DNA

<213> Unknown

<220>

<223> ltaE_11 sequence from an unknown bacterial species from an environmental sample

<400> 205

atgacggttg acctgcgctc cgataccgtc acgcgccctt cggcggggat gcgcaaggcg 60

atgatggagg ccgagctggg cgacgacgtg ttcggcgacg acccgaccgt caaccgcctg 120

caggagcggg cggccgagat cttcggcttc gaggccgccc tgcttttccc caccggcacc 180

cagtcgaatc tcgccgccct gatcgctcac tgcgatcggg gcgacgaagt catcctcggc 240

tcggaagccc acagttaccg ctacgaggcc ggcggcctcg cggtgctcgg ctcgatccag 300

ccgcaggtgg ttcccaaccg tgccgatgga acccttgatt tgaatgaagt ggaatccctg 360

ataaagcccg acgaccctca cttcccgcgc acgcggctgc tcgcgctcga gaacacgatt 420

accggccggg tcatcccgcg gccgtatctc gagcaggcgg ttgccctcgc gaaaaagaag 480

cggctcgccg tccacctgga cggggcgagg attttcaatg ccgcaacggc gctgaagatg 540

aaggtgaaag acctgtgcgc cgggttcgac tcggtgtcgt cgtgcctgtc gaaagggctg 600

ggcaccccgg ccggcaccgt tcttctcggc ggaaaagaat tcatccagaa agcgaaacgc 660

gcgcggaaaa tcctcggcgg cgggatgcgg caggcgggcg tgatcgccgc ggccgggctc 720

tacgcgctgg agaacaacgt cgagcgcctg cgcatcgatc acgacaacgc ggaaaagctt 780

gcacgcggct tacgcgactt gaaactggac gttcagctaa ataccaacat ggtgctggtg 840

aagatcaagc cggaggaagc ccacgatctg gcttcgttta tgagcaccaa gggtgtgctc 900

gtgctgccgc gtgcgccgat gcggctggtc acccacctcg acgtcgacgc cgccgggatc 960

gaccgcgcgc tcgtaggctt ccgcggcttt ttcggaaaat ga 1002

<210> 206

<211> 333

<212> PRT

<213> Unknown

<220>

<223> ltaE_11 sequence from an unknown bacterial species from an environmental sample

<400> 206

Met Thr Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser Ala Gly

1 5 10 15

Met Arg Lys Ala Met Met Glu Ala Glu Leu Gly Asp Asp Val Phe Gly

20 25 30

Asp Asp Pro Thr Val Asn Arg Leu Gln Glu Arg Ala Ala Glu Ile Phe

35 40 45

Gly Phe Glu Ala Ala Leu Leu Phe Pro Thr Gly Thr Gln Ser Asn Leu

50 55 60

Ala Ala Leu Ile Ala His Cys Asp Arg Gly Asp Glu Val Ile Leu Gly

65 70 75 80

Ser Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Leu Ala Val Leu

85 90 95

Gly Ser Ile Gln Pro Gln Val Val Pro Asn Arg Ala Asp Gly Thr Leu

100 105 110

Asp Leu Asn Glu Val Glu Ser Leu Ile Lys Pro Asp Asp Pro His Phe

115 120 125

Pro Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Thr Gly Arg Val

130 135 140

Ile Pro Arg Pro Tyr Leu Glu Gln Ala Val Ala Leu Ala Lys Lys Lys

145 150 155 160

Arg Leu Ala Val His Leu Asp Gly Ala Arg Ile Phe Asn Ala Ala Thr

165 170 175

Ala Leu Lys Met Lys Val Lys Asp Leu Cys Ala Gly Phe Asp Ser Val

180 185 190

Ser Ser Cys Leu Ser Lys Gly Leu Gly Thr Pro Ala Gly Thr Val Leu

195 200 205

Leu Gly Gly Lys Glu Phe Ile Gln Lys Ala Lys Arg Ala Arg Lys Ile

210 215 220

Leu Gly Gly Gly Met Arg Gln Ala Gly Val Ile Ala Ala Ala Gly Leu

225 230 235 240

Tyr Ala Leu Glu Asn Asn Val Glu Arg Leu Arg Ile Asp His Asp Asn

245 250 255

Ala Glu Lys Leu Ala Arg Gly Leu Arg Asp Leu Lys Leu Asp Val Gln

260 265 270

Leu Asn Thr Asn Met Val Leu Val Lys Ile Lys Pro Glu Glu Ala His

275 280 285

Asp Leu Ala Ser Phe Met Ser Thr Lys Gly Val Leu Val Leu Pro Arg

290 295 300

Ala Pro Met Arg Leu Val Thr His Leu Asp Val Asp Ala Ala Gly Ile

305 310 315 320

Asp Arg Ala Leu Val Gly Phe Arg Gly Phe Phe Gly Lys

325 330

<210> 207

<211> 969

<212> DNA

<213> Unknown

<220>

<223> ltaE_12 sequence from an unknown bacterial species from an environmental sample

<400> 207

atggcggccg agctcggcga cgacgtgttc ggcgacgacc cgaccgtcaa ccgcctgcag 60

gagcgcgccg ccgaggtctt cggcttcgag gcggcgctgc tctttccctc cgggacccag 120

tcgaacctcg cggcgctgat gggacactgc cagcgcggcg aggaggtgat cctcgggacg 180

gaagcgcata gctaccgcta cgaggcgggc gggctctcgg tgctcggctc gatccacccg 240

caggcgatca ccaaccgggc ggacggcacg ctggatctcg ccgaggtcga ggccgcgatc 300

aagcccgacg acccgcattt tccgaggact cgcctcattg ctctggagaa cacgatcacc 360

ggccgggtcc tgccccgcga atacctggcc aaggcggcgg agctggcgaa gcggaaaaac 420

ctggcgatcc acctcgacgg cgcgcgggtg ttcaacgcgg cgacgcatct cggcatgaag 480

gtgaaagacc tttgcgccgg cttcgactcg gtgtcctcgt gcctgtcgaa ggggttgggc 540

acgccggcag gcacggtgct cctcggcagc aagtccttca tcgggaaagc aagacgctcg 600

cgcaagatcc tcggcggcgc gatgcgccag gcgggggtca tcgctgccgc gggactctac 660

gcgctcgagc acaacgtcga gcgcctgaag accgatcacg agaatgccga gcgccttgcc 720

aggggattgc gcgagctcgg gctcgaagtg cagcacaaca cgaacatggt gctggtgaag 780

attgcgccgg accgggcagc ggcggtggaa gtccacatga ggaagaacaa catcctggtc 840

ctgccgcgcg cgccgatgcg gctcgtcacg catctcgacg tcgacgcggc cggcatcgac 900

cgcgcgctcg ccgggtttcg cagcttccta gggagtggtt tcctcgaaag ccaggcgcag 960

tcgcgctag 969

<210> 208

<211> 322

<212> PRT

<213> Unknown

<220>

<223> ltaE_12 sequence from an unknown bacterial species from an environmental sample

<400> 208

Met Ala Ala Glu Leu Gly Asp Asp Val Phe Gly Asp Asp Pro Thr Val

1 5 10 15

Asn Arg Leu Gln Glu Arg Ala Ala Glu Val Phe Gly Phe Glu Ala Ala

20 25 30

Leu Leu Phe Pro Ser Gly Thr Gln Ser Asn Leu Ala Ala Leu Met Gly

35 40 45

His Cys Gln Arg Gly Glu Glu Val Ile Leu Gly Thr Glu Ala His Ser

50 55 60

Tyr Arg Tyr Glu Ala Gly Gly Leu Ser Val Leu Gly Ser Ile His Pro

65 70 75 80

Gln Ala Ile Thr Asn Arg Ala Asp Gly Thr Leu Asp Leu Ala Glu Val

85 90 95

Glu Ala Ala Ile Lys Pro Asp Asp Pro His Phe Pro Arg Thr Arg Leu

100 105 110

Ile Ala Leu Glu Asn Thr Ile Thr Gly Arg Val Leu Pro Arg Glu Tyr

115 120 125

Leu Ala Lys Ala Ala Glu Leu Ala Lys Arg Lys Asn Leu Ala Ile His

130 135 140

Leu Asp Gly Ala Arg Val Phe Asn Ala Ala Thr His Leu Gly Met Lys

145 150 155 160

Val Lys Asp Leu Cys Ala Gly Phe Asp Ser Val Ser Ser Cys Leu Ser

165 170 175

Lys Gly Leu Gly Thr Pro Ala Gly Thr Val Leu Leu Gly Ser Lys Ser

180 185 190

Phe Ile Gly Lys Ala Arg Arg Ser Arg Lys Ile Leu Gly Gly Ala Met

195 200 205

Arg Gln Ala Gly Val Ile Ala Ala Ala Gly Leu Tyr Ala Leu Glu His

210 215 220

Asn Val Glu Arg Leu Lys Thr Asp His Glu Asn Ala Glu Arg Leu Ala

225 230 235 240

Arg Gly Leu Arg Glu Leu Gly Leu Glu Val Gln His Asn Thr Asn Met

245 250 255

Val Leu Val Lys Ile Ala Pro Asp Arg Ala Ala Ala Val Glu Val His

260 265 270

Met Arg Lys Asn Asn Ile Leu Val Leu Pro Arg Ala Pro Met Arg Leu

275 280 285

Val Thr His Leu Asp Val Asp Ala Ala Gly Ile Asp Arg Ala Leu Ala

290 295 300

Gly Phe Arg Ser Phe Leu Gly Ser Gly Phe Leu Glu Ser Gln Ala Gln

305 310 315 320

Ser Arg

<210> 209

<211> 1017

<212> DNA

<213> Unknown

<220>

<223> ltaE_13 sequence from unknown bacterial species from environmental sample

<400> 209

atgatcgatc ttcgttccga caccgtaacc cgcccctcgc ccgggatgcg caaggcgatg 60

cacgaagccg agctcggcga cgacgtgttc ggcgacgacc cgaccgtcaa ccgcctgcag 120

gcgcgcgccg ccgagatgtt cggcttcgag gcggcgctgc tctttcccac cggcacccag 180

tccaacctcg cggcgctgat gagccactgc ggccgcggcg aggaagtgat cctcggcatg 240

gaggcgcaca gctaccgcta cgaggcgggc ggcgtctcgg tgctcgcttc catccagccg 300

caggcagtgc cgaaccgccc cgacggctcg ctcgatctgg cggaagtcga ggccgcgatc 360

aagcccgacg acccgcattt cgcgcgcacg cgtctgctgg cgctcgagaa caccatcagc 420

ggccgggttc tatcccgcga atatctgcaa aaggcggtgg atctggcgag gcgcaaacac 480

ctcgcgatcc acctcgacgg cgcgcggatt ttcaatgcgg ccacccagct caacatgaag 540

gtgaaggacc tttgcgccgg cttcgactcg gtttcctcct gcctgtccaa gggattgggc 600

acgcctgcgg gtacggttct gctgggaagc tcggaattga tccagaaggc aaagcgcgcg 660

cgcaagatcc tcggcggcgg aatgcgccag gcaggcgtga ttgcggccgc gggcctctac 720

gcgctggaga acaacgtcga gcgtctgaag acggaccatg aaaatgccga gcggctggcg 780

cgtgggctgc gcgagctcgg actggacgtt cagcacaaca ccaacatggt gatggtgaag 840

cttccgccgg agaaggccca gccgctggcc gatgcgttga aacggcagca catcctggtg 900

ctgccgcgcg cgccgatgcg cctcgtcacg cacctggatg tcgacgcggc gggcatcgac 960

cgcgcgctgg cgggcttccg cagcttcttt ggcgcccggg ctccgtcccc gaactga 1017

<210> 210

<211> 338

<212> PRT

<213> Unknown

<220>

<223> ltaE_13 sequence from unknown bacterial species from environmental sample

<400> 210

Met Ile Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser Pro Gly Met

1 5 10 15

Arg Lys Ala Met His Glu Ala Glu Leu Gly Asp Asp Val Phe Gly Asp

20 25 30

Asp Pro Thr Val Asn Arg Leu Gln Ala Arg Ala Ala Glu Met Phe Gly

35 40 45

Phe Glu Ala Ala Leu Leu Phe Pro Thr Gly Thr Gln Ser Asn Leu Ala

50 55 60

Ala Leu Met Ser His Cys Gly Arg Gly Glu Glu Val Ile Leu Gly Met

65 70 75 80

Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Val Ser Val Leu Ala

85 90 95

Ser Ile Gln Pro Gln Ala Val Pro Asn Arg Pro Asp Gly Ser Leu Asp

100 105 110

Leu Ala Glu Val Glu Ala Ala Ile Lys Pro Asp Asp Pro His Phe Ala

115 120 125

Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Ser Gly Arg Val Leu

130 135 140

Ser Arg Glu Tyr Leu Gln Lys Ala Val Asp Leu Ala Arg Arg Lys His

145 150 155 160

Leu Ala Ile His Leu Asp Gly Ala Arg Ile Phe Asn Ala Ala Thr Gln

165 170 175

Leu Asn Met Lys Val Lys Asp Leu Cys Ala Gly Phe Asp Ser Val Ser

180 185 190

Ser Cys Leu Ser Lys Gly Leu Gly Thr Pro Ala Gly Thr Val Leu Leu

195 200 205

Gly Ser Ser Glu Leu Ile Gln Lys Ala Lys Arg Ala Arg Lys Ile Leu

210 215 220

Gly Gly Gly Met Arg Gln Ala Gly Val Ile Ala Ala Ala Gly Leu Tyr

225 230 235 240

Ala Leu Glu Asn Asn Val Glu Arg Leu Lys Thr Asp His Glu Asn Ala

245 250 255

Glu Arg Leu Ala Arg Gly Leu Arg Glu Leu Gly Leu Asp Val Gln His

260 265 270

Asn Thr Asn Met Val Met Val Lys Leu Pro Pro Glu Lys Ala Gln Pro

275 280 285

Leu Ala Asp Ala Leu Lys Arg Gln His Ile Leu Val Leu Pro Arg Ala

290 295 300

Pro Met Arg Leu Val Thr His Leu Asp Val Asp Ala Ala Gly Ile Asp

305 310 315 320

Arg Ala Leu Ala Gly Phe Arg Ser Phe Phe Gly Ala Arg Ala Pro Ser

325 330 335

Pro Asn

<210> 211

<211> 1002

<212> DNA

<213> Unknown

<220>

<223> ltaE_14 sequence from unknown bacterial species from environmental sample

<400> 211

atgaatcaac ccatagacct tcgctccgat acggttaccc gtccttcagc cggcatgcgc 60

aaggccatgg cggaggccga gctcggcgac gacgtcttcg gcgacgaccc caccgtcaac 120

cgcctgcagg cgcgcgccgc cgagctgttc ggggtggagg cggcgctttt cttccccagc 180

ggcacgcagt ccaacctcgc cgcgctgatg tcgcattgcc agcggggcga ggaagtcatc 240

ctcggctccg aggcgcacag ctatcgctac gaggccggcg ggctcgccgt cctcggctcg 300

atccagccgc aggtcgtgct caaccgcgcc gatggcacgc tcgatctcgc ggaagtggaa 360

gcggcgatca agcccgacga tccgcatttc ccgaaaacgc gattgctcgc gctcgagaac 420

acgataacgg ggcgggtgct cccccgctcc tacctcgaaa aggcgatcaa cgttgcaaac 480

aggcgcggtc tcgccaccca cctcgatggc gcgcgcatct tcaacgcggc aatgcacgag 540

aagatcaacg tcaaaacgct gtgcgcagga ttcgattcgg tctcgtcgtg cctgtccaaa 600

gggctcggca cgccggccgg caccgtgctg gtcgggaaaa aagagatcat cgagaaggcc 660

aagcgcgcaa gaaagatcct gggcgggggc atgcgccagg ccggggtgct ggccgcagcc 720

ggcctctacg cgctggagaa caacgtcgag cggctcgccg aagaccacgc caacgccgag 780

cggctcgcca aagggctgcg cgagctggga caggaagtgc agctcagcac gaacatggtg 840

atgctcacta tcccgtccga gaaggccgcg ccgctcgccg agcacatgaa gaagagcggc 900

gtgatcgtgc tgccgcgggc accgatgcgg cttgtcacgc acctcgacgt cgacgcggcc 960

ggcatcgatc gcgcgctggc cgccttccgc gctttcttct ag 1002

<210> 212

<211> 333

<212> PRT

<213> Unknown

<220>

<223> ltaE_14 sequence from unknown bacterial species from environmental sample

<400> 212

Met Asn Gln Pro Ile Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser

1 5 10 15

Ala Gly Met Arg Lys Ala Met Ala Glu Ala Glu Leu Gly Asp Asp Val

20 25 30

Phe Gly Asp Asp Pro Thr Val Asn Arg Leu Gln Ala Arg Ala Ala Glu

35 40 45

Leu Phe Gly Val Glu Ala Ala Leu Phe Phe Pro Ser Gly Thr Gln Ser

50 55 60

Asn Leu Ala Ala Leu Met Ser His Cys Gln Arg Gly Glu Glu Val Ile

65 70 75 80

Leu Gly Ser Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Leu Ala

85 90 95

Val Leu Gly Ser Ile Gln Pro Gln Val Val Leu Asn Arg Ala Asp Gly

100 105 110

Thr Leu Asp Leu Ala Glu Val Glu Ala Ala Ile Lys Pro Asp Asp Pro

115 120 125

His Phe Pro Lys Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Thr Gly

130 135 140

Arg Val Leu Pro Arg Ser Tyr Leu Glu Lys Ala Ile Asn Val Ala Asn

145 150 155 160

Arg Arg Gly Leu Ala Thr His Leu Asp Gly Ala Arg Ile Phe Asn Ala

165 170 175

Ala Met His Glu Lys Ile Asn Val Lys Thr Leu Cys Ala Gly Phe Asp

180 185 190

Ser Val Ser Ser Cys Leu Ser Lys Gly Leu Gly Thr Pro Ala Gly Thr

195 200 205

Val Leu Val Gly Lys Lys Glu Ile Ile Glu Lys Ala Lys Arg Ala Arg

210 215 220

Lys Ile Leu Gly Gly Gly Met Arg Gln Ala Gly Val Leu Ala Ala Ala

225 230 235 240

Gly Leu Tyr Ala Leu Glu Asn Asn Val Glu Arg Leu Ala Glu Asp His

245 250 255

Ala Asn Ala Glu Arg Leu Ala Lys Gly Leu Arg Glu Leu Gly Gln Glu

260 265 270

Val Gln Leu Ser Thr Asn Met Val Met Leu Thr Ile Pro Ser Glu Lys

275 280 285

Ala Ala Pro Leu Ala Glu His Met Lys Lys Ser Gly Val Ile Val Leu

290 295 300

Pro Arg Ala Pro Met Arg Leu Val Thr His Leu Asp Val Asp Ala Ala

305 310 315 320

Gly Ile Asp Arg Ala Leu Ala Ala Phe Arg Ala Phe Phe

325 330

<210> 213

<211> 1005

<212> DNA

<213> Unknown

<220>

<223> ltaE_15 sequence from unknown bacterial species from environmental sample

<400> 213

atggcctcaa tcgtagacct gcgctcggat accgtcacgc gtccctccgc ggcgatgcgc 60

cgcgcgatgc tggagtcgga gcttggcgac gacgtgttcg gcgacgaccc gacggtcaat 120

cgcctgcagg agcgggccgc ggagatcttc ggcttcgaag ccgccttgct gtttccgtcc 180

ggtacgcagt ccaacctcgc cgccctcatg agccattgcc agcgtggcga ggaagtgatt 240

ctcggccagg aggcgcacag ctatcgctac gaggcgggcg gcgctgcggt actgggctcg 300

atccagcccc aggcgatcgc caaccggccc gacggcacgc tcgatctcgc cgaagtcgag 360

gccgcgatca agcccgacga tccgcacttc gcaagaacgc gcctgctcgc gctcgagaac 420

acgatcggcg gtcgggtgct cccccgccgc tatctcgccg aagcgctcga tctcgcgaag 480

aagaagactc ttgcgacgca cctcgatggc gcacgcgtct tcaacgccgc gaccgagctg 540

cagatgaagg tgaaggacct gtgcgcggga ttcgactcgg tgtcggcctg cctgtcgaag 600

ggcctgggcg cgcccgcggg aaccgtgctg ctcggcagga aggatttcat ccagaaagcg 660

aagaggtcgc gcaagatcct cggcggcgcg atgcgccagg ccggggtgat cgccgccgcc 720

ggcctctacg cgctcgaaaa caacgtcgcg cgcctggagg aggaccatcg caatgcgcag 780

cggctggcga aggggctgca ggggctggag cttccggccg agcagcacac caacatggtc 840

ttcgtgcgca tcgcacccga gcggctggag ggcctggccg gccatctgaa gaaggcaggc 900

atcgcggtgc tgccgggcgc gcgcatgcgg ctggtgacgc atctcgatgt ggacgccgcc 960

ggggtcgagc gcgcgctggc ggcgttccgg agctacttcc ggtaa 1005

<210> 214

<211> 334

<212> PRT

<213> Unknown

<220>

<223> ltaE_15 sequence from unknown bacterial species from environmental sample

<400> 214

Met Ala Ser Ile Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser

1 5 10 15

Ala Ala Met Arg Arg Ala Met Leu Glu Ser Glu Leu Gly Asp Asp Val

20 25 30

Phe Gly Asp Asp Pro Thr Val Asn Arg Leu Gln Glu Arg Ala Ala Glu

35 40 45

Ile Phe Gly Phe Glu Ala Ala Leu Leu Phe Pro Ser Gly Thr Gln Ser

50 55 60

Asn Leu Ala Ala Leu Met Ser His Cys Gln Arg Gly Glu Glu Val Ile

65 70 75 80

Leu Gly Gln Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Ala Ala

85 90 95

Val Leu Gly Ser Ile Gln Pro Gln Ala Ile Ala Asn Arg Pro Asp Gly

100 105 110

Thr Leu Asp Leu Ala Glu Val Glu Ala Ala Ile Lys Pro Asp Asp Pro

115 120 125

His Phe Ala Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Gly Gly

130 135 140

Arg Val Leu Pro Arg Arg Tyr Leu Ala Glu Ala Leu Asp Leu Ala Lys

145 150 155 160

Lys Lys Thr Leu Ala Thr His Leu Asp Gly Ala Arg Val Phe Asn Ala

165 170 175

Ala Thr Glu Leu Gln Met Lys Val Lys Asp Leu Cys Ala Gly Phe Asp

180 185 190

Ser Val Ser Ala Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr

195 200 205

Val Leu Leu Gly Arg Lys Asp Phe Ile Gln Lys Ala Lys Arg Ser Arg

210 215 220

Lys Ile Leu Gly Gly Ala Met Arg Gln Ala Gly Val Ile Ala Ala Ala

225 230 235 240

Gly Leu Tyr Ala Leu Glu Asn Asn Val Ala Arg Leu Glu Glu Asp His

245 250 255

Arg Asn Ala Gln Arg Leu Ala Lys Gly Leu Gln Gly Leu Glu Leu Pro

260 265 270

Ala Glu Gln His Thr Asn Met Val Phe Val Arg Ile Ala Pro Glu Arg

275 280 285

Leu Glu Gly Leu Ala Gly His Leu Lys Lys Ala Gly Ile Ala Val Leu

290 295 300

Pro Gly Ala Arg Met Arg Leu Val Thr His Leu Asp Val Asp Ala Ala

305 310 315 320

Gly Val Glu Arg Ala Leu Ala Ala Phe Arg Ser Tyr Phe Arg

325 330

<210> 215

<211> 957

<212> DNA

<213> Unknown

<220>

<223> ltaE_16 sequence from unknown bacterial species from environmental sample

<400> 215

atgcgccgcg cgatgttcga cgccgaggtc ggcgacgatg tttggggtga cgatccgacc 60

gtgaaccgct tgcaggagcg gtcggcggaa atcttcgggt tcgaggctgc cctgttcttt 120

ccttccggaa cgcagtcgaa cctggcggcg ttgatgtgcc attgtcagcg cggcgacgag 180

gtgatcgttg gtgcggaggc gcacacctac cgctacgagg caggcggaat ctcggtgctc 240

gcctcggtcc atccgcgtcc actcccgaat cagccggacg gaaccctcga cctcgccgaa 300

gtcgaacgcg cgatcaaccc tgaggatgca catttcgcga ggacacgcct cctcacgctg 360

gagaatacca taagcggacg cgttctcccg agcacctatc tcgagtccgc gatgtcgctt 420

gcccagcgaa accatttatc gacgcatctg gatggcgcgc ggatcttcaa tgcagcgctt 480

cacctaggcg tggcggtcag agagctttgt tacggattcg attcggtctc tgcctgcctg 540

tctaagggac tgggggctcc tgccgggacg atcctgctgg gaagcaaagc tttcatcgag 600

gaggccaaac gcgcgcgtaa gctcctgggg ggaggcatgc ggcaagtcgg aattctcgcg 660

gctgctgggc tctatgcgct cgagaacaac gtcgagcgaa tggctgacga tcatcggaac 720

gccgtacgtc tcgccgaagg attgcgagag ctaggactcg cagtaggaca aagtacgaat 780

atggttttcg ttactattcc ggatggtcaa gcgcgcggtc tggccgaagc cctgaaaaat 840

gcccaggtcc tcgtggcccc tgaagagacg atgaggttgg taacccacct cgacgtggat 900

gcagcgggga tcgagcgggt gatagaaggc ttccgtgctt tcttccggtc caattaa 957

<210> 216

<211> 318

<212> PRT

<213> Unknown

<220>

<223> ltaE_16 sequence from unknown bacterial species from environmental sample

<400> 216

Met Arg Arg Ala Met Phe Asp Ala Glu Val Gly Asp Asp Val Trp Gly

1 5 10 15

Asp Asp Pro Thr Val Asn Arg Leu Gln Glu Arg Ser Ala Glu Ile Phe

20 25 30

Gly Phe Glu Ala Ala Leu Phe Phe Pro Ser Gly Thr Gln Ser Asn Leu

35 40 45

Ala Ala Leu Met Cys His Cys Gln Arg Gly Asp Glu Val Ile Val Gly

50 55 60

Ala Glu Ala His Thr Tyr Arg Tyr Glu Ala Gly Gly Ile Ser Val Leu

65 70 75 80

Ala Ser Val His Pro Arg Pro Leu Pro Asn Gln Pro Asp Gly Thr Leu

85 90 95

Asp Leu Ala Glu Val Glu Arg Ala Ile Asn Pro Glu Asp Ala His Phe

100 105 110

Ala Arg Thr Arg Leu Leu Thr Leu Glu Asn Thr Ile Ser Gly Arg Val

115 120 125

Leu Pro Ser Thr Tyr Leu Glu Ser Ala Met Ser Leu Ala Gln Arg Asn

130 135 140

His Leu Ser Thr His Leu Asp Gly Ala Arg Ile Phe Asn Ala Ala Leu

145 150 155 160

His Leu Gly Val Ala Val Arg Glu Leu Cys Tyr Gly Phe Asp Ser Val

165 170 175

Ser Ala Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr Ile Leu

180 185 190

Leu Gly Ser Lys Ala Phe Ile Glu Glu Ala Lys Arg Ala Arg Lys Leu

195 200 205

Leu Gly Gly Gly Met Arg Gln Val Gly Ile Leu Ala Ala Ala Gly Leu

210 215 220

Tyr Ala Leu Glu Asn Asn Val Glu Arg Met Ala Asp Asp His Arg Asn

225 230 235 240

Ala Val Arg Leu Ala Glu Gly Leu Arg Glu Leu Gly Leu Ala Val Gly

245 250 255

Gln Ser Thr Asn Met Val Phe Val Thr Ile Pro Asp Gly Gln Ala Arg

260 265 270

Gly Leu Ala Glu Ala Leu Lys Asn Ala Gln Val Leu Val Ala Pro Glu

275 280 285

Glu Thr Met Arg Leu Val Thr His Leu Asp Val Asp Ala Ala Gly Ile

290 295 300

Glu Arg Val Ile Glu Gly Phe Arg Ala Phe Phe Arg Ser Asn

305 310 315

<210> 217

<211> 1014

<212> DNA

<213> Unknown

<220>

<223> ltaE_17

<400> 217

atgacgatcg tcgacttacg ttccgatacc gtcacgcgtc cgtcaccggg catgcgcaaa 60

gcgatgatgg acgccgaagt tggcgatgat gtgttcggcg acgatccgac ggtcaatcga 120

ctgcaagcgc gcgcggcgga gatcttcggc ttcgagtcgg cgctgctgtt tccatcgggc 180

acgcagtcca atctggcggc gctgatgagt cattgtcagc gtggcgatga ggtaatcgtc 240

ggccagctgg cgcacagtta ccgtaacgaa gccggcggcg cggccgtgct cggctcgatt 300

cagccgcaag ccattacgaa tcgcgctgac ggctcacttg atctcgctga aatagaggcg 360

gctatcaagc ctgacgaccc gcatttcgcg cggacccgtc tgcttgcgct cgagaacacg 420

atctcaggca aggtgctaac gaggtcgtat cttgagaagg ctttgcaatt ggctaaggca 480

aagaagctgt ttacgcatct cgatggcgcg cgcattttca atgccgccgc cgatcagaag 540

atgaaagtga acgagctgtg cgcgggcttc gattccgtat cggcgtgttt atcgaagggg 600

ctaggcgctc ccgccggaac agtattgctc ggaagcaagg atctgatcga gcgagcgaag 660

cgcaaccgaa aaatcctcgg cggcgcgatg cgccaggcgg ggattatcgc tgccgcgggt 720

ctttacgcac tacagaacaa catcgagcgg ttgcaaagcg atcatgacaa tgccgagcgg 780

ctggccgccg gattaagaat gctgaagctc gacgtcgaac aacatacgaa catggtgttc 840

gtgaacatgc cagccgaaca tactgctgcg ctcgctgcgc atcttgggcg acgtggcgtg 900

gtcgtgatgc cgtgggcgcc gatgcgtttg gttacgcacc tcgacgtcga ccgcgccggg 960

atcgagcgag tgctcggtgc ggtcgctgag ttcgtttccg ttaattcggt gtaa 1014

<210> 218

<211> 337

<212> PRT

<213> Unknown

<220>

<223> ltaE_17 sequence from unknown bacterial species from environmental sample

<400> 218

Met Thr Ile Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser Pro

1 5 10 15

Gly Met Arg Lys Ala Met Met Asp Ala Glu Val Gly Asp Asp Val Phe

20 25 30

Gly Asp Asp Pro Thr Val Asn Arg Leu Gln Ala Arg Ala Ala Glu Ile

35 40 45

Phe Gly Phe Glu Ser Ala Leu Leu Phe Pro Ser Gly Thr Gln Ser Asn

50 55 60

Leu Ala Ala Leu Met Ser His Cys Gln Arg Gly Asp Glu Val Ile Val

65 70 75 80

Gly Gln Leu Ala His Ser Tyr Arg Asn Glu Ala Gly Gly Ala Ala Val

85 90 95

Leu Gly Ser Ile Gln Pro Gln Ala Ile Thr Asn Arg Ala Asp Gly Ser

100 105 110

Leu Asp Leu Ala Glu Ile Glu Ala Ala Ile Lys Pro Asp Asp Pro His

115 120 125

Phe Ala Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Ser Gly Lys

130 135 140

Val Leu Thr Arg Ser Tyr Leu Glu Lys Ala Leu Gln Leu Ala Lys Ala

145 150 155 160

Lys Lys Leu Phe Thr His Leu Asp Gly Ala Arg Ile Phe Asn Ala Ala

165 170 175

Ala Asp Gln Lys Met Lys Val Asn Glu Leu Cys Ala Gly Phe Asp Ser

180 185 190

Val Ser Ala Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr Val

195 200 205

Leu Leu Gly Ser Lys Asp Leu Ile Glu Arg Ala Lys Arg Asn Arg Lys

210 215 220

Ile Leu Gly Gly Ala Met Arg Gln Ala Gly Ile Ile Ala Ala Ala Gly

225 230 235 240

Leu Tyr Ala Leu Gln Asn Asn Ile Glu Arg Leu Gln Ser Asp His Asp

245 250 255

Asn Ala Glu Arg Leu Ala Ala Gly Leu Arg Met Leu Lys Leu Asp Val

260 265 270

Glu Gln His Thr Asn Met Val Phe Val Asn Met Pro Ala Glu His Thr

275 280 285

Ala Ala Leu Ala Ala His Leu Gly Arg Arg Gly Val Val Val Met Pro

290 295 300

Trp Ala Pro Met Arg Leu Val Thr His Leu Asp Val Asp Arg Ala Gly

305 310 315 320

Ile Glu Arg Val Leu Gly Ala Val Ala Glu Phe Val Ser Val Asn Ser

325 330 335

Val

<210> 219

<211> 1005

<212> DNA

<213> Unknown

<220>

<223> ltaE_18 sequence from unknown bacterial species from environmental sample

<400> 219

atgccagggc ttgtcgacct gcgttccgac accgtcacac ggccttcccc cggcatgcgc 60

cgcgcgatgc tcgaggccga gctcggcgac gacgtgttcg gcgacgatcc gacggtcaac 120

cgcctgcagg cgcgcgccgc cgagatcttc ggcatggaag cgggcctgct cctgccctcg 180

ggcacccagt ccaacctggc ggcgctgatg agccattgcc agcgcggcga cgaggtgatc 240

atcggccagg aggcgcacag ctaccgctac gaagccggcg gcatggcggt gctcggctcg 300

atccagccgc gcaccgtggc caaccgtgcc gacggcagcc tcgacctccg cgaggtcgag 360

gcggcgatca acccggacga cgcgcatttc gcgaggaccc ggctgctcgc cctcgagaac 420

acgatctcgg gccgcgtgct ttccgggaag tatttacgcg aagcggtcga tcttgcaaat 480

cgaaaaaagc tggcgaccca cctggacggt gcgcgcatct tcaacgcggc ggtccacgaa 540

ggcaccggcg tgaaggagct gtgcgccggc ttcgactcgg tatcggcgtg cctgtcgaag 600

gggctgggcg ctccggccgg gacggtcctg gtgggcagga aagagctcat cgacaaggcg 660

cggcgcgcgc gcaagatgct cggcggcgcg atgcggcagg cgggcgtgat cgcggcggcc 720

gggctgtatg cgctcgagca caacgtcgag cggctcgccg aggaccacgc caacgccgtc 780

aggctgtcga aaggactggc ggagatcggc ctgcccgtcg agcagcacac caacatggtg 840

tttgcacgga ttccgcccga gcgtgtcgcg ccactggaat cccatctgaa ggatagaggc 900

gttctggtgc tgcccggcgc gcgcatgcgt ctcgtcacgc acctcgacct cgaccgcgag 960

ggcgtcgagc gcgcgctcgc ggcgttcagg gggttcttcg gctga 1005

<210> 220

<211> 334

<212> PRT

<213> Unknown

<220>

<223> ltaE_18 sequence from unknown bacterial species from environmental sample

<400> 220

Met Pro Gly Leu Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser

1 5 10 15

Pro Gly Met Arg Arg Ala Met Leu Glu Ala Glu Leu Gly Asp Asp Val

20 25 30

Phe Gly Asp Asp Pro Thr Val Asn Arg Leu Gln Ala Arg Ala Ala Glu

35 40 45

Ile Phe Gly Met Glu Ala Gly Leu Leu Leu Pro Ser Gly Thr Gln Ser

50 55 60

Asn Leu Ala Ala Leu Met Ser His Cys Gln Arg Gly Asp Glu Val Ile

65 70 75 80

Ile Gly Gln Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Met Ala

85 90 95

Val Leu Gly Ser Ile Gln Pro Arg Thr Val Ala Asn Arg Ala Asp Gly

100 105 110

Ser Leu Asp Leu Arg Glu Val Glu Ala Ala Ile Asn Pro Asp Asp Ala

115 120 125

His Phe Ala Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Ser Gly

130 135 140

Arg Val Leu Ser Gly Lys Tyr Leu Arg Glu Ala Val Asp Leu Ala Asn

145 150 155 160

Arg Lys Lys Leu Ala Thr His Leu Asp Gly Ala Arg Ile Phe Asn Ala

165 170 175

Ala Val His Glu Gly Thr Gly Val Lys Glu Leu Cys Ala Gly Phe Asp

180 185 190

Ser Val Ser Ala Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr

195 200 205

Val Leu Val Gly Arg Lys Glu Leu Ile Asp Lys Ala Arg Arg Ala Arg

210 215 220

Lys Met Leu Gly Gly Ala Met Arg Gln Ala Gly Val Ile Ala Ala Ala

225 230 235 240

Gly Leu Tyr Ala Leu Glu His Asn Val Glu Arg Leu Ala Glu Asp His

245 250 255

Ala Asn Ala Val Arg Leu Ser Lys Gly Leu Ala Glu Ile Gly Leu Pro

260 265 270

Val Glu Gln His Thr Asn Met Val Phe Ala Arg Ile Pro Pro Glu Arg

275 280 285

Val Ala Pro Leu Glu Ser His Leu Lys Asp Arg Gly Val Leu Val Leu

290 295 300

Pro Gly Ala Arg Met Arg Leu Val Thr His Leu Asp Leu Asp Arg Glu

305 310 315 320

Gly Val Glu Arg Ala Leu Ala Ala Phe Arg Gly Phe Phe Gly

325 330

<210> 221

<211> 939

<212> DNA

<213> Unknown

<220>

<223> ltaE_19 sequence from unknown bacterial species from environmental sample

<400> 221

atgcttgaag cagagctcgg cgacgatgtc ttcggcgacg atcccacggt gaaccggctc 60

caggcgaggg cggccgagct cttcggtttc gaggcagcgc ttttctttcc gtccggcacg 120

caatccaacc tcgcggcgct catgagccac tgccagcgcg gcgaggaggt gatcctcggt 180

cacgaagcgc acagctatcg ctacgaggcg ggcggcgccg cggtgctggg gtcgatccag 240

ttgcaggtgg tggccaaccg gcccgacggc acgcttgacc ttgccgaggt cgaagcggcg 300

atcaagcccg acgatccgca ctttgcaaag acacgcctcc ttgctctcga gaacaccatc 360

ggcgggcgcg ccttgccgcg cgcgtatctg gagcaggcat tgaagctcgc gcagcgccgg 420

ggtctgcaaa cccatttgga cggcgcgcga gtcttcaacg ccgcggtgta tttcggaacg 480

ggcgtcaagg cgctttgcgc cgggttcgat tcggtgtcgg cgtgcctctc caaaggattg 540

ggcgcaccgg caggaacggt tctactgggc agcaaggaac tcatcgcaag agcgcggcgg 600

gcgcgcaaga ttctcggtgg cgcgatgcgc caggcgggtg tcctcgccgc ggccgggctc 660

tacgcgctcg agaacaacgt cgagcgcctc gccgaggacc atgagaacgc gcgaggactc 720

gcgagcggcc tgcgcgcgct ggacctcgcg gtcgagcagc acaccaacat ggtcttcgtg 780

cgcgttgcgc ccgagcacgt ggagcggctc gcggcccacc tcgcaaatcg cggcgtcgcc 840

gtgctgccag cagcacgcat gcggctcgtc acgcacctgg acgtcgacag cgctgcggtg 900

gagcgcgccg tcggcgcgtt cagagcgttc ttcgcctag 939

<210> 222

<211> 312

<212> PRT

<213> Unknown

<220>

<223> ltaE_19 sequence from unknown bacterial species from environmental sample

<400> 222

Met Leu Glu Ala Glu Leu Gly Asp Asp Val Phe Gly Asp Asp Pro Thr

1 5 10 15

Val Asn Arg Leu Gln Ala Arg Ala Ala Glu Leu Phe Gly Phe Glu Ala

20 25 30

Ala Leu Phe Phe Pro Ser Gly Thr Gln Ser Asn Leu Ala Ala Leu Met

35 40 45

Ser His Cys Gln Arg Gly Glu Glu Val Ile Leu Gly His Glu Ala His

50 55 60

Ser Tyr Arg Tyr Glu Ala Gly Gly Ala Ala Val Leu Gly Ser Ile Gln

65 70 75 80

Leu Gln Val Val Ala Asn Arg Pro Asp Gly Thr Leu Asp Leu Ala Glu

85 90 95

Val Glu Ala Ala Ile Lys Pro Asp Asp Pro His Phe Ala Lys Thr Arg

100 105 110

Leu Leu Ala Leu Glu Asn Thr Ile Gly Gly Arg Ala Leu Pro Arg Ala

115 120 125

Tyr Leu Glu Gln Ala Leu Lys Leu Ala Gln Arg Arg Gly Leu Gln Thr

130 135 140

His Leu Asp Gly Ala Arg Val Phe Asn Ala Ala Val Tyr Phe Gly Thr

145 150 155 160

Gly Val Lys Ala Leu Cys Ala Gly Phe Asp Ser Val Ser Ala Cys Leu

165 170 175

Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr Val Leu Leu Gly Ser Lys

180 185 190

Glu Leu Ile Ala Arg Ala Arg Arg Ala Arg Lys Ile Leu Gly Gly Ala

195 200 205

Met Arg Gln Ala Gly Val Leu Ala Ala Ala Gly Leu Tyr Ala Leu Glu

210 215 220

Asn Asn Val Glu Arg Leu Ala Glu Asp His Glu Asn Ala Arg Gly Leu

225 230 235 240

Ala Ser Gly Leu Arg Ala Leu Asp Leu Ala Val Glu Gln His Thr Asn

245 250 255

Met Val Phe Val Arg Val Ala Pro Glu His Val Glu Arg Leu Ala Ala

260 265 270

His Leu Ala Asn Arg Gly Val Ala Val Leu Pro Ala Ala Arg Met Arg

275 280 285

Leu Val Thr His Leu Asp Val Asp Ser Ala Ala Val Glu Arg Ala Val

290 295 300

Gly Ala Phe Arg Ala Phe Phe Ala

305 310

<210> 223

<211> 942

<212> DNA

<213> Unknown

<220>

<223> ltaE_20 sequence from unknown bacterial species from environmental sample

<400> 223

atggcggagg ccgaggtcgg cgacgatgtg tttggcgacg accccacggt gaaccggctg 60

caggcgcgcg cggcggaggt gctcgggttc gaggcggcgc tgatcttccc ctcgggaacg 120

cagtcgaacc tagccgcgct gatgacgcac tgtcagcggg gcgacgaaat gatcgtcggc 180

caagaggcgc atacctacgt ccatgaggcg ggcggcacct ccgtgctcgg tgcgatccat 240

ccgcacgttg tccccaacct gcccgacggc acgctcgacc tgagcgacgt ggaggcggcc 300

atcaagcccg acgatccgca ctacccgcgg acacggctcc tcgccctcga gaatacaatc 360

ggcggacggg cggtgccgcg cgcgtatctc gagcgcgcgg ttggtcttgc aaggcgccgc 420

cgcctcgcca cgcatctcga cggtgcgcgg atcttcaacg ccgcggtggc gctgaatacc 480

gatgtgagga atctgtgtgc gggattcgac tccgtgtccg tgtgtctgtc gaagggattg 540

ggcgcaccgg tcgggacgct cctcctcggc agcgccgact tcatcgcgcg cgcgacgcgc 600

gtgcggaaga tccttggggg tgggatgcgc caggtcggcg tactcgctgc ggcaggactg 660

tatgcgctgg aacacaactc gagccgcctg cacgtcgacc acgagcacgc cgcgcggctg 720

gcgcagggcc tcgggcggct gggccttccg gtcgagcatc acaccaacat ggtgttcgtg 780

cgcgtcggtg ctgacgcgga ggcgctggcg gggcatctgg agcgtcacgg agtcttggta 840

ctggcggagc cacgtatgcg gctcgtcacg catctcgatg tcgacgcggc ggggatcgat 900

cgagcggtcg aggcgtttac ggcttttcgc tggtcacgat aa 942

<210> 224

<211> 313

<212> PRT

<213> Unknown

<220>

<223> ltaE_20 sequence from unknown bacterial species from environmental sample

<400> 224

Met Ala Glu Ala Glu Val Gly Asp Asp Val Phe Gly Asp Asp Pro Thr

1 5 10 15

Val Asn Arg Leu Gln Ala Arg Ala Ala Glu Val Leu Gly Phe Glu Ala

20 25 30

Ala Leu Ile Phe Pro Ser Gly Thr Gln Ser Asn Leu Ala Ala Leu Met

35 40 45

Thr His Cys Gln Arg Gly Asp Glu Met Ile Val Gly Gln Glu Ala His

50 55 60

Thr Tyr Val His Glu Ala Gly Gly Thr Ser Val Leu Gly Ala Ile His

65 70 75 80

Pro His Val Val Pro Asn Leu Pro Asp Gly Thr Leu Asp Leu Ser Asp

85 90 95

Val Glu Ala Ala Ile Lys Pro Asp Asp Pro His Tyr Pro Arg Thr Arg

100 105 110

Leu Leu Ala Leu Glu Asn Thr Ile Gly Gly Arg Ala Val Pro Arg Ala

115 120 125

Tyr Leu Glu Arg Ala Val Gly Leu Ala Arg Arg Arg Arg Leu Ala Thr

130 135 140

His Leu Asp Gly Ala Arg Ile Phe Asn Ala Ala Val Ala Leu Asn Thr

145 150 155 160

Asp Val Arg Asn Leu Cys Ala Gly Phe Asp Ser Val Ser Val Cys Leu

165 170 175

Ser Lys Gly Leu Gly Ala Pro Val Gly Thr Leu Leu Leu Gly Ser Ala

180 185 190

Asp Phe Ile Ala Arg Ala Thr Arg Val Arg Lys Ile Leu Gly Gly Gly

195 200 205

Met Arg Gln Val Gly Val Leu Ala Ala Ala Gly Leu Tyr Ala Leu Glu

210 215 220

His Asn Ser Ser Arg Leu His Val Asp His Glu His Ala Ala Arg Leu

225 230 235 240

Ala Gln Gly Leu Gly Arg Leu Gly Leu Pro Val Glu His His Thr Asn

245 250 255

Met Val Phe Val Arg Val Gly Ala Asp Ala Glu Ala Leu Ala Gly His

260 265 270

Leu Glu Arg His Gly Val Leu Val Leu Ala Glu Pro Arg Met Arg Leu

275 280 285

Val Thr His Leu Asp Val Asp Ala Ala Gly Ile Asp Arg Ala Val Glu

290 295 300

Ala Phe Thr Ala Phe Arg Trp Ser Arg

305 310

<210> 225

<211> 1029

<212> DNA

<213> Unknown

<220>

<223> ltaE_21 sequence from unknown bacterial species from environmental sample

<400> 225

gtgagcacga tcgaccttcg cagcgacacc atcacgcggc ccggccccgt catgcgtcgc 60

gccatggccg aggcggaagt gggcgacgac gtcttcggcg acgaccccac cgtcaaccgc 120

ctccaggacg cgtgcgccga gcggttcggg atggaggccg ggctgctgtt tcccaccggc 180

acgcagtcca atctcgccgc cctgatgtcc cactgcgccc gcggcgagga ggtgatcgtc 240

gggcaggagg cccacaccta ccggtacgag gccggcggca tggccgtcct cggctcgatc 300

cagccgcagc cgctccagaa ccggtccgag ggcacgctcg acctggccga ggtggaggcg 360

gcgatcaagc cggacgaccc ccacttcgcg gtgacgaaac tggttgccct ggagaacacg 420

atcggcggta aggtcctgcc tcgggcctac ttggccgacg cggtcgccct ggcgcgccgc 480

cggggacttt cgctgcacct cgacggcgcc cgcgtcttca atgcggcggt gaagctcggc 540

gtgccggtgg accggctgtg cgaggggttc gacacggtgt cggtgtgcct ctcgaaggga 600

ctgggcgcgc ccgcggggac agtgctcgtc ggccgccgcg acgtcatcga ccgcgcgaag 660

cgcgtgcgga agatgctggg cggcacgatg cgccagtccg gcgtcctcgc cgccgcgggc 720

ctctacgcgc tggcgcacca cgtggaccgc ctcgccgaag accacgccaa cgcccagcgc 780

ctcggccggg cgctcgaagg gctggggctg agggtggagc cggtgcagac gaacatggtc 840

ttcgtccacg tcccccgcga gtccgagacg gcgctgcgcg cccacctcgc gtcgcgaggc 900

gtgatgaccc tccccggccc gcggctgcgg cttgtcaccc acctcgacgt cgacgccgcc 960

ggaatcgacc gcgccgtcga agccttcgcc agtttcttcc gggcggcgaa tctgaaatct 1020

caaatttga 1029

<210> 226

<211> 342

<212> PRT

<213> Unknown

<220>

<223> ltaE_21 sequence from unknown bacterial species from environmental sample

<400> 226

Val Ser Thr Ile Asp Leu Arg Ser Asp Thr Ile Thr Arg Pro Gly Pro

1 5 10 15

Val Met Arg Arg Ala Met Ala Glu Ala Glu Val Gly Asp Asp Val Phe

20 25 30

Gly Asp Asp Pro Thr Val Asn Arg Leu Gln Asp Ala Cys Ala Glu Arg

35 40 45

Phe Gly Met Glu Ala Gly Leu Leu Phe Pro Thr Gly Thr Gln Ser Asn

50 55 60

Leu Ala Ala Leu Met Ser His Cys Ala Arg Gly Glu Glu Val Ile Val

65 70 75 80

Gly Gln Glu Ala His Thr Tyr Arg Tyr Glu Ala Gly Gly Met Ala Val

85 90 95

Leu Gly Ser Ile Gln Pro Gln Pro Leu Gln Asn Arg Ser Glu Gly Thr

100 105 110

Leu Asp Leu Ala Glu Val Glu Ala Ala Ile Lys Pro Asp Asp Pro His

115 120 125

Phe Ala Val Thr Lys Leu Val Ala Leu Glu Asn Thr Ile Gly Gly Lys

130 135 140

Val Leu Pro Arg Ala Tyr Leu Ala Asp Ala Val Ala Leu Ala Arg Arg

145 150 155 160

Arg Gly Leu Ser Leu His Leu Asp Gly Ala Arg Val Phe Asn Ala Ala

165 170 175

Val Lys Leu Gly Val Pro Val Asp Arg Leu Cys Glu Gly Phe Asp Thr

180 185 190

Val Ser Val Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr Val

195 200 205

Leu Val Gly Arg Arg Asp Val Ile Asp Arg Ala Lys Arg Val Arg Lys

210 215 220

Met Leu Gly Gly Thr Met Arg Gln Ser Gly Val Leu Ala Ala Ala Gly

225 230 235 240

Leu Tyr Ala Leu Ala His His Val Asp Arg Leu Ala Glu Asp His Ala

245 250 255

Asn Ala Gln Arg Leu Gly Arg Ala Leu Glu Gly Leu Gly Leu Arg Val

260 265 270

Glu Pro Val Gln Thr Asn Met Val Phe Val His Val Pro Arg Glu Ser

275 280 285

Glu Thr Ala Leu Arg Ala His Leu Ala Ser Arg Gly Val Met Thr Leu

290 295 300

Pro Gly Pro Arg Leu Arg Leu Val Thr His Leu Asp Val Asp Ala Ala

305 310 315 320

Gly Ile Asp Arg Ala Val Glu Ala Phe Ala Ser Phe Phe Arg Ala Ala

325 330 335

Asn Leu Lys Ser Gln Ile

340

<210> 227

<211> 1023

<212> DNA

<213> Unknown

<220>

<223> ltaE_22 sequence from unknown bacterial species from environmental sample

<400> 227

atgagcgcgc ccgtcgatct tcgctccgat accgtcacgc gcccctcgcc cgggatgcgc 60

aaggcgatgc tggaagccga gctcggcgac gacgtgttcg gcgacgaccc gaccgtcaac 120

cgcctccagg cgcgcgccgc cgagatcttc ggcttcgagg cggcgctgct ctttccctcc 180

ggcacgcaat cgaacctcgc cgcgctcatg agccactgcc agcgcggcga cgaggtgatc 240

ctcggcatgg aggcgcacag ctaccgctac gaggcgggcg gcctctcggt gctcggctcg 300

attcagccgc aggcgattcc caaccgtccg gacgggaccc tcgacctggc cgaggttgaa 360

gccgcgatca agcccgacga cccgcacttt gcccgctccc ggttgctcgc tctggaaaac 420

acgatcaccg gccgcgttct caagcgtgag tacctgggca aggctgtgga gctggcaaga 480

cggaagaatc tctcgattca cctcgacggg gcgcgcgtct tcaacgccgc cacagcgctc 540

accatgaagg tgaaggagct gtgcgccggc ttcgactcgg tgtcgtcgtg cctctcgaag 600

gggctcggcg cgccggccgg caccgtcctg ctgggtaata gggatttcat tcaaaaagcc 660

aaacgggcaa gaaagatcct cggcggcggg atgcggcagg cgggcgtgat cgccgccgcg 720

ggcctctacg cgctcgagaa caacgtcgag cggctgcgcg aggaccacga gaacgccgag 780

cgcctcgcgc gcggactgcg cgagctcggg ctcgaagccc agctcaacac gaacatggtg 840

ctcctgaaga ttgagccggc aaaagcacaa ccactagccg agagacttct tcagtctaag 900

attctcgtat tgccgcgcgc gccgatgcgg ctggtgacgc atcttgacgt cgacaaggca 960

ggaattgatc gcgcgctatc ggcgttccgc gcgttcttct cgcaaaacga attacgaccc 1020

tga 1023

<210> 228

<211> 340

<212> PRT

<213> Unknown

<220>

<223> ltaE_22 sequence from unknown bacterial species from environmental sample

<400> 228

Met Ser Ala Pro Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Ser

1 5 10 15

Pro Gly Met Arg Lys Ala Met Leu Glu Ala Glu Leu Gly Asp Asp Val

20 25 30

Phe Gly Asp Asp Pro Thr Val Asn Arg Leu Gln Ala Arg Ala Ala Glu

35 40 45

Ile Phe Gly Phe Glu Ala Ala Leu Leu Phe Pro Ser Gly Thr Gln Ser

50 55 60

Asn Leu Ala Ala Leu Met Ser His Cys Gln Arg Gly Asp Glu Val Ile

65 70 75 80

Leu Gly Met Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Leu Ser

85 90 95

Val Leu Gly Ser Ile Gln Pro Gln Ala Ile Pro Asn Arg Pro Asp Gly

100 105 110

Thr Leu Asp Leu Ala Glu Val Glu Ala Ala Ile Lys Pro Asp Asp Pro

115 120 125

His Phe Ala Arg Ser Arg Leu Leu Ala Leu Glu Asn Thr Ile Thr Gly

130 135 140

Arg Val Leu Lys Arg Glu Tyr Leu Gly Lys Ala Val Glu Leu Ala Arg

145 150 155 160

Arg Lys Asn Leu Ser Ile His Leu Asp Gly Ala Arg Val Phe Asn Ala

165 170 175

Ala Thr Ala Leu Thr Met Lys Val Lys Glu Leu Cys Ala Gly Phe Asp

180 185 190

Ser Val Ser Ser Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr

195 200 205

Val Leu Leu Gly Asn Arg Asp Phe Ile Gln Lys Ala Lys Arg Ala Arg

210 215 220

Lys Ile Leu Gly Gly Gly Met Arg Gln Ala Gly Val Ile Ala Ala Ala

225 230 235 240

Gly Leu Tyr Ala Leu Glu Asn Asn Val Glu Arg Leu Arg Glu Asp His

245 250 255

Glu Asn Ala Glu Arg Leu Ala Arg Gly Leu Arg Glu Leu Gly Leu Glu

260 265 270

Ala Gln Leu Asn Thr Asn Met Val Leu Leu Lys Ile Glu Pro Ala Lys

275 280 285

Ala Gln Pro Leu Ala Glu Arg Leu Leu Gln Ser Lys Ile Leu Val Leu

290 295 300

Pro Arg Ala Pro Met Arg Leu Val Thr His Leu Asp Val Asp Lys Ala

305 310 315 320

Gly Ile Asp Arg Ala Leu Ser Ala Phe Arg Ala Phe Phe Ser Gln Asn

325 330 335

Glu Leu Arg Pro

340

<210> 229

<211> 1017

<212> DNA

<213> Unknown

<220>

<223> ltaE_23 sequence from unknown bacterial species from environmental sample

<400> 229

atgacggttg acctacgctc cgataccgtc acgcgtcctg gcactggaat gcgcaaggcg 60

atgatggaag ccgagctcgg cgacgatgtg ttcggcgacg acccgaccgt caaccgcctg 120

caggcgcgcg ccgccgagat cttcggcttc gaggcggcgc tcctctttcc ctccggcacg 180

caatcgaacc tggccgcgct catgagccat tgccagcgcg gcgacgaggt gatcctcggc 240

atggaggcgc atagctaccg ctacgaggcg ggcggcctct cggtgctcgg ctcgatccag 300

ccgcaggcga ttcccaaccg tccggacggc acgctcgatc tcgcggaagt ggaagcggcg 360

atcaagcccg acgatccgca cttcgcgcga acacggttgc tggcgcttga gaacaccatc 420

accggccgcg tgctctcgag aagttatctg gaacaggcca tcgggctggc gaagaaaaaa 480

aacctctcga ttcacctcga cggggctcgc gtcttcaacg ccgcaagcgc gctcaagatg 540

ccggtgaaag acctttgcgc cggattcgac tcggtgtcgt cgtgtctctc gaagggcctc 600

ggtgcgcccg ccggcacagt gctcttaggt agtaaacctt tcatagaaaa agcaaagcgc 660

gcacgcaaga tcctcggcgg cggcatgcgg caggcgggcg tgatcgcggc ggcggggctt 720

tatgcgctcg agcacaacgt tgaaagattg aataccgatc atgagaacgc ggagcgcctt 780

gcgcgcgggc tgcgcgagct cgggctcgag gcccagctca acacgaacat ggtgttgttg 840

cgacttcccg ccgagagggc ggcgccactg gaagcacacc tgaagaaaca tgatgtgctg 900

gtgctgccgc gcgcgccgat gcggctcgtg acgcacctcg acgtcgaccg ggccggcatc 960

gatcgggcgc tcgccgggtt ccgcgcgttc ttcacgcaga aagaattacg accctga 1017

<210> 230

<211> 338

<212> PRT

<213> Unknown

<220>

<223> ltaE_23 sequence from an unknown bacterial species from an environmental sample

<400> 230

Met Thr Val Asp Leu Arg Ser Asp Thr Val Thr Arg Pro Gly Thr Gly

1 5 10 15

Met Arg Lys Ala Met Met Glu Ala Glu Leu Gly Asp Asp Val Phe Gly

20 25 30

Asp Asp Pro Thr Val Asn Arg Leu Gln Ala Arg Ala Ala Glu Ile Phe

35 40 45

Gly Phe Glu Ala Ala Leu Leu Phe Pro Ser Gly Thr Gln Ser Asn Leu

50 55 60

Ala Ala Leu Met Ser His Cys Gln Arg Gly Asp Glu Val Ile Leu Gly

65 70 75 80

Met Glu Ala His Ser Tyr Arg Tyr Glu Ala Gly Gly Leu Ser Val Leu

85 90 95

Gly Ser Ile Gln Pro Gln Ala Ile Pro Asn Arg Pro Asp Gly Thr Leu

100 105 110

Asp Leu Ala Glu Val Glu Ala Ala Ile Lys Pro Asp Asp Pro His Phe

115 120 125

Ala Arg Thr Arg Leu Leu Ala Leu Glu Asn Thr Ile Thr Gly Arg Val

130 135 140

Leu Ser Arg Ser Tyr Leu Glu Gln Ala Ile Gly Leu Ala Lys Lys Lys

145 150 155 160

Asn Leu Ser Ile His Leu Asp Gly Ala Arg Val Phe Asn Ala Ala Ser

165 170 175

Ala Leu Lys Met Pro Val Lys Asp Leu Cys Ala Gly Phe Asp Ser Val

180 185 190

Ser Ser Cys Leu Ser Lys Gly Leu Gly Ala Pro Ala Gly Thr Val Leu

195 200 205

Leu Gly Ser Lys Pro Phe Ile Glu Lys Ala Lys Arg Ala Arg Lys Ile

210 215 220

Leu Gly Gly Gly Met Arg Gln Ala Gly Val Ile Ala Ala Ala Gly Leu

225 230 235 240

Tyr Ala Leu Glu His Asn Val Glu Arg Leu Asn Thr Asp His Glu Asn

245 250 255

Ala Glu Arg Leu Ala Arg Gly Leu Arg Glu Leu Gly Leu Glu Ala Gln

260 265 270

Leu Asn Thr Asn Met Val Leu Leu Arg Leu Pro Ala Glu Arg Ala Ala

275 280 285

Pro Leu Glu Ala His Leu Lys Lys His Asp Val Leu Val Leu Pro Arg

290 295 300

Ala Pro Met Arg Leu Val Thr His Leu Asp Val Asp Arg Ala Gly Ile

305 310 315 320

Asp Arg Ala Leu Ala Gly Phe Arg Ala Phe Phe Thr Gln Lys Glu Leu

325 330 335

Arg Pro

<210> 231

<211> 954

<212> DNA

<213> Unknown

<220>

<223> ltaE_24 sequence from an unknown bacterial species from an environmental sample

<400> 231

atggaagccg aactcggcga cgacgtcttc ggcgaagacc cgaccgtcaa ccgcctgcag 60

gcgcgcgcgg ccgagatgtt cggcttcgag gtggcgctcc tctttccctc cggcacgcaa 120

tcgaacctgg ccgcgctcat gagccactgc cagcgcggcg acgaggtgat cctcgggatg 180

gaggcgcaca gttaccgcta cgaagcgggc ggcctctcgg tgctcggctc gatccagccg 240

caggcgatcc ccaatcgccc cgacggcacg ctcgatctcg ccgaagtgga agccgcgatc 300

aagcccgacg atccgcactt cgcgcgcacc cgcttgctcg ctttggaaaa cacgatcacg 360

ggccgcgtgc tctcaagaag ttatctggaa caggccatcg gggtggcgaa gaaaaaaaac 420

ctctcgattc acctcgacgg cgcgcgcgtc ttcaatgctg ccacgcagct caagatgaag 480

gtaaaggacc tctgcgcggg cttcgactcg gtgtcctcgt gcctctcgaa ggggctcggt 540

gcgcccgccg gcacagtgct cttaggtagc aaagctttca tggaaaaagc aaagcgggca 600

agaaaaatcc tcggcggcgg gatgcggcag gcaggcgtga tcgccgccgc gggactctac 660

gcgctcgaga acaacgtcga gcgcctgcgc gaggaccacg agaacgccga gcgccttgcg 720

cgcgggctgc gcgaggtcgg gctcgaggcg cagctcaaca ccaacatggt tcttctcaag 780

atcccagtgg ataaagctgc accgctggaa gcgcatatga agaggaacaa tgtgctcgtg 840

ctgccgcgcg cgccgatgcg gctcgtgacg cacctcgacg tcgaccgggc cggcatcgat 900

cgcgcgctcg ccgggttccg cgcgttcttc gcgcagaaag aattacgacc ctga 954

<210> 232

<211> 317

<212> PRT

<213> Unknown

<220>

<223> ltaE_24 sequence from an unknown bacterial species from an environmental sample

<400> 232

Met Glu Ala Glu Leu Gly Asp Asp Val Phe Gly Glu Asp Pro Thr Val

1 5 10 15

Asn Arg Leu Gln Ala Arg Ala Ala Glu Met Phe Gly Phe Glu Val Ala

20 25 30

Leu Leu Phe Pro Ser Gly Thr Gln Ser Asn Leu Ala Ala Leu Met Ser

35 40 45

His Cys Gln Arg Gly Asp Glu Val Ile Leu Gly Met Glu Ala His Ser

50 55 60

Tyr Arg Tyr Glu Ala Gly Gly Leu Ser Val Leu Gly Ser Ile Gln Pro

65 70 75 80

Gln Ala Ile Pro Asn Arg Pro Asp Gly Thr Leu Asp Leu Ala Glu Val

85 90 95

Glu Ala Ala Ile Lys Pro Asp Asp Pro His Phe Ala Arg Thr Arg Leu

100 105 110

Leu Ala Leu Glu Asn Thr Ile Thr Gly Arg Val Leu Ser Arg Ser Tyr

115 120 125

Leu Glu Gln Ala Ile Gly Val Ala Lys Lys Lys Asn Leu Ser Ile His

130 135 140

Leu Asp Gly Ala Arg Val Phe Asn Ala Ala Thr Gln Leu Lys Met Lys

145 150 155 160

Val Lys Asp Leu Cys Ala Gly Phe Asp Ser Val Ser Ser Cys Leu Ser

165 170 175

Lys Gly Leu Gly Ala Pro Ala Gly Thr Val Leu Leu Gly Ser Lys Ala

180 185 190

Phe Met Glu Lys Ala Lys Arg Ala Arg Lys Ile Leu Gly Gly Gly Met

195 200 205

Arg Gln Ala Gly Val Ile Ala Ala Ala Gly Leu Tyr Ala Leu Glu Asn

210 215 220

Asn Val Glu Arg Leu Arg Glu Asp His Glu Asn Ala Glu Arg Leu Ala

225 230 235 240

Arg Gly Leu Arg Glu Val Gly Leu Glu Ala Gln Leu Asn Thr Asn Met

245 250 255

Val Leu Leu Lys Ile Pro Val Asp Lys Ala Ala Pro Leu Glu Ala His

260 265 270

Met Lys Arg Asn Asn Val Leu Val Leu Pro Arg Ala Pro Met Arg Leu

275 280 285

Val Thr His Leu Asp Val Asp Arg Ala Gly Ile Asp Arg Ala Leu Ala

290 295 300

Gly Phe Arg Ala Phe Phe Ala Gln Lys Glu Leu Arg Pro

305 310 315

<210> 233

<211> 102

<212> PRT

<213> Artificial Sequence

<220>

<223> Strain 331829 gapA truncation

<400> 233

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Asn Glu Gly Leu Arg Gln His Arg Gln Gly Cys Phe Leu Val

65 70 75 80

Arg Gln Arg Val Gly Leu His Leu Pro Ala Pro Ala Ser Asp Arg Ala

85 90 95

Arg Ser Phe Gln Ala Leu

100

<210> 234

<211> 71

<212> PRT

<213> Artificial Sequence

<220>

<223> Strain 331831 gapA truncation

<400> 234

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu

50 55 60

Lys Phe Asp Ser Ile Ser Arg

65 70

<210> 235

<211> 93

<212> PRT

<213> Artificial Sequence

<220>

<223> Strain 331897 gapA truncation

<400> 235

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala Thr Thr Leu Thr Ser

85 90

<210> 236

<211> 33

<212> PRT

<213> Artificial Sequence

<220>

<223> Strain 331904 gapA truncation

<400> 236

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Val Ala Gly Ala Lys Lys Val Ile Ile Ser Arg Cys Lys Arg

20 25 30

Gly

<210> 237

<211> 63

<212> DNA

<213> Artificial Sequence

<220>

<223> pMB038

<400> 237

accgtgcgtg ttgactattt tacctctggc ggtgatactg gttgcatgta ctaaggaggt 60

tgt 63

<210> 238

<211> 57

<212> DNA

<213> Escherichia coli

<400> 238

ggaaacacag aaaaaagccc gcacctgaca gtgcgggctt tttttttcga ccaaagg 57

<210> 239

<211> 5953

<212> DNA

<213> Artificial Sequence

<220>

<223> Linear p15A plasmid backbone

<400> 239

ggcgtatcac gaggcccttt cgtctcgcgc gtttcggtga tgacggtgaa aacctctgac 60

acatgcagct cccggagacg gtcacagctt gtctgtaagc ggatgccggg agcagacaag 120

cccgtcaggg cgcgtcagcg ggtgttggcg ggtgtcgggg ctggcttaac tatgcggcat 180

cagagcagat tgtactgaga gtgcaccata ccactttttc gtgacgcgcg gttttgaaaa 240

catagacaag tttttggcgt cgttgttaat ttcgaagagg atgtccaata ttttttttaa 300

ggaataagga tacttcaaga ctagattccc ccctgcattc ccatcagaac cgtaaacctt 360

ggcgctttcc ttgggaagta ttcaagaagt gccttgtccg gtttctgtgg ctcacaaacc 420

agcgcgcccg atatggcttt cttttcactt atgaatgtac cagtacggga caattagaac 480

gctcctgtaa caatctcttt gcaaatgtgg ggttacattc taaccatgtc acactgctga 540

cgaaattcaa agtaaaaaaa aatgggacca cgtcttgaga acgatagatt ttctttattt 600

tacattgaac agtcgttgtc tcagcgcgct ttatgttttc attcatactt catattataa 660

aataacaaaa gaagaatttc atattcacgc ccaagaaatc aggctgcttt ccaaatgcaa 720

ttgacacttc attagccatc acacaaaact ctttcttgct ggagcttctt ttaaaaaaga 780

cctcagtaca ccaaacacgt tacccgacct cgttatttta cgacaactat gataaaattc 840

tgaagaaaaa ataaaaaaat tttcatactt cttgctttta tttaaaccat tgaatgattt 900

cttttgaaca aaactacctg tttcaccaaa ggaaatagaa agaaaaaatc aattagaaga 960

aaacaaaaaa caaaatgtct gttattaatt tcacaggtag ttctggtcca ttggtgaaag 1020

tttgcggctt gcagagcaca gaggccgcag aatgtgctct agattccgat gctgacttgc 1080

tgggtattat atgtgtgccc aatagaaaga gaacaattga cccggttatt gcaaggaaaa 1140

tttcaagtct tgtaaaagca tataaaaata gttcaggcac tccgaaatac ttggttggcg 1200

tgtttcgtaa tcaacctaag gaggatgttt tggctctggt caatgattac ggcattgata 1260

tcgtccaact gcatggagat gagtcgtggc aagaatacca agagttcctc ggtttgccag 1320

ttattaaaag actcgtattt ccaaaagact gcaacatact actcagtgca gcttcacaga 1380

aacctcattc gtttattccc ttgtttgatt cagaagcagg tgggacaggt gaacttttgg 1440

attggaactc gatttctgac tgggttggaa ggcaagagag ccccgaaagc ttacatttta 1500

tgttagctgg tggactgacg ccagaaaatg ttggtgatgc gcttagatta aatggcgtta 1560

ttggtgttga tgtaagcgga ggtgtggaga caaatggtgt aaaagactct aacaaaatag 1620

caaatttcgt caaaaatgct aagaaatagg ttattactga gtagtattta tttaagtatt 1680

gtttgtgcac ttgcctgcag gccttttgaa aagcaagcat aaaagatcta aacataaaat 1740

ctgtaaaata acaagatgta aagataatgc taaatcattt ggctttttga ttgattgtac 1800

aggactgggt ggaatccctt ctgcagcacc tggattaccc tgttatccct agtcatggtc 1860

gtcacagagc tggaagcggc agcgagaatt atcgcgatcg tggcggtgcc cgcaggcatg 1920

acaaacatcg taaatgccgc gtttcgtgtg ccgtggccgc ccaggacgtg tcagcgccgc 1980

caccacctgc accgaatcgg cagcagcgtc gcgcgtcgaa aaagcgcaca ggcggcaaga 2040

agcgataagc tgcacgaata cctgaaaaat gttgaacgcc ccgtgagcgg taactcacag 2100

ggcgtcggct aacccccagt ccaaacctgg gagaaagcgc tcaaaaatga ctctagcgga 2160

ttcacgagac attgacacac cggcctggaa attttccgct gatctgttcg acacccatcc 2220

cgagctcgcg ctgcgatcac gtggctggac gagcgaagac cgccgcgaat tcctcgctca 2280

cctgggcaga gaaaatttcc agggcagcaa gacccgcgac ttcgccagcg cttggatcaa 2340

agacccggac acggagaaac acagccgaag ttataccgag ttggttcaaa atcgcttgcc 2400

cggtgccagt atgttgctct gacgcacgcg cagcacgcag ccgtgcttgt cctggacatt 2460

gatgtgccga gccaccaggc cggcgggaaa atcgagcacg taaaccccga ggtctacgcg 2520

attttggagc gctgggcacg cctggaaaaa gcgccagctt ggatcggcgt gaatccactg 2580

agcgggaaat gccagctcat ctggctcatt gatccggtgt atgccgcagc aggcatgagc 2640

agcccgaata tgcgcctgct ggctgcaacg accgaggaaa tgacccgcgt tttcggcgct 2700

gaccaggctt tttcacatag gctgagccgt ggccactgca ctctccgacg atcccagccg 2760

taccgctggc atgcccagca caatcgcgtg gatcgcctag ctgatcttat ggaggttgct 2820

cgcatgatct caggcacaga aaaacctaaa aaacgctatg agcaggagtt ttctagcgga 2880

cgggcacgta tcgaagcggc aagaaaagcc actgcggaag caaaagcact tgccacgctt 2940

gaagcaagcc tgccgagcgc cgctgaagcg tctggagagc tgatcgacgg cgtccgtgtc 3000

ctctggactg ctccagggcg tgccgcccgt gatgagacgg cttttcgcca cgctttgact 3060

gtgggatacc agttaaaagc ggctggtgag cgcctaaaag acaccaaggg tcatcgagcc 3120

tacgagcgtg cctacaccgt cgctcaggcg gtcggaggag gccgtgagcc tgatctgccg 3180

ccggactgtg accgccagac ggattggccg cgacgtgtgc gcggctacgt cgctaaaggc 3240

cagccagtcg tccctgctcg tcagacagag acgcagagcc agccgaggcg aaaagctctg 3300

gccactatgg gaagacgtgg cggtaaaaag gccgcagaac gctggaaaga cccaaacagt 3360

gagtacgccc gagcacagcg agaaaaacta gctaagtcca gtcaacgaca agctaggaaa 3420

gctaaaggaa atcgcttgac cattgcaggt tggtttatga ctgttgaggg agagactggc 3480

tcgtggccga caatcaatga agctatgtct gaatttagcg tgtcacgtca gaccgtgaat 3540

agagcactta aggtctgcgg gcattgaact tccacgagga cgccgaaagc ttcccagtaa 3600

atgtgccatc tcgtaggcag aaaacggttc ccccgtaggg tctctctctt ggcctccttt 3660

ctaggtcggg ctgattgctc ttgaagctct ctaggggggc tcacaccata ggcagataac 3720

gttccccacc ggctcgcctc gtaagcgcac aaggactgct cccaaataat gagtagtcct 3780

catctccctc aagcaggcgc cggcggtact gccatcctcg agactaccta gctgcatttt 3840

caggaggaag cgatgggcgg ccgcacacct tcttaataag atgatcttct tgagatcgtt 3900

ttggtctgcg cgtaatctct tgctctgaaa acgaaaaaac cgccttgcag ggcggttttt 3960

cgaaggttct ctgagctacc aactctttga accgaggtaa ctggcttgga ggagcgcagt 4020

caccaaaact tgtcctttca gtttagcctt aaccggcgca tgacttcaag actaactcct 4080

ctaaatcaat taccagtggc tgctgccagt ggtgcttttg catgtctttc cgggttggac 4140

tcaagacgat agttaccgga taaggcgcag cggtcggact gaacgggggg ttcgtgcata 4200

cagtccagct tggagcgaac tgcctacccg gaactgagtg tcaggcgtgg aatgagacaa 4260

acgcggccat aacagcggaa tgacaccggt aaaccgaaag gcaggaacag gagagcgcac 4320

gagggagccg ccaggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccaccac 4380

tgatttgagc gtcagatttc gtgatgcttg tcaggggggc ggagcctatg gaaaaacggc 4440

tttgccgcgg ccctctcact tccctgttaa gtatcttcct ggcatcttcc aggaaatctc 4500

cgccccgttc gtaagccatt tccgctcgcc gcagtcgaac gaccgagcgt agcgagtcag 4560

tgagcgagga agcggaatat atcctgtatc acatattctg ctgacgcacc ggtgcagcct 4620

tttttctcct gccacatgaa gcacttcact gacaccctca tcagtgccaa catagtaagc 4680

cagtatacac tccgctatga taatgggtga gtgagtgtgt gcgtgtgggg cgcgccagat 4740

gggaacactt agctttgacc tgcacaaata gttgcaaatt gtcccacata cacataaagt 4800

agcttgcgta tttaaaatta tgaacctaag gggtttagca cttcacgctg ccgcaagcac 4860

tcagggcgca agggctgcta aaggaagcgg aacacgtaga aagccagtcc gcagaaacgg 4920

tgctgacccc ggatgaatgt cagctactgg gctatctgga caagggaaaa cgcaagcgca 4980

aagagaaagc aggtagcttg cagtgggctt acatggcgat agctagactg ggcggtttta 5040

tggacagcaa gcgaaccgga attgccagct ggggcgccct ctggtaaggt tgggaagccc 5100

tgcaaagtaa actggatggc tttcttgccg ccaaggatct gatggcgcag gggatcaaga 5160

tctgatcaag agacaggatg aggatcgttt cgcatggaga aaaagatcac gggctacact 5220

accgtggaca tctcgcaatg gcatcgcaag gaacacttcg aggcatttca aagcgtggca 5280

caatgtactt acaaccagac cgtccagctg gatattaccg cgtttttgaa gaccgttaag 5340

aaaaacaagc acaagtttta tccagccttt atccatattc tcgcccgctt gatgaatgcg 5400

caccccgaat ttcgtatggc catgaaagat ggtgagctcg ttatctggga ctcagtccat 5460

ccatgctata ccgttttcca cgaacaaact gaaacttttt cttcgctgtg gtccgaatat 5520

cacgatgatt tccgccaatt tttgcatatc tacagccaag atgtcgcgtg ctatggtgaa 5580

aacctggctt actttcccaa gggattcatc gagaatatgt tctttgtttc agcaaacccc 5640

tgggtgtcct tcacgtcgtt tgacttgaac gtggccaata tggataattt cttcgctcca 5700

gttttcacca tgggtaagta ctatacccaa ggagacaagg tccttatgcc acttgcaatc 5760

caagtacacc acgcagtctg cgatggtttc catgtgggac gcatgcttaa cgaactccaa 5820

cagtactgtg atgaatggca aggcggcgcg tagcccccca accgaagttg aggggatttt 5880

tactccagtc tttctagaag atggcaaaca gctattatgg gtattatggg tgctccccga 5940

aaagtgccac ctg 5953

<210> 240

<211> 3423

<212> DNA

<213> Corynebacterium glutamicum

<400> 240

gtgtcgactc acacatcttc aacgcttcca gcattcaaaa agatcttggt agcaaaccgc 60

ggcgaaatcg cggtccgtgc tttccgtgca gcactcgaaa ccggtgcagc cacggtagct 120

atttaccccc gtgaagatcg gggatcattc caccgctctt ttgcttctga agctgtccgc 180

attggtactg aaggctcacc agtcaaggcg tacctggaca tcgatgaaat tatcggtgca 240

gctaaaaaag ttaaagcaga tgctatttac ccgggatatg gcttcctgtc tgaaaatgcc 300

cagcttgccc gcgagtgcgc ggaaaacggc attactttta ttggcccaac cccagaggtt 360

cttgatctca ccggtgataa gtctcgtgcg gtaaccgccg cgaagaaggc tggtctgcca 420

gttttggcgg aatccacccc gagcaaaaac atcgatgaca tcgttaaaag cgctgaaggc 480

cagacttacc ccatctttgt aaaggcagtt gccggtggtg gcggacgcgg tatgcgcttt 540

gtttcttcac ctgatgagct ccgcaaattg gcaacagaag catctcgtga agctgaagcg 600

gcattcggcg acggttcggt atatgtcgaa cgtgctgtga ttaaccccca gcacattgaa 660

gtgcagatcc ttggcgatcg cactggagaa gttgtacacc tttatgaacg tgactgctca 720

ctgcagcgtc gtcaccaaaa agttgtcgaa attgcgccag cacagcattt ggatccagaa 780

ctgcgtgatc gcatttgtgc ggatgcagta aagttctgcc gctccattgg ttaccagggc 840

gcgggaaccg tggaattctt ggtcgatgaa aagggcaacc acgtcttcat cgaaatgaac 900

ccacgtatcc aggttgagca caccgtgact gaagaagtca ccgaggtgga cctggtgaag 960

gcgcagatgc gcttggctgc tggtgcaacc ttgaaggaat tgggtctgac ccaagataag 1020

atcaagaccc acggtgcagc actgcagtgc cgcatcacca cggaagatcc aaacaacggc 1080

ttccgcccag ataccggaac tatcaccgcg taccgctcac caggcggagc tggcgttcgt 1140

cttgacggtg cagctcagct cggtggcgaa atcaccgcac actttgactc catgctggtg 1200

aaaatgacct gccgtggttc cgattttgaa actgctgttg ctcgtgcaca gcgcgcgttg 1260

gctgagttca ccgtgtctgg tgttgcaacc aacattggtt tcttgcgtgc gttgctgcgt 1320

gaagaggact tcacttccaa gcgcatcgcc accggattta tcggcgatca cccacacctc 1380

cttcaggctc cacctgcgga tgatgagcag ggacgcatcc tggattactt ggcagatgtc 1440

accgtgaaca agcctcatgg tgtgcgtcca aaggatgttg cagcaccaat cgataagctg 1500

cccaacatca aggatctgcc actgccacgc ggttcccgtg accgcctgaa gcagcttgga 1560

ccagcagcgt ttgcccgcga tctccgtgag caggacgcac tggcagttac tgataccacc 1620

ttccgcgatg cacaccagtc tttgcttgcg acccgagtcc gctcattcgc actgaagcct 1680

gcggcagagg ccgtcgcaaa gctgactcct gagcttttgt ccgtggaggc ctggggcggt 1740

gcgacctacg atgtggcgat gcgtttcctc tttgaggatc cgtgggacag gctcgacgag 1800

ctgcgcgagg cgatgccgaa tgtgaacatt cagatgctgc ttcgcggccg caacaccgtg 1860

ggatacaccc catacccaga ctccgtctgt cgcgcgtttg ttaaggaagc tgccacctcc 1920

ggcgtggaca tcttccgcat cttcgacgcg cttaacgacg tctcccagat gcgtccagca 1980

atcgacgcag tcctggagac caacaccgcg gtcgctgaag tggctatggc ttattctggt 2040

gatctttccg atccgaatga aaagctctac accctggatt actacctgaa gatggcagag 2100

gagatcgtca agtctggcgc tcacattctg gctattaagg atatggctgg tctgcttcgc 2160

ccagctgcag ccaccaagct ggtcaccgca ctgcgccgtg aatttgatct gccagtgcac 2220

gtgcacaccc acgacactgc gggtggccag ctggcaacct actttgctgc agctcaagct 2280

ggtgcagatg ctgttgacgg tgcttccgca ccactgtctg gcaccacctc ccagccatcc 2340

ctgtctgcca ttgttgctgc attcgcgcac acccgtcgcg ataccggttt gagcctcgag 2400

gctgtttctg acctcgagcc atactgggaa gcagtgcgcg gactgtacct gccatttgag 2460

tctggaaccc caggcccaac cggtcgcgtc taccgccacg aaatcccagg cggacagttg 2520

tccaacctgc gtgcacaggc caccgcactg ggccttgcgg atcgtttcga actcatcgaa 2580

gacaactacg cggcagttaa tgagatgctg ggacgcccaa ccaaggtcac cccatcctcc 2640

aaggttgttg gcgacctcgc actccacctc gttggtgcgg gtgtggatcc agcagacttt 2700

gctgccgatc cacaaaagta cgacatccca gactctgtca tcgcgttcct gcgcggcgag 2760

cttggtaacc ctccaggtgg ctggccagag ccactgcgca cccgcgcact ggaaggccgc 2820

tccgaaggca aagcaccttt gacggaagtt cctgaggaag agcaggcgca cctcgacgct 2880

gatgattcca aggaacgtcg caacagcctc aaccgcctgc tgttcccgaa gccaactgaa 2940

gagttcctcg agcaccgtcg ccgcttcggc aacacctctg cgctggatga tcgtgaattc 3000

ttctacggcc tggtcgaagg ccgcgagact ttgatccgcc tgccagatgt gcgcacccca 3060

ctgcttgttc gcctggatgc gatctctgag ccagacgata agggtatgcg caatgttgtg 3120

gctaacgtca acggccagat ccgcccaatg cgtgtgcgtg accgctccgt tgagtctgtc 3180

accgcaaccg cagaaaaggc agattcctcc aacaagggcc atgttgctgc accattcgct 3240

ggtgttgtca ctgtgactgt tgctgaaggt gatgaggtca aggctggaga tgcagtcgca 3300

atcatcgagg ctatgaagat ggaagcaaca atcactgctt ctgttgacgg caaaatcgat 3360

cgcgttgtgg ttcctgctgc aacgaaggtg gaaggtggcg acttgatcgt cgtcatttcc 3420

taa 3423

<210> 241

<211> 1140

<212> PRT

<213> Corynebacterium glutamicum

<400> 241

Val Ser Thr His Thr Ser Ser Thr Leu Pro Ala Phe Lys Lys Ile Leu

1 5 10 15

Val Ala Asn Arg Gly Glu Ile Ala Val Arg Ala Phe Arg Ala Ala Leu

20 25 30

Glu Thr Gly Ala Ala Thr Val Ala Ile Tyr Pro Arg Glu Asp Arg Gly

35 40 45

Ser Phe His Arg Ser Phe Ala Ser Glu Ala Val Arg Ile Gly Thr Glu

50 55 60

Gly Ser Pro Val Lys Ala Tyr Leu Asp Ile Asp Glu Ile Ile Gly Ala

65 70 75 80

Ala Lys Lys Val Lys Ala Asp Ala Ile Tyr Pro Gly Tyr Gly Phe Leu

85 90 95

Ser Glu Asn Ala Gln Leu Ala Arg Glu Cys Ala Glu Asn Gly Ile Thr

100 105 110

Phe Ile Gly Pro Thr Pro Glu Val Leu Asp Leu Thr Gly Asp Lys Ser

115 120 125

Arg Ala Val Thr Ala Ala Lys Lys Ala Gly Leu Pro Val Leu Ala Glu

130 135 140

Ser Thr Pro Ser Lys Asn Ile Asp Asp Ile Val Lys Ser Ala Glu Gly

145 150 155 160

Gln Thr Tyr Pro Ile Phe Val Lys Ala Val Ala Gly Gly Gly Gly Arg

165 170 175

Gly Met Arg Phe Val Ser Ser Pro Asp Glu Leu Arg Lys Leu Ala Thr

180 185 190

Glu Ala Ser Arg Glu Ala Glu Ala Ala Phe Gly Asp Gly Ser Val Tyr

195 200 205

Val Glu Arg Ala Val Ile Asn Pro Gln His Ile Glu Val Gln Ile Leu

210 215 220

Gly Asp Arg Thr Gly Glu Val Val His Leu Tyr Glu Arg Asp Cys Ser

225 230 235 240

Leu Gln Arg Arg His Gln Lys Val Val Glu Ile Ala Pro Ala Gln His

245 250 255

Leu Asp Pro Glu Leu Arg Asp Arg Ile Cys Ala Asp Ala Val Lys Phe

260 265 270

Cys Arg Ser Ile Gly Tyr Gln Gly Ala Gly Thr Val Glu Phe Leu Val

275 280 285

Asp Glu Lys Gly Asn His Val Phe Ile Glu Met Asn Pro Arg Ile Gln

290 295 300

Val Glu His Thr Val Thr Glu Glu Val Thr Glu Val Asp Leu Val Lys

305 310 315 320

Ala Gln Met Arg Leu Ala Ala Gly Ala Thr Leu Lys Glu Leu Gly Leu

325 330 335

Thr Gln Asp Lys Ile Lys Thr His Gly Ala Ala Leu Gln Cys Arg Ile

340 345 350

Thr Thr Glu Asp Pro Asn Asn Gly Phe Arg Pro Asp Thr Gly Thr Ile

355 360 365

Thr Ala Tyr Arg Ser Pro Gly Gly Ala Gly Val Arg Leu Asp Gly Ala

370 375 380

Ala Gln Leu Gly Gly Glu Ile Thr Ala His Phe Asp Ser Met Leu Val

385 390 395 400

Lys Met Thr Cys Arg Gly Ser Asp Phe Glu Thr Ala Val Ala Arg Ala

405 410 415

Gln Arg Ala Leu Ala Glu Phe Thr Val Ser Gly Val Ala Thr Asn Ile

420 425 430

Gly Phe Leu Arg Ala Leu Leu Arg Glu Glu Asp Phe Thr Ser Lys Arg

435 440 445

Ile Ala Thr Gly Phe Ile Gly Asp His Pro His Leu Leu Gln Ala Pro

450 455 460

Pro Ala Asp Asp Glu Gln Gly Arg Ile Leu Asp Tyr Leu Ala Asp Val

465 470 475 480

Thr Val Asn Lys Pro His Gly Val Arg Pro Lys Asp Val Ala Ala Pro

485 490 495

Ile Asp Lys Leu Pro Asn Ile Lys Asp Leu Pro Leu Pro Arg Gly Ser

500 505 510

Arg Asp Arg Leu Lys Gln Leu Gly Pro Ala Ala Phe Ala Arg Asp Leu

515 520 525

Arg Glu Gln Asp Ala Leu Ala Val Thr Asp Thr Thr Phe Arg Asp Ala

530 535 540

His Gln Ser Leu Leu Ala Thr Arg Val Arg Ser Phe Ala Leu Lys Pro

545 550 555 560

Ala Ala Glu Ala Val Ala Lys Leu Thr Pro Glu Leu Leu Ser Val Glu

565 570 575

Ala Trp Gly Gly Ala Thr Tyr Asp Val Ala Met Arg Phe Leu Phe Glu

580 585 590

Asp Pro Trp Asp Arg Leu Asp Glu Leu Arg Glu Ala Met Pro Asn Val

595 600 605

Asn Ile Gln Met Leu Leu Arg Gly Arg Asn Thr Val Gly Tyr Thr Pro

610 615 620

Tyr Pro Asp Ser Val Cys Arg Ala Phe Val Lys Glu Ala Ala Thr Ser

625 630 635 640

Gly Val Asp Ile Phe Arg Ile Phe Asp Ala Leu Asn Asp Val Ser Gln

645 650 655

Met Arg Pro Ala Ile Asp Ala Val Leu Glu Thr Asn Thr Ala Val Ala

660 665 670

Glu Val Ala Met Ala Tyr Ser Gly Asp Leu Ser Asp Pro Asn Glu Lys

675 680 685

Leu Tyr Thr Leu Asp Tyr Tyr Leu Lys Met Ala Glu Glu Ile Val Lys

690 695 700

Ser Gly Ala His Ile Leu Ala Ile Lys Asp Met Ala Gly Leu Leu Arg

705 710 715 720

Pro Ala Ala Val Thr Lys Leu Val Thr Ala Leu Arg Arg Glu Phe Asp

725 730 735

Leu Pro Val His Val His Thr His Asp Thr Ala Gly Gly Gln Leu Ala

740 745 750

Thr Tyr Phe Ala Ala Ala Gln Ala Gly Ala Asp Ala Val Asp Gly Ala

755 760 765

Ser Ala Pro Leu Ser Gly Thr Thr Ser Gln Pro Ser Leu Ser Ala Ile

770 775 780

Val Ala Ala Phe Ala His Thr Arg Arg Asp Thr Gly Leu Ser Leu Glu

785 790 795 800

Ala Val Ser Asp Leu Glu Pro Tyr Trp Glu Ala Val Arg Gly Leu Tyr

805 810 815

Leu Pro Phe Glu Ser Gly Thr Pro Gly Pro Thr Gly Arg Val Tyr Arg

820 825 830

His Glu Ile Pro Gly Gly Gln Leu Ser Asn Leu Arg Ala Gln Ala Thr

835 840 845

Ala Leu Gly Leu Ala Asp Arg Phe Glu Leu Ile Glu Asp Asn Tyr Ala

850 855 860

Ala Val Asn Glu Met Leu Gly Arg Pro Thr Lys Val Thr Pro Ser Ser

865 870 875 880

Lys Val Val Gly Asp Leu Ala Leu His Leu Val Gly Ala Gly Val Asp

885 890 895

Pro Ala Asp Phe Ala Ala Asp Pro Gln Lys Tyr Asp Ile Pro Asp Ser

900 905 910

Val Ile Ala Phe Leu Arg Gly Glu Leu Gly Asn Pro Pro Gly Gly Trp

915 920 925

Pro Glu Pro Leu Arg Thr Arg Ala Leu Glu Gly Arg Ser Glu Gly Lys

930 935 940

Ala Pro Leu Thr Glu Val Pro Glu Glu Glu Gln Ala His Leu Asp Ala

945 950 955 960

Asp Asp Ser Lys Glu Arg Arg Asn Ser Leu Asn Arg Leu Leu Phe Pro

965 970 975

Lys Pro Thr Glu Glu Phe Leu Glu His Arg Arg Arg Phe Gly Asn Thr

980 985 990

Ser Ala Leu Asp Asp Arg Glu Phe Phe Tyr Gly Leu Val Glu Gly Arg

995 1000 1005

Glu Thr Leu Ile Arg Leu Pro Asp Val Arg Thr Pro Leu Leu Val

1010 1015 1020

Arg Leu Asp Ala Ile Ser Glu Pro Asp Asp Lys Gly Met Arg Asn

1025 1030 1035

Val Val Ala Asn Val Asn Gly Gln Ile Arg Pro Met Arg Val Arg

1040 1045 1050

Asp Arg Ser Val Glu Ser Val Thr Ala Thr Ala Glu Lys Ala Asp

1055 1060 1065

Ser Ser Asn Lys Gly His Val Ala Ala Pro Phe Ala Gly Val Val

1070 1075 1080

Thr Val Thr Val Ala Glu Gly Asp Glu Val Lys Ala Gly Asp Ala

1085 1090 1095

Val Ala Ile Ile Glu Ala Met Lys Met Glu Ala Thr Ile Thr Ala

1100 1105 1110

Ser Val Asp Gly Lys Ile Glu Arg Val Val Val Pro Ala Ala Thr

1115 1120 1125

Lys Val Glu Gly Gly Asp Leu Ile Val Val Val Ser

1130 1135 1140

<210> 242

<211> 3549

<212> DNA

<213> Unknown

<220>

<223> pyc_1 sequence from an unknown bacterial species from an environmental sample

<400> 242

gtgttcagca aggtgctggt cgccaaccgc ggggagatcg cgatccgggc gtttcgtgcc 60

gcctacgaac tgggttgcca gacggtggcg gtgttccctt acgaggatcg caactccgaa 120

caccggctca aggcaaacga ggcctacgag atcggcgaga agggccatcc ggttcgggcc 180

tatctctcgg tggaagagat cgtccgggcg gcgcagcgcg ccggggccga cgccgtctac 240

cccggttacg gcttcctgtc ggagaatccg aagctggcca cggcctgccg gcaggccggt 300

atcactttcg tcgggccgcc acctgcggtg ctggccctgg ccggcaacaa gtcacgggcg 360

gtggccgcgg cgcgcgaagc cagagtgccg gtgctggaat cctgcgcccc gtccgccgat 420

atcgaccagc tgatggccgc ggccgacgag atcgggttcc ccatattcgt caaggccgtc 480

gccggcggcg gtgggcgtgg catgcgcagg gtcacctcgc tgcgtggcct gcgcgatgca 540

ctggaggcgg cctcccgcga ggctgaatcg acgttcgggg acccgaccgt gttcctggag 600

cgggccgtca tcgaacctcg gcacatcgag gcgcaggtcc tcgcagactc gaccggcgag 660

gtgatccacc tctatgagcg agactgctcg gtacagcggc gccaccagaa agtcgtagag 720

atcgcccccg cgcccaacct ggaccctgat ctgcgggcgc gaatctgtgc cgatgcggtg 780

gccttcgccc ggcagatcgg ctacgtcaac gcgggcaccg tggagttcct tgtcgaccgc 840

gcgggcaacc atgtgttcat cgagatgaac ccgcgcatcc aagtcgagca caccgtcacc 900

gaggaaatca ccgatatcga cctggtcgcc tcccaactgc gcatcgcggc gggggagacg 960

ctggccgacc tcgggttgtc ccaagacgcg atcgtgccgc atggcgccgc gctgcaatgc 1020

cggatcacca ccgaggaccc ggccaacgac ttccgtccgg acatcggaac ggtcaccgcg 1080

taccgctccg ccagcggcgc cggagttcgg ctggacggcg gcaccgtata cccgggtgcg 1140

cagataggtc cgcatttcga ctccttgctg gtcaaggtga cgtgccgggg acgggacctg 1200

gggtctgccg tgctgcgggc gcggcgcgcg atcgccgagt tccggatccg cggggtacac 1260

acgaacatcc cgttcctgct cgccctgctc gacgaaccgg atctccaggc gggcaaggtc 1320

accacctcgt tcatcgagca acggccgtac ttgctcacca cgcgccagtc cgccgaccgc 1380

ggcacccggt tgctgaccta cctcggccac atgacggtga atcggccgca cggtgagccc 1440

cccgagctgg tcgacccgat gctcaagctc ccgccgatcg acctggacgc gccaccacct 1500

accggatccc ggcagcagct ccgcgcgctg ggcccggaag gcttcgcgcg ctggttgcgg 1560

acccgcgaca gtgtcggcgt caccgacacc accttccgcg acgcccacca gtcactgctc 1620

gctacccgag tgcgcagcaa ggacctcgtt gcggtggcgc cctacgtgac ccggatgacc 1680

tcgcaactgc tgtcgttgga gtgctggggt ggcgcgacct acgacgtggc gctgcgcttc 1740

ctcgccgagg acccctggga gcgactggcc gcattgcgtg aagcggtccc caacctgtgc 1800

ctgcagatgt tgctgcgcgg gcgcaacacc gtcggctaca cgccttaccc caccgaggtg 1860

actgcggcct tcgtcgagca ggccgtcgag accggcctgg acatttttcg catcttcgac 1920

gcgctcaacg acatctccca gatgcgcccc gccatcgaca cggtgcgcga gaccggccgg 1980

gccatcgccg aggtggcgct gtgctatacc gccgatctgt ccgatccggc ggagaagctg 2040

tacacgctgg actattacct gcggctggcc gaggagatcg tggcggccgg cgcgcacgtg 2100

ctggccatca aggacatggc gggcctgcta cgaccccccg cggcccgcac gctggtgacc 2160

gcgctgcgca gccggttcga tctccccgtg cacctgcata cccacgacac acctggtggg 2220

cagctggcca ccctgctcgc agcgatcgac gccggggtag atgcggtcga tgccgccacc 2280

gcctcgatgg ccggcaccac gtcgcagcca tcgctatccg cgctggtagc ggctactgac 2340

cacaccgagc gcagcaccgg gctgaacctg caggcggtct gtgatctgga gccctactgg 2400

gaactggtgc gcaaggtcta cgcgcccttc gagtccggcc tggcctcgcc caccggtcgg 2460

gtgtatcacc acgagatccc tggtggccag ctatccaacc tgcgccagca agccgtggcg 2520

ttgggcctgg cggacaagtt cgagcagatc gaacaagcct acgcagcggc tgaccggatg 2580

ctcggcaggt tgatcaaggt gaccccttcc tccaaggtcg tgggagatct ggcgctgcac 2640

ctggtcggtg ccggtgtcga gccgacagac ttcgaggccg atccggctcg gttcgacatc 2700

cccgactcgg tgatcggatt cctgcacggc gaactcggtg atcctcccgg cggctggccc 2760

gaacccctgc gcagcaaggc gctcaagggt cgcagcgacc ccaaggggat cgcggagctg 2820

tccgccgagg accgcaaggg ccttcgcgag gaccgggcgc gcaccctcaa ccggctgctg 2880

ttcgctggtc ccaccgcaga cttcgaagag catcgcgagt cctacgggga tacctcggta 2940

ctgcccagca aggagttctt ctacgggctg cgttccggtc aggagcatgc ggtagatctc 3000

gaaccggggg tgcggctgct gatccagctt caggcgatcg gcgatgctga cgaacgcggc 3060

ctgcgcaccg tgatgtgcac cctcaacggg cagttgcggc cactgcagat ccgggatcac 3120

tccatcgact cggagatccc ggttgcagag cgagccaaca aatccgacag caaccatgtt 3180

gcggcgccgt tcgcgggggt ggtcaccctg caggtgtccg aaggggacac cgtgtctgcc 3240

gggcagctgg tggccggtgc ccttctggcg ccacggcttc ttgcccgagc cgctcacctc 3300

cgcacgcgtc ttcgtcgcgt gcgtgccgcg acgctcggcg gcgttggcgt gcaccacgga 3360

ctcgtggatc aggtccgtct tgatgcggcc cccgaacagc tcgtcgctca ggtcgaccga 3420

gccgaccttc tcgttctgct ggttcacgac atctacggtc atggctactt cttgcccttc 3480

ttcttcgccg gctcggcctg tgcgaccttg atgcgcttgg cggccaccgc tttgcggatc 3540

gtcacgtag 3549

<210> 243

<211> 1182

<212> PRT

<213> Unknown

<220>

<223> pyc_1 sequence from an unknown bacterial species from an environmental sample

<400> 243

Val Phe Ser Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Cys Gln Thr Val Ala Val Phe

20 25 30

Pro Tyr Glu Asp Arg Asn Ser Glu His Arg Leu Lys Ala Asn Glu Ala

35 40 45

Tyr Glu Ile Gly Glu Lys Gly His Pro Val Arg Ala Tyr Leu Ser Val

50 55 60

Glu Glu Ile Val Arg Ala Ala Gln Arg Ala Gly Ala Asp Ala Val Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Lys Leu Ala Thr Ala Cys

85 90 95

Arg Gln Ala Gly Ile Thr Phe Val Gly Pro Pro Pro Ala Val Leu Ala

100 105 110

Leu Ala Gly Asn Lys Ser Arg Ala Val Ala Ala Ala Arg Glu Ala Arg

115 120 125

Val Pro Val Leu Glu Ser Cys Ala Pro Ser Ala Asp Ile Asp Gln Leu

130 135 140

Met Ala Ala Ala Asp Glu Ile Gly Phe Pro Ile Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Thr Ser Leu Arg Gly

165 170 175

Leu Arg Asp Ala Leu Glu Ala Ala Ser Arg Glu Ala Glu Ser Thr Phe

180 185 190

Gly Asp Pro Thr Val Phe Leu Glu Arg Ala Val Ile Glu Pro Arg His

195 200 205

Ile Glu Ala Gln Val Leu Ala Asp Ser Thr Gly Glu Val Ile His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu

225 230 235 240

Ile Ala Pro Ala Pro Asn Leu Asp Pro Asp Leu Arg Ala Arg Ile Cys

245 250 255

Ala Asp Ala Val Ala Phe Ala Arg Gln Ile Gly Tyr Val Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Asp Arg Ala Gly Asn His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Ile Thr

290 295 300

Asp Ile Asp Leu Val Ala Ser Gln Leu Arg Ile Ala Ala Gly Glu Thr

305 310 315 320

Leu Ala Asp Leu Gly Leu Ser Gln Asp Ala Ile Val Pro His Gly Ala

325 330 335

Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala Asn Asp Phe Arg

340 345 350

Pro Asp Ile Gly Thr Val Thr Ala Tyr Arg Ser Ala Ser Gly Ala Gly

355 360 365

Val Arg Leu Asp Gly Gly Thr Val Tyr Pro Gly Ala Gln Ile Gly Pro

370 375 380

His Phe Asp Ser Leu Leu Val Lys Val Thr Cys Arg Gly Arg Asp Leu

385 390 395 400

Gly Ser Ala Val Leu Arg Ala Arg Arg Ala Ile Ala Glu Phe Arg Ile

405 410 415

Arg Gly Val His Thr Asn Ile Pro Phe Leu Leu Ala Leu Leu Asp Glu

420 425 430

Pro Asp Leu Gln Ala Gly Lys Val Thr Thr Ser Phe Ile Glu Gln Arg

435 440 445

Pro Tyr Leu Leu Thr Thr Arg Gln Ser Ala Asp Arg Gly Thr Arg Leu

450 455 460

Leu Thr Tyr Leu Gly His Met Thr Val Asn Arg Pro His Gly Glu Pro

465 470 475 480

Pro Glu Leu Val Asp Pro Met Leu Lys Leu Pro Pro Ile Asp Leu Asp

485 490 495

Ala Pro Pro Pro Thr Gly Ser Arg Gln Gln Leu Arg Ala Leu Gly Pro

500 505 510

Glu Gly Phe Ala Arg Trp Leu Arg Thr Arg Asp Ser Val Gly Val Thr

515 520 525

Asp Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Val

530 535 540

Arg Ser Lys Asp Leu Val Ala Val Ala Pro Tyr Val Thr Arg Met Thr

545 550 555 560

Ser Gln Leu Leu Ser Leu Glu Cys Trp Gly Gly Ala Thr Tyr Asp Val

565 570 575

Ala Leu Arg Phe Leu Ala Glu Asp Pro Trp Glu Arg Leu Ala Ala Leu

580 585 590

Arg Glu Ala Val Pro Asn Leu Cys Leu Gln Met Leu Leu Arg Gly Arg

595 600 605

Asn Thr Val Gly Tyr Thr Pro Tyr Pro Thr Glu Val Thr Ala Ala Phe

610 615 620

Val Glu Gln Ala Val Glu Thr Gly Leu Asp Ile Phe Arg Ile Phe Asp

625 630 635 640

Ala Leu Asn Asp Ile Ser Gln Met Arg Pro Ala Ile Asp Thr Val Arg

645 650 655

Glu Thr Gly Arg Ala Ile Ala Glu Val Ala Leu Cys Tyr Thr Ala Asp

660 665 670

Leu Ser Asp Pro Ala Glu Lys Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg

675 680 685

Leu Ala Glu Glu Ile Val Ala Ala Gly Ala His Val Leu Ala Ile Lys

690 695 700

Asp Met Ala Gly Leu Leu Arg Pro Pro Ala Ala Arg Thr Leu Val Thr

705 710 715 720

Ala Leu Arg Ser Arg Phe Asp Leu Pro Val His Leu His Thr His Asp

725 730 735

Thr Pro Gly Gly Gln Leu Ala Thr Leu Leu Ala Ala Ile Asp Ala Gly

740 745 750

Val Asp Ala Val Asp Ala Ala Thr Ala Ser Met Ala Gly Thr Thr Ser

755 760 765

Gln Pro Ser Leu Ser Ala Leu Val Ala Ala Thr Asp His Thr Glu Arg

770 775 780

Ser Thr Gly Leu Asn Leu Gln Ala Val Cys Asp Leu Glu Pro Tyr Trp

785 790 795 800

Glu Leu Val Arg Lys Val Tyr Ala Pro Phe Glu Ser Gly Leu Ala Ser

805 810 815

Pro Thr Gly Arg Val Tyr His His Glu Ile Pro Gly Gly Gln Leu Ser

820 825 830

Asn Leu Arg Gln Gln Ala Val Ala Leu Gly Leu Ala Asp Lys Phe Glu

835 840 845

Gln Ile Glu Gln Ala Tyr Ala Ala Ala Asp Arg Met Leu Gly Arg Leu

850 855 860

Ile Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu His

865 870 875 880

Leu Val Gly Ala Gly Val Glu Pro Thr Asp Phe Glu Ala Asp Pro Ala

885 890 895

Arg Phe Asp Ile Pro Asp Ser Val Ile Gly Phe Leu His Gly Glu Leu

900 905 910

Gly Asp Pro Pro Gly Gly Trp Pro Glu Pro Leu Arg Ser Lys Ala Leu

915 920 925

Lys Gly Arg Ser Asp Pro Lys Gly Ile Ala Glu Leu Ser Ala Glu Asp

930 935 940

Arg Lys Gly Leu Arg Glu Asp Arg Ala Arg Thr Leu Asn Arg Leu Leu

945 950 955 960

Phe Ala Gly Pro Thr Ala Asp Phe Glu Glu His Arg Glu Ser Tyr Gly

965 970 975

Asp Thr Ser Val Leu Pro Ser Lys Glu Phe Phe Tyr Gly Leu Arg Ser

980 985 990

Gly Gln Glu His Ala Val Asp Leu Glu Pro Gly Val Arg Leu Leu Ile

995 1000 1005

Gln Leu Gln Ala Ile Gly Asp Ala Asp Glu Arg Gly Leu Arg Thr

1010 1015 1020

Val Met Cys Thr Leu Asn Gly Gln Leu Arg Pro Leu Gln Ile Arg

1025 1030 1035

Asp His Ser Ile Asp Ser Glu Ile Pro Val Ala Glu Arg Ala Asn

1040 1045 1050

Lys Ser Asp Ser Asn His Val Ala Ala Pro Phe Ala Gly Val Val

1055 1060 1065

Thr Leu Gln Val Ser Glu Gly Asp Thr Val Ser Ala Gly Gln Leu

1070 1075 1080

Val Ala Gly Ala Leu Leu Ala Pro Arg Leu Leu Ala Arg Ala Ala

1085 1090 1095

His Leu Arg Thr Arg Leu Arg Arg Val Arg Ala Ala Thr Leu Gly

1100 1105 1110

Gly Val Gly Val His His Gly Leu Val Asp Gln Val Arg Leu Asp

1115 1120 1125

Ala Ala Pro Glu Gln Leu Val Ala Gln Val Asp Arg Ala Asp Leu

1130 1135 1140

Leu Val Leu Leu Val His Asp Ile Tyr Gly His Gly Tyr Phe Leu

1145 1150 1155

Pro Phe Phe Phe Ala Gly Ser Ala Cys Ala Thr Leu Met Arg Leu

1160 1165 1170

Ala Ala Thr Ala Leu Arg Ile Val Thr

1175 1180

<210> 244

<211> 3405

<212> DNA

<213> Unknown

<220>

<223> pyc_2 sequence from an unknown bacterial species from an environmental sample

<400> 244

gtgttccaga agattctcgt ggccaaccgc ggtgagatcg cgatccgcgc gttccgcgcc 60

gcgtacgagc tcggcgtgcg caccgtcgct gtcttcccct acgaggaccg cggctccacc 120

caccgcatga cggcggacga ggcgtaccag atcggcgagc cgggacaccc cgtgcgcgcc 180

tacctcgatg tcgacgagat catccgcgtg gcgaaggagt gcggcgccga cgccatctat 240

cccggctacg ggttcctctc ggagaacccg gcgctcgccg aggcggcgca ggaggcgggc 300

atcacgttcg tcgggccgcc cgcccgcgtg ctcgagatag ccgggaacaa ggtcaccgcg 360

aaggagcgcg cgatcgccgc cggggtgccg gtgctggcgt cgacgccggc gtcgcgcgac 420

ctcgacgagc tcgtgcgcgc ggccgacgac ctcggcttcc cggtgttcgc caaggcggtc 480

gccggcggag gggggcgcgg catgcgccgc gtcgacacgc gcgaagagct gccggccgcc 540

ctcgaggagg ccatgcgcga ggcggagacc gcgttcggcg accccacgat gttcctcgag 600

caggccttcc cgcagccccg gcacatcgag gtgcagatcc tcgcggacgg ccacggcgac 660

gtggtgcacc tcttcgagcg cgactgctcc gtgcagcggc gccaccagaa ggtgatcgag 720

atcgcgcccg cgcccaacgt cgaccagccg ctgcgcgagg cgctctaccg cgacgccgtc 780

gcgttcgcgc gatccatcgg ctacgtgaac gccggcacgg tcgagttcct cgtggacacc 840

gccggcgagc gcgccgggca gcacgtgttc atcgagatga acccacgcat ccaggtggag 900

cacacggtca ccgaggaggt gacggacgtc gacctcgtgc aggcgcagat gcgcatcgcg 960

gccggcgagc ggctgagcga cctcggcatc cggcaggaga gcctccaact gcgaggcgcc 1020

gcgatgcagt gccgcatcac gaccgaggat ccgatgaacg gattccgccc ggacgtcggc 1080

cgcatcacaa cgtaccgctc gcccggcggc gccggcatcc gcctggacgg cggcacgatc 1140

aacctcggca gcgagatcgg accgtacttc gactcgatgc tcgtgaagct cacgtcgcgc 1200

gcgggcgact tccccgccgc ggtgagccgc gcgcggcgcg ccctcgcgga gttccgcatc 1260

cgaggcgtat cgacgaacat tccgttcctg caggcggtcg tcgccgaccc ggacttcatc 1320

gccggcaact tcacgacctc gttcatcgac gagcggccgt acctgctcaa cgcgaaccgc 1380

tcgaacgacc gcggcaccaa ggtgctgagc tggctcgccg acgtcacggt gaaccagccg 1440

cacggacgcc gcggcaagat cgtcagtccg tggcacaagc tgcccgccgt ggacatcgag 1500

gcgcccgcac cgccgggctc gcgcgaccgg ctgcgggcgc tcggcccggc ggcgttcgcg 1560

cgttcgctgc gggagcagac gccgctcgcc gtgacggaga cgacgttccg cgatgcgcac 1620

cagtcgctgc tggcgacgcg cgtgcgcacg cgcgacctcg tgagtgtggc gccgtacgtg 1680

gcgcgcacga cgccgcagtt gctctcgatc gaggcgtggg gcggcgcgac gtacgacgcg 1740

tcgctgcgct tcctgggcga ggacccgtgg gagcggctgg cggcgctgcg cgaggcgctg 1800

cccaacgtga acatccagat gctgctgcgc ggccgcaaca ccgtcggcta cacgccctac 1860

cccgaggagg tgaccgacgc cttcgtgcgg gaggcggcga ccaccggcgt cgacatcttc 1920

cgcatcttcg acgcgctgaa cgacgtctcg cagatgcgcc cggcgatcga gtcggtgctc 1980

gcgacgggta ccggcatcgc cgaggtcgcg ttctgctaca cgggcgacct gctggacccc 2040

aacgagacgc tgtacacgct cgactactac ctgaagctcg ccgaggagat cgtcggcacc 2100

ggcgcgcaca tcctcgcgat caaggacatg gccggactgc tccgcccgcg cgccgccgag 2160

gtgctcgtgc gcgcgctgcg cgagcgcttc gacctgccgg tccacctgca cacgcacgac 2220

accccgggcg gccagctcgc gacgctgctc gcggcgagcg gcgccggcgt cgacgccgtc 2280

gacgtcgcga gcgcgccgat ggccggcacc acgagccagc cgccgatctc ggcgctggtc 2340

gcggcgctcg cgcacaccga ccgcgacacg ggcctctcgc tgcaggccgt gtgcgacctc 2400

gagccgtact gggaggccgt gcgtacggcg taccggccgt tcgagtcggg tctgtccgcg 2460

cccaccggcc gcgtctacaa gcacgagatc ccgggcgggc agctctccaa cctccgccag 2520

caggcgatca gcatgggcat gggcgagcag ttcgagaagg tcgaggactg gtacgccgcc 2580

gcgaacgaga tcctcggccg gccgccgaag gtcacgccgt cctcgaaggc cgtgggcgac 2640

ctcgcgatct acctcgcggc ggtgaacgcg gaccgcgcgg acttcgaggc gcacccggag 2700

cgctacgaca tcccggagtc ggtgatcggc ttcatggccg gcgagctcgg cgacctgccg 2760

ggcggctggc ccgagccatt ccgcagcaag atcctcgagg gccggcacgt cgacatcgcg 2820

gtcacgccga tcagcgacgc cgaccgcgag gcgctcgagg gcgacaccgc ctcgcggcgg 2880

caggtgctga accggctgct cttcccggac gcgctcgcgg tgttccagga ggtgagcgac 2940

cagtacggcg acctgtcggt ggtcgacacc gtggactacc tctacggcct cgaccgcgcg 3000

accgagcacc tcgtgcacat cagcaagggc gtcacgctct acatcggcct cgaggcgatc 3060

ggcgaggtgg acgagcgcgg catccgcacg gtgatgacga ccctcaacgg ccagctgcgg 3120

ccggtgtacg tgcacgaccg cagcgtctcc gccgccatcc tcggcgcgga gaaggcagac 3180

ctgagccagc ccggtcacgt ggccgcgccc ttctcgggct tcgtgacggt gcaggtgcac 3240

gtcggcgaca ccgtcaccgc cggccagacg gtcgcgacca tcgaggcgat gaagatggag 3300

gccgcgatca cggccgccgt cggcggcgtg gtccgccgcg tcgtgatcac ggagacgcgc 3360

caggtgaacg gcggcgacct gctgatgctg atcgagccgg tctga 3405

<210> 245

<211> 1134

<212> PRT

<213> Unknown

<220>

<223> pyc_2 sequence from an unknown bacterial species from an environmental sample

<400> 245

Val Phe Gln Lys Ile Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Val Arg Thr Val Ala Val Phe

20 25 30

Pro Tyr Glu Asp Arg Gly Ser Thr His Arg Met Thr Ala Asp Glu Ala

35 40 45

Tyr Gln Ile Gly Glu Pro Gly His Pro Val Arg Ala Tyr Leu Asp Val

50 55 60

Asp Glu Ile Ile Arg Val Ala Lys Glu Cys Gly Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Ala Leu Ala Glu Ala Ala

85 90 95

Gln Glu Ala Gly Ile Thr Phe Val Gly Pro Pro Ala Arg Val Leu Glu

100 105 110

Ile Ala Gly Asn Lys Val Thr Ala Lys Glu Arg Ala Ile Ala Ala Gly

115 120 125

Val Pro Val Leu Ala Ser Thr Pro Ala Ser Arg Asp Leu Asp Glu Leu

130 135 140

Val Arg Ala Ala Asp Asp Leu Gly Phe Pro Val Phe Ala Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Asp Thr Arg Glu Glu

165 170 175

Leu Pro Ala Ala Leu Glu Glu Ala Met Arg Glu Ala Glu Thr Ala Phe

180 185 190

Gly Asp Pro Thr Met Phe Leu Glu Gln Ala Phe Pro Gln Pro Arg His

195 200 205

Ile Glu Val Gln Ile Leu Ala Asp Gly His Gly Asp Val Val His Leu

210 215 220

Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Ile Ala Pro Ala Pro Asn Val Asp Gln Pro Leu Arg Glu Ala Leu Tyr

245 250 255

Arg Asp Ala Val Ala Phe Ala Arg Ser Ile Gly Tyr Val Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Asp Thr Ala Gly Glu Arg Ala Gly Gln His

275 280 285

Val Phe Ile Glu Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr

290 295 300

Glu Glu Val Thr Asp Val Asp Leu Val Gln Ala Gln Met Arg Ile Ala

305 310 315 320

Ala Gly Glu Arg Leu Ser Asp Leu Gly Ile Arg Gln Glu Ser Leu Gln

325 330 335

Leu Arg Gly Ala Ala Met Gln Cys Arg Ile Thr Thr Glu Asp Pro Met

340 345 350

Asn Gly Phe Arg Pro Asp Val Gly Arg Ile Thr Thr Tyr Arg Ser Pro

355 360 365

Gly Gly Ala Gly Ile Arg Leu Asp Gly Gly Thr Ile Asn Leu Gly Ser

370 375 380

Glu Ile Gly Pro Tyr Phe Asp Ser Met Leu Val Lys Leu Thr Ser Arg

385 390 395 400

Ala Gly Asp Phe Pro Ala Ala Val Ser Arg Ala Arg Arg Ala Leu Ala

405 410 415

Glu Phe Arg Ile Arg Gly Val Ser Thr Asn Ile Pro Phe Leu Gln Ala

420 425 430

Val Val Ala Asp Pro Asp Phe Ile Ala Gly Asn Phe Thr Thr Ser Phe

435 440 445

Ile Asp Glu Arg Pro Tyr Leu Leu Asn Ala Asn Arg Ser Asn Asp Arg

450 455 460

Gly Thr Lys Val Leu Ser Trp Leu Ala Asp Val Thr Val Asn Gln Pro

465 470 475 480

His Gly Arg Arg Gly Lys Ile Val Ser Pro Trp His Lys Leu Pro Ala

485 490 495

Val Asp Ile Glu Ala Pro Ala Pro Pro Gly Ser Arg Asp Arg Leu Arg

500 505 510

Ala Leu Gly Pro Ala Ala Phe Ala Arg Ser Leu Arg Glu Gln Thr Pro

515 520 525

Leu Ala Val Thr Glu Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu

530 535 540

Ala Thr Arg Val Arg Thr Arg Asp Leu Val Ser Val Ala Pro Tyr Val

545 550 555 560

Ala Arg Thr Thr Pro Gln Leu Leu Ser Ile Glu Ala Trp Gly Gly Ala

565 570 575

Thr Tyr Asp Ala Ser Leu Arg Phe Leu Gly Glu Asp Pro Trp Glu Arg

580 585 590

Leu Ala Ala Leu Arg Glu Ala Leu Pro Asn Val Asn Ile Gln Met Leu

595 600 605

Leu Arg Gly Arg Asn Thr Val Gly Tyr Thr Pro Tyr Pro Glu Glu Val

610 615 620

Thr Asp Ala Phe Val Arg Glu Ala Ala Thr Thr Gly Val Asp Ile Phe

625 630 635 640

Arg Ile Phe Asp Ala Leu Asn Asp Val Ser Gln Met Arg Pro Ala Ile

645 650 655

Glu Ser Val Leu Ala Thr Gly Thr Gly Ile Ala Glu Val Ala Phe Cys

660 665 670

Tyr Thr Gly Asp Leu Leu Asp Pro Asn Glu Thr Leu Tyr Thr Leu Asp

675 680 685

Tyr Tyr Leu Lys Leu Ala Glu Glu Ile Val Gly Thr Gly Ala His Ile

690 695 700

Leu Ala Ile Lys Asp Met Ala Gly Leu Leu Arg Pro Arg Ala Ala Glu

705 710 715 720

Val Leu Val Arg Ala Leu Arg Glu Arg Phe Asp Leu Pro Val His Leu

725 730 735

His Thr His Asp Thr Pro Gly Gly Gln Leu Ala Thr Leu Leu Ala Ala

740 745 750

Ser Gly Ala Gly Val Asp Ala Val Asp Val Ala Ser Ala Pro Met Ala

755 760 765

Gly Thr Thr Ser Gln Pro Pro Ile Ser Ala Leu Val Ala Ala Leu Ala

770 775 780

His Thr Asp Arg Asp Thr Gly Leu Ser Leu Gln Ala Val Cys Asp Leu

785 790 795 800

Glu Pro Tyr Trp Glu Ala Val Arg Thr Ala Tyr Arg Pro Phe Glu Ser

805 810 815

Gly Leu Ser Ala Pro Thr Gly Arg Val Tyr Lys His Glu Ile Pro Gly

820 825 830

Gly Gln Leu Ser Asn Leu Arg Gln Gln Ala Ile Ser Met Gly Met Gly

835 840 845

Glu Gln Phe Glu Lys Val Glu Asp Trp Tyr Ala Ala Ala Asn Glu Ile

850 855 860

Leu Gly Arg Pro Pro Lys Val Thr Pro Ser Ser Lys Ala Val Gly Asp

865 870 875 880

Leu Ala Ile Tyr Leu Ala Ala Val Asn Ala Asp Arg Ala Asp Phe Glu

885 890 895

Ala His Pro Glu Arg Tyr Asp Ile Pro Glu Ser Val Ile Gly Phe Met

900 905 910

Ala Gly Glu Leu Gly Asp Leu Pro Gly Gly Trp Pro Glu Pro Phe Arg

915 920 925

Ser Lys Ile Leu Glu Gly Arg His Val Asp Ile Ala Val Thr Pro Ile

930 935 940

Ser Asp Ala Asp Arg Glu Ala Leu Glu Gly Asp Thr Ala Ser Arg Arg

945 950 955 960

Gln Val Leu Asn Arg Leu Leu Phe Pro Asp Ala Leu Ala Val Phe Gln

965 970 975

Glu Val Ser Asp Gln Tyr Gly Asp Leu Ser Val Val Asp Thr Val Asp

980 985 990

Tyr Leu Tyr Gly Leu Asp Arg Ala Thr Glu His Leu Val His Ile Ser

995 1000 1005

Lys Gly Val Thr Leu Tyr Ile Gly Leu Glu Ala Ile Gly Glu Val

1010 1015 1020

Asp Glu Arg Gly Ile Arg Thr Val Met Thr Thr Leu Asn Gly Gln

1025 1030 1035

Leu Arg Pro Val Tyr Val His Asp Arg Ser Val Ser Ala Ala Ile

1040 1045 1050

Leu Gly Ala Glu Lys Ala Asp Leu Ser Gln Pro Gly His Val Ala

1055 1060 1065

Ala Pro Phe Ser Gly Phe Val Thr Val Gln Val His Val Gly Asp

1070 1075 1080

Thr Val Thr Ala Gly Gln Thr Val Ala Thr Ile Glu Ala Met Lys

1085 1090 1095

Met Glu Ala Ala Ile Thr Ala Ala Val Gly Gly Val Val Arg Arg

1100 1105 1110

Val Val Ile Thr Glu Thr Arg Gln Val Asn Gly Gly Asp Leu Leu

1115 1120 1125

Met Leu Ile Glu Pro Val

1130

<210> 246

<211> 3408

<212> DNA

<213> Unknown

<220>

<223> pyc_3 sequence from an unknown bacterial species from an environmental sample

<400> 246

atgtttggca aggtactggt cgctaatcgt ggcgagatcg cggtccgagc ctttcgcgcg 60

gcgtacgagc tgggcgtgag gaccgtcgcg gtgttcgcct acgaggaccg aaacgcggtc 120

caccggatca aggcggatga ggcgtacttg atcggtgagc ggggtcaccc ggtacgcgcc 180

tatctcgata tcaacgagat catgcgggct gccaagcagt ctgaggcaga tgcgatctat 240

cccggctacg gcttcctgag cgaaaatccc gaccttgccc gggcgtgtga agacgccggc 300

ataaccttca tcggtccgcc tgccaaagtg ctggagcttg ccgggaacaa ggtccacgcc 360

atcgaggcag ccaaggccgc aggcgtgcca accctcacct caacgccacc gtcggccaac 420

atcgaggagc tgatggcaag tgccgaaagc atcggctttc cggcgttcgt taaggcggtt 480

gccggtggcg gtggccgcgg catgcggcgg atcgcggacc gcgatcagct cagagaatcg 540

ctatctgccg cgatgcgcga ggccgaaggc gcgttcggcg acccgacggc atacatcgag 600

caggcggtag gccggccgcg gcacatcgag gtgcaggtac tcgctgacag tcagggcgac 660

accatccatc tgttcgagcg cgactgttcg gtgcagcgac ggcaccaaaa gatcattgag 720

attgcaccgg ctccgcacat ctcgactgag ttgcgcgagg cgttgtgtcg tgatgcggtg 780

cggttcgctg aatcgatcga ttactcctgc gccggcactg tcgagtttct ggttgagacc 840

gagggtgagc gggccggtca gcacgtcttc atcgagatga atcctcgaat ccaggtggag 900

cacaccatta ctgaggagat caccgatgtt gatcttgtgc aggcccagat gcggattgct 960

gcaggggaaa gccttactga tcttggtctg tcacaggaag cgattcggat caacggggcc 1020

gcgctgcaat gccgaatcac caccgaggat cctgcgaacg atttccggcc cgatacgggc 1080

accattaccg cctatcgttc cgcgggcggt gcgggcgtgc ggatcgacgg tggcacggtc 1140

gacatcggcg ttgaaatcag cgcatatttc gactccctgc tggtcaagct catctgtcac 1200

gggtgggatt tccaggcggc agtgacccga gcccggcggg cgcttgccga gttccggatc 1260

cgcggcgtaa gcaccaacat tccttttctg caagccgttc tggccgatcc caagttcaga 1320

gcaggtgacg tctcgacatc ctttatcgag gagcggcccg acctgctgac cgcgcacgcg 1380

cccgccgacc ggggcaccaa attgctgcgc tggctggcgg aggtaacggt caaccagccg 1440

catggaccag caccgactca actcgatcca ggcctgaagc gacctaccgg catcgatctc 1500

accatcccat cccccaccgg ctcacggcag cggcttcttg atcttggtcc agaggctttc 1560

gccgaggatc tgcggcaacg ggtgccgatc gaggtaactg acacaacctt ccgcgacgct 1620

caccagtccc tgcttgcgac tcgggttcgt accaaggatt tgatacgtat tgcgccctat 1680

gtgggacgca tgacaccgca gctgctctcg gtcgagtgct ggggcggggc aacctacgac 1740

gtggcgctgc gcttcatcgc cgaggacccg tgggaacgcc tcgccgcgct gcgctacaac 1800

atgccaggtc tgtgcctgca aatgctctta cgcggacgca acacggtcgg ttatacgccg 1860

tacccaacga aggtcaccac ttcattcgtg gccgaagctg cccaagtcgg tatcgacatc 1920

ttccgcatct tcgatgcgct caacgacgtc gagcagatgc ggccagcgat cgaggcggta 1980

cgagagaccg gcagcaccat cgccgaggta gccctgtgct acaccggcga cctcaactca 2040

ccggccgagg acctgtacac cctggactac tacctgcgac tggccgagaa gatggtcaag 2100

gccggtgctc acatcatcgg gatcaaggac atggctgggc tgctccggcc gcctgcggct 2160

cggaagttgg tcaccgcact acggcagaac ttcgacctcc cggtgcatct gcacacccac 2220

gacaccgccg gtggtcagct ggcgacactg ctcgccgcca tcgaggttgg agtcgatgca 2280

gtcgatgtgg ccagcgcccc gatggccgga accaccagtc aggtgcccgc ttcagcactg 2340

gtcgcggcct gcgcgaatac tgagcggccc accaaccttg acctgcgaga cgtgatggag 2400

cttgagccgt actgggaggc ggttcggaag gtgtatgcgc cgttcgagtc aggactgccg 2460

agcccaaccg gccgggtcta tgaccacgag atccccggtg gccagctctc caatctccgg 2520

cagcaggcaa ttgctctggg actgggggag aagtttgagc aaattgaggc tatgtatacc 2580

gccgcgagcc gcatcttggg taggccgccc aaggtcacac cgtcctcgaa agtcgttggt 2640

gatcttgctt tacacctagt tgcggttgga gccgatccgg acgacttcgc ccagaaccct 2700

cacaagtacg acatcccgga ttcagtgatc ggtttcctca acggtgagct aggtgatccg 2760

cccggcggct ggccggagcc gttccgcacc aaggcgctac aggggcgtac cgtgccggta 2820

cgcgacatcg agctttcgcc cgaggactca gccaaccttg acgacaaagg tttggtgcga 2880

cagagcacgc tgaaccgctt gctgttcccc gggccgacca aggagttcct ggccaaccgg 2940

gaaacctacg gcgatgtggg tcggcttaac accctggact tcctctacgg gttacagccc 3000

ggtcaggagc accttgccaa gatcggtaag ggtgtcagcc tgatactcgg gctttcggcg 3060

atcggcaacg ccgacgaacg gggcatgcgt accgttatgt gcacgatcaa cggacagctg 3120

cggcccatcc gagttcgcga caagtcgatc aaggtcgatg ttaagactgc cgagcgtgcg 3180

gatccgaaca atccgggtca tgtggcggca ccgttcgccg gcgtggtcac cgttacggtg 3240

cgtgagggcg atcaggtcca ggctggtgcc accgttgcca cgatcgaggc gatgaagatg 3300

gaagccgcga ttactacgcc ggtgtcaggc gtggtacagc ggctggcact cgctgacgtg 3360

cagcaagttg agggcggtga cctcgtcctg gtggttgccg ctgcctaa 3408

<210> 247

<211> 1135

<212> PRT

<213> Unknown

<220>

<223> pyc_3 sequence from an unknown bacterial species from an environmental sample

<400> 247

Met Phe Gly Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Val Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Val Arg Thr Val Ala Val Phe

20 25 30

Ala Tyr Glu Asp Arg Asn Ala Val His Arg Ile Lys Ala Asp Glu Ala

35 40 45

Tyr Leu Ile Gly Glu Arg Gly His Pro Val Arg Ala Tyr Leu Asp Ile

50 55 60

Asn Glu Ile Met Arg Ala Ala Lys Gln Ser Glu Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Asp Leu Ala Arg Ala Cys

85 90 95

Glu Asp Ala Gly Ile Thr Phe Ile Gly Pro Pro Ala Lys Val Leu Glu

100 105 110

Leu Ala Gly Asn Lys Val His Ala Ile Glu Ala Ala Lys Ala Ala Gly

115 120 125

Val Pro Thr Leu Thr Ser Thr Pro Pro Ser Ala Asn Ile Glu Glu Leu

130 135 140

Met Ala Ser Ala Glu Ser Ile Gly Phe Pro Ala Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Ile Ala Asp Arg Asp Gln

165 170 175

Leu Arg Glu Ser Leu Ser Ala Ala Met Arg Glu Ala Glu Gly Ala Phe

180 185 190

Gly Asp Pro Thr Ala Tyr Ile Glu Gln Ala Val Gly Arg Pro Arg His

195 200 205

Ile Glu Val Gln Val Leu Ala Asp Ser Gln Gly Asp Thr Ile His Leu

210 215 220

Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile Glu

225 230 235 240

Ile Ala Pro Ala Pro His Ile Ser Thr Glu Leu Arg Glu Ala Leu Cys

245 250 255

Arg Asp Ala Val Arg Phe Ala Glu Ser Ile Asp Tyr Ser Cys Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Glu Thr Glu Gly Glu Arg Ala Gly Gln His

275 280 285

Val Phe Ile Glu Met Asn Pro Arg Ile Gln Val Glu His Thr Ile Thr

290 295 300

Glu Glu Ile Thr Asp Val Asp Leu Val Gln Ala Gln Met Arg Ile Ala

305 310 315 320

Ala Gly Glu Ser Leu Thr Asp Leu Gly Leu Ser Gln Glu Ala Ile Arg

325 330 335

Ile Asn Gly Ala Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala

340 345 350

Asn Asp Phe Arg Pro Asp Thr Gly Thr Ile Thr Ala Tyr Arg Ser Ala

355 360 365

Gly Gly Ala Gly Val Arg Ile Asp Gly Gly Thr Val Asp Ile Gly Val

370 375 380

Glu Ile Ser Ala Tyr Phe Asp Ser Leu Leu Val Lys Leu Ile Cys His

385 390 395 400

Gly Trp Asp Phe Gln Ala Ala Val Thr Arg Ala Arg Arg Ala Leu Ala

405 410 415

Glu Phe Arg Ile Arg Gly Val Ser Thr Asn Ile Pro Phe Leu Gln Ala

420 425 430

Val Leu Ala Asp Pro Lys Phe Arg Ala Gly Asp Val Ser Thr Ser Phe

435 440 445

Ile Glu Glu Arg Pro Asp Leu Leu Thr Ala His Ala Pro Ala Asp Arg

450 455 460

Gly Thr Lys Leu Leu Arg Trp Leu Ala Glu Val Thr Val Asn Gln Pro

465 470 475 480

His Gly Pro Ala Pro Thr Gln Leu Asp Pro Gly Leu Lys Arg Pro Thr

485 490 495

Gly Ile Asp Leu Thr Ile Pro Ser Pro Thr Gly Ser Arg Gln Arg Leu

500 505 510

Leu Asp Leu Gly Pro Glu Ala Phe Ala Glu Asp Leu Arg Gln Arg Val

515 520 525

Pro Ile Glu Val Thr Asp Thr Thr Phe Arg Asp Ala His Gln Ser Leu

530 535 540

Leu Ala Thr Arg Val Arg Thr Lys Asp Leu Ile Arg Ile Ala Pro Tyr

545 550 555 560

Val Gly Arg Met Thr Pro Gln Leu Leu Ser Val Glu Cys Trp Gly Gly

565 570 575

Ala Thr Tyr Asp Val Ala Leu Arg Phe Ile Ala Glu Asp Pro Trp Glu

580 585 590

Arg Leu Ala Ala Leu Arg Tyr Asn Met Pro Gly Leu Cys Leu Gln Met

595 600 605

Leu Leu Arg Gly Arg Asn Thr Val Gly Tyr Thr Pro Tyr Pro Thr Lys

610 615 620

Val Thr Thr Ser Phe Val Ala Glu Ala Ala Gln Val Gly Ile Asp Ile

625 630 635 640

Phe Arg Ile Phe Asp Ala Leu Asn Asp Val Glu Gln Met Arg Pro Ala

645 650 655

Ile Glu Ala Val Arg Glu Thr Gly Ser Thr Ile Ala Glu Val Ala Leu

660 665 670

Cys Tyr Thr Gly Asp Leu Asn Ser Pro Ala Glu Asp Leu Tyr Thr Leu

675 680 685

Asp Tyr Tyr Leu Arg Leu Ala Glu Lys Met Val Lys Ala Gly Ala His

690 695 700

Ile Ile Gly Ile Lys Asp Met Ala Gly Leu Leu Arg Pro Pro Ala Ala

705 710 715 720

Arg Lys Leu Val Thr Ala Leu Arg Gln Asn Phe Asp Leu Pro Val His

725 730 735

Leu His Thr His Asp Thr Ala Gly Gly Gln Leu Ala Thr Leu Leu Ala

740 745 750

Ala Ile Glu Val Gly Val Asp Ala Val Asp Val Ala Ser Ala Pro Met

755 760 765

Ala Gly Thr Thr Ser Gln Val Pro Ala Ser Ala Leu Val Ala Ala Cys

770 775 780

Ala Asn Thr Glu Arg Pro Thr Asn Leu Asp Leu Arg Asp Val Met Glu

785 790 795 800

Leu Glu Pro Tyr Trp Glu Ala Val Arg Lys Val Tyr Ala Pro Phe Glu

805 810 815

Ser Gly Leu Pro Ser Pro Thr Gly Arg Val Tyr Asp His Glu Ile Pro

820 825 830

Gly Gly Gln Leu Ser Asn Leu Arg Gln Gln Ala Ile Ala Leu Gly Leu

835 840 845

Gly Glu Lys Phe Glu Gln Ile Glu Ala Met Tyr Thr Ala Ala Ser Arg

850 855 860

Ile Leu Gly Arg Pro Pro Lys Val Thr Pro Ser Ser Lys Val Val Gly

865 870 875 880

Asp Leu Ala Leu His Leu Val Ala Val Gly Ala Asp Pro Asp Asp Phe

885 890 895

Ala Gln Asn Pro His Lys Tyr Asp Ile Pro Asp Ser Val Ile Gly Phe

900 905 910

Leu Asn Gly Glu Leu Gly Asp Pro Pro Gly Gly Trp Pro Glu Pro Phe

915 920 925

Arg Thr Lys Ala Leu Gln Gly Arg Thr Val Pro Val Arg Asp Ile Glu

930 935 940

Leu Ser Pro Glu Asp Ser Ala Asn Leu Asp Asp Lys Gly Leu Val Arg

945 950 955 960

Gln Ser Thr Leu Asn Arg Leu Leu Phe Pro Gly Pro Thr Lys Glu Phe

965 970 975

Leu Ala Asn Arg Glu Thr Tyr Gly Asp Val Gly Arg Leu Asn Thr Leu

980 985 990

Asp Phe Leu Tyr Gly Leu Gln Pro Gly Gln Glu His Leu Ala Lys Ile

995 1000 1005

Gly Lys Gly Val Ser Leu Ile Leu Gly Leu Ser Ala Ile Gly Asn

1010 1015 1020

Ala Asp Glu Arg Gly Met Arg Thr Val Met Cys Thr Ile Asn Gly

1025 1030 1035

Gln Leu Arg Pro Ile Arg Val Arg Asp Lys Ser Ile Lys Val Asp

1040 1045 1050

Val Lys Thr Ala Glu Arg Ala Asp Pro Asn Asn Pro Gly His Val

1055 1060 1065

Ala Ala Pro Phe Ala Gly Val Val Thr Val Thr Val Arg Glu Gly

1070 1075 1080

Asp Gln Val Gln Ala Gly Ala Thr Val Ala Thr Ile Glu Ala Met

1085 1090 1095

Lys Met Glu Ala Ala Ile Thr Thr Pro Val Ser Gly Val Val Gln

1100 1105 1110

Arg Leu Ala Leu Ala Asp Val Gln Gln Val Glu Gly Gly Asp Leu

1115 1120 1125

Val Leu Val Val Ala Ala Ala

1130 1135

<210> 248

<211> 3486

<212> DNA

<213> Unknown

<220>

<223> pyc_4 sequence from an unknown bacterial species from an environmental sample

<400> 248

ttggggaccc tcctggacgg gcagccggac aggcgggagg gtccgccgtc ccatgagaag 60

gtgtcccccg gagtgacgac ccgaggggga cacatgttcg gcaaggtcct ggtcgccaac 120

cgaggcgaga tcgcgatccg cgcgttccgc gccgcctacg agatgggcgc gcagaccgtc 180

gcggtgttcc cctacgagga ccgcaactcc gagcaccggc tcaaggccga cgaggcctac 240

cagatcggcg agctgggccg accggtacgc gcctacctcg acgtcgacgc gatcgtgcgt 300

acggcggtcc gcgccggcgc cgacgcggtc taccccggat acggcttcct gtccgagaac 360

ccccagctcg cggaggcctg tgccgccgcg gggatcgcct tcatcggtcc cagcgccgag 420

gtgctcgagc tcaccgggaa caaggcccgc gcgatcgcgg cggcgcgcaa ggccggagtg 480

ccgacgctca gcagcgtcgc ccccgggacc gaccccgcag cgctagtcga ggctgcccga 540

gagctcgcct tcccgctgtt cgtcaaggcg gtcgccggtg gcggcggtcg cggcatgcgg 600

cgcgtggacg accccgcggt cctcgaggag gccgtgcgga cctgtatgcg cgaggccgac 660

agtgccttcg gcgacccgac ggtcttcatc gagcaggccg tcgtcgaccc gcgccacatc 720

gaggtccaga tcctcgccga cgggcagggc gaggtccttc acctgttcga gcgcgactgc 780

tcggtgcagc gacgccacca gaaggtcgtc gagatcgcgc cggcgcccaa cctcgacccc 840

gggctgcgcg accggatgtg cgccgacgcg gtgcggttcg cgcgcgagat cggctacacg 900

aatgccggca ccgtcgagtt cctcctggac ccgcagggcc gctacgtgtt catcgagatg 960

aacccccgca tccaggtcga gcacaccgtg accgaggagg tcaccgacgt cgacctggtc 1020

cgcagccaga tgcgcatcgc gtcgggggag acgctggccg acctcgggct cacccaggag 1080

gacatccggc tccgcggcgc cgcgctgcag tgccggatca ccaccgagga cccggcgaac 1140

ggcttccgcc ccgacacggg cgtcatcacg acgtatcgct ccccgggcgg cgccggtatc 1200

cggctcgacg gcgggacgac ctacaccggc gcggagatct ccggtcactt cgactcgatg 1260

ctcgccaagc tcacctgccg cggccgcgac ttcaccaccg cggtcgagcg ctcgcgccgg 1320

gcggtggcgg agttccggat ccgcggcgtc gccacgaaca tccccttcct ccaggccgtc 1380

ctcgacgacc cggacttcgc ccgcggcggg gtgaccaccg gcttcatcga ggagcggccc 1440

cacctgctca ccgcgcgctc gagcgccgac cggggcacca agctgctcaa ctacctcgcc 1500

gacgtcaccg tcaaccagcc gtacggcgcg cttcaggtcg gggtcgaccc ccgggcgaag 1560

ctcccgcccg tcgacctcgc cgcaacgcca ccctccggaa cccgtcaact cctctgcgac 1620

gtcgggcctg aggagttcgc tcgccggctg cgccgtcaga cccgggtcgc cgtcaccgac 1680

accacgttcc gcgacgccca ccagtcgctg ctcgccaccc gcatccgcac ccgcgacctg 1740

ctgggggtcg ccggccacgt cgcccggacc accccagagc tctggtcgat cgaggcgtgg 1800

ggcggagcca cctacgacgt cgcgctgcgc ttcctctccg aggacccgtg ggaccggctc 1860

gcgcggctgc gtcgagcggt gcccaacatc tgcctgcaga tgcttctccg cggtcgcaac 1920

accgtgggct acacgccgta cccgaccgag gtgaccgacg ccttcgtcga ggaggcggcc 1980

gccaccggga tcgacgtctt ccgggtcttc gacgcgctca acgacgtgga gcagatgcgc 2040

ccggcggtcg aggccgtccg gaggaccggg accgcggtgg ccgaggtcgc gctgtgctac 2100

accggcgacc tctccgaccc ggccgagcgg ctctacaccc tcgactacta cctgcggctc 2160

gccgagcgca tcgtcgaggc cggtgcccac gtgctggcga tcaaggacat ggccgggctg 2220

ctccgggcgc cggccgccca ccggctggtg accgcgctgc gcgagcgttt cgacctgccg 2280

gtgcacctcc acacccacga cacccctggc ggccagctgg cgacgctgct cgccgcgatc 2340

gacgcggggg tggacgcggt cgacgccgcg agcgccgcga tggcggggac gaccagccag 2400

ccggcactgt ccgccctggt cgccgccaca gaccaccccg tgacggaggg ccgcgacacc 2460

gggctggacc tccgcgccgt ctgcgacctc gagccctact gggaggccac gagacgggtc 2520

tacgcgccgt tcgagtcggg gctgccctcg cccactggtc gggtctacac ccacgagatc 2580

cccggcgggc agctctcgaa cctccggcag caggcgatcg cgctcgggct gggggagaag 2640

ttcgagcaga tcgaggacat gtacgccgcg gcgaaccgga tcctcgggaa catcgtgaag 2700

gtgaccccgt cctccaaggt cgtcggtgac ctggcgctcc acctcgtggc ggtcgacgcc 2760

gaccccacgg cgttcgccga ggaccccggg aagttcgacg tgcccgactc ggtcgtcggc 2820

ttcctcagcg gcgacctcgg cgatccgccc gggggctggc cggagccgtt ccggacccgg 2880

gccctcgagg gccgtacgac gcgaccggcg gtcaccgagc tcaccgaggg cgatcgcgac 2940

gggctcgccg cggaccggag ggcgaccctc aaccggttgc tgttcccggg ccccacccgc 3000

gagttcgagg agtcccgaca gcggtacggc gacctgtcgg tgctgccgac gcggcagtac 3060

ctctacggtt tgcagcaggg tgaggagcac caggtggagc tcgccgaggg caagacgctc 3120

atcctggggc tcgaggcggt cgggggcgcg gacgagcgcg gcttccgcac cgtcatgtgc 3180

acgatcaacg gccacctgcg accggtcccc gtccgcgacc gttcggtggc ggcggacacc 3240

ccgacggcgg agaaggcgga ccccaccaag ccagggcagg tcgcggcgcc gttcggcggg 3300

gtcgtcaccc cgacggtggc ggagggcgac ccggtggagg ccggggccac cgtggcgacc 3360

atcgaggcga tgaagatgga ggcgtcgatc acggcgccgg tcgggggcac ggtgcagcgg 3420

gtcgcgctcg gcggtccgca gcaggtggag ggcggcgacc tggtgctcgt gatcggacga 3480

ggctga 3486

<210> 249

<211> 1161

<212> PRT

<213> Unknown

<220>

<223> pyc_4 sequence from an unknown bacterial species from an environmental sample

<400> 249

Leu Gly Thr Leu Leu Asp Gly Gln Pro Asp Arg Arg Glu Gly Pro Pro

1 5 10 15

Ser His Glu Lys Val Ser Pro Gly Val Thr Thr Arg Gly Gly His Met

20 25 30

Phe Gly Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg Ala

35 40 45

Phe Arg Ala Ala Tyr Glu Met Gly Ala Gln Thr Val Ala Val Phe Pro

50 55 60

Tyr Glu Asp Arg Asn Ser Glu His Arg Leu Lys Ala Asp Glu Ala Tyr

65 70 75 80

Gln Ile Gly Glu Leu Gly Arg Pro Val Arg Ala Tyr Leu Asp Val Asp

85 90 95

Ala Ile Val Arg Thr Ala Val Arg Ala Gly Ala Asp Ala Val Tyr Pro

100 105 110

Gly Tyr Gly Phe Leu Ser Glu Asn Pro Gln Leu Ala Glu Ala Cys Ala

115 120 125

Ala Ala Gly Ile Ala Phe Ile Gly Pro Ser Ala Glu Val Leu Glu Leu

130 135 140

Thr Gly Asn Lys Ala Arg Ala Ile Ala Ala Ala Arg Lys Ala Gly Val

145 150 155 160

Pro Thr Leu Ser Ser Val Ala Pro Gly Thr Asp Pro Ala Ala Leu Val

165 170 175

Glu Ala Ala Arg Glu Leu Ala Phe Pro Leu Phe Val Lys Ala Val Ala

180 185 190

Gly Gly Gly Gly Arg Gly Met Arg Arg Val Asp Asp Pro Ala Val Leu

195 200 205

Glu Glu Ala Val Arg Thr Cys Met Arg Glu Ala Asp Ser Ala Phe Gly

210 215 220

Asp Pro Thr Val Phe Ile Glu Gln Ala Val Val Asp Pro Arg His Ile

225 230 235 240

Glu Val Gln Ile Leu Ala Asp Gly Gln Gly Glu Val Leu His Leu Phe

245 250 255

Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu Ile

260 265 270

Ala Pro Ala Pro Asn Leu Asp Pro Gly Leu Arg Asp Arg Met Cys Ala

275 280 285

Asp Ala Val Arg Phe Ala Arg Glu Ile Gly Tyr Thr Asn Ala Gly Thr

290 295 300

Val Glu Phe Leu Leu Asp Pro Gln Gly Arg Tyr Val Phe Ile Glu Met

305 310 315 320

Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr Asp

325 330 335

Val Asp Leu Val Arg Ser Gln Met Arg Ile Ala Ser Gly Glu Thr Leu

340 345 350

Ala Asp Leu Gly Leu Thr Gln Glu Asp Ile Arg Leu Arg Gly Ala Ala

355 360 365

Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala Asn Gly Phe Arg Pro

370 375 380

Asp Thr Gly Val Ile Thr Thr Tyr Arg Ser Pro Gly Gly Ala Gly Ile

385 390 395 400

Arg Leu Asp Gly Gly Thr Thr Tyr Thr Gly Ala Glu Ile Ser Gly His

405 410 415

Phe Asp Ser Met Leu Ala Lys Leu Thr Cys Arg Gly Arg Asp Phe Thr

420 425 430

Thr Ala Val Glu Arg Ser Arg Arg Ala Val Ala Glu Phe Arg Ile Arg

435 440 445

Gly Val Ala Thr Asn Ile Pro Phe Leu Gln Ala Val Leu Asp Asp Pro

450 455 460

Asp Phe Ala Arg Gly Gly Val Thr Thr Gly Phe Ile Glu Glu Arg Pro

465 470 475 480

His Leu Leu Thr Ala Arg Ser Ser Ala Asp Arg Gly Thr Lys Leu Leu

485 490 495

Asn Tyr Leu Ala Asp Val Thr Val Asn Gln Pro Tyr Gly Ala Leu Gln

500 505 510

Val Gly Val Asp Pro Arg Ala Lys Leu Pro Pro Val Asp Leu Ala Ala

515 520 525

Thr Pro Pro Ser Gly Thr Arg Gln Leu Leu Cys Asp Val Gly Pro Glu

530 535 540

Glu Phe Ala Arg Arg Leu Arg Arg Gln Thr Arg Val Ala Val Thr Asp

545 550 555 560

Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Ile Arg

565 570 575

Thr Arg Asp Leu Leu Gly Val Ala Gly His Val Ala Arg Thr Thr Pro

580 585 590

Glu Leu Trp Ser Ile Glu Ala Trp Gly Gly Ala Thr Tyr Asp Val Ala

595 600 605

Leu Arg Phe Leu Ser Glu Asp Pro Trp Asp Arg Leu Ala Arg Leu Arg

610 615 620

Arg Ala Val Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Arg Asn

625 630 635 640

Thr Val Gly Tyr Thr Pro Tyr Pro Thr Glu Val Thr Asp Ala Phe Val

645 650 655

Glu Glu Ala Ala Ala Thr Gly Ile Asp Val Phe Arg Val Phe Asp Ala

660 665 670

Leu Asn Asp Val Glu Gln Met Arg Pro Ala Val Glu Ala Val Arg Arg

675 680 685

Thr Gly Thr Ala Val Ala Glu Val Ala Leu Cys Tyr Thr Gly Asp Leu

690 695 700

Ser Asp Pro Ala Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg Leu

705 710 715 720

Ala Glu Arg Ile Val Glu Ala Gly Ala His Val Leu Ala Ile Lys Asp

725 730 735

Met Ala Gly Leu Leu Arg Ala Pro Ala Ala His Arg Leu Val Thr Ala

740 745 750

Leu Arg Glu Arg Phe Asp Leu Pro Val His Leu His Thr His Asp Thr

755 760 765

Pro Gly Gly Gln Leu Ala Thr Leu Leu Ala Ala Ile Asp Ala Gly Val

770 775 780

Asp Ala Val Asp Ala Ala Ser Ala Ala Met Ala Gly Thr Thr Ser Gln

785 790 795 800

Pro Ala Leu Ser Ala Leu Val Ala Ala Thr Asp His Pro Val Thr Glu

805 810 815

Gly Arg Asp Thr Gly Leu Asp Leu Arg Ala Val Cys Asp Leu Glu Pro

820 825 830

Tyr Trp Glu Ala Thr Arg Arg Val Tyr Ala Pro Phe Glu Ser Gly Leu

835 840 845

Pro Ser Pro Thr Gly Arg Val Tyr Thr His Glu Ile Pro Gly Gly Gln

850 855 860

Leu Ser Asn Leu Arg Gln Gln Ala Ile Ala Leu Gly Leu Gly Glu Lys

865 870 875 880

Phe Glu Gln Ile Glu Asp Met Tyr Ala Ala Ala Asn Arg Ile Leu Gly

885 890 895

Asn Ile Val Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala

900 905 910

Leu His Leu Val Ala Val Asp Ala Asp Pro Thr Ala Phe Ala Glu Asp

915 920 925

Pro Gly Lys Phe Asp Val Pro Asp Ser Val Val Gly Phe Leu Ser Gly

930 935 940

Asp Leu Gly Asp Pro Pro Gly Gly Trp Pro Glu Pro Phe Arg Thr Arg

945 950 955 960

Ala Leu Glu Gly Arg Thr Thr Arg Pro Ala Val Thr Glu Leu Thr Glu

965 970 975

Gly Asp Arg Asp Gly Leu Ala Ala Asp Arg Arg Ala Thr Leu Asn Arg

980 985 990

Leu Leu Phe Pro Gly Pro Thr Arg Glu Phe Glu Glu Ser Arg Gln Arg

995 1000 1005

Tyr Gly Asp Leu Ser Val Leu Pro Thr Arg Gln Tyr Leu Tyr Gly

1010 1015 1020

Leu Gln Gln Gly Glu Glu His Gln Val Glu Leu Ala Glu Gly Lys

1025 1030 1035

Thr Leu Ile Leu Gly Leu Glu Ala Val Gly Gly Ala Asp Glu Arg

1040 1045 1050

Gly Phe Arg Thr Val Met Cys Thr Ile Asn Gly His Leu Arg Pro

1055 1060 1065

Val Pro Val Arg Asp Arg Ser Val Ala Ala Asp Thr Pro Thr Ala

1070 1075 1080

Glu Lys Ala Asp Pro Thr Lys Pro Gly Gln Val Ala Ala Pro Phe

1085 1090 1095

Gly Gly Val Val Thr Pro Thr Val Ala Glu Gly Asp Pro Val Glu

1100 1105 1110

Ala Gly Ala Thr Val Ala Thr Ile Glu Ala Met Lys Met Glu Ala

1115 1120 1125

Ser Ile Thr Ala Pro Val Gly Gly Thr Val Gln Arg Val Ala Leu

1130 1135 1140

Gly Gly Pro Gln Gln Val Glu Gly Gly Asp Leu Val Leu Val Ile

1145 1150 1155

Gly Arg Gly

1160

<210> 250

<211> 3462

<212> DNA

<213> Unknown

<220>

<223> pyc_5 sequence from an unknown bacterial species from an environmental sample

<400> 250

gtgttcgaga aggtgttggt cgccaaccgg ggtgagatcg cggtgcgggt gttccgggcc 60

gcgtacgagc tgggcgcgcg cacggtcgcg gtgttcccgc acgaggaccg ggactcggtg 120

caccggctga aggcggacga ggcatatctg atcggccaac cgggccatcc ggtgcgcgcg 180

tacctcgacg ttgacgagat cgtccgggtg gcctcggcgt gcggtgcgga cgcggtccac 240

ccgggctacg gtttcctgtc cgagaatccg gagctggcgc gggcgtgtgc ggcggcgggg 300

atcgcgttcg tcggcccgcc gcccgaggtg ctggagctga ccggcaacaa ggtgcgggcg 360

gtggcggcgg cccgggcggc cggggtgccg gtgttgcgct cgacgccgcc ctcgtcggag 420

gtggatgagc tggtcgtcgc ggcggcggag gtcggctttc cgatcttcgt caaggcggtg 480

gccgggggtg gcggccgcgg gatgcgccgg gtggacgcgc cgggggagct gcccgatgcg 540

gtggccgccg cggtccggga ggctgaggcg gcgttcggtg atccgaccgt gttctgcgag 600

caggcggtgc tgcgcccgcg gcacgtcgag gtgcaggtcc tcgccgacgc ggcgggggag 660

atgatccact tgttcgagcg ggactgctcg gtgcagaggc ggcaccagaa ggtgatcgag 720

atcgcgccgg cgccgaacct ggacgagccg atccggcgac ggctgcacgc ggacgcgctc 780

gcgttcgccc gcgcggtcgg ctaccgcaac gccggcaccg tcgagttcct ggtcggcacc 840

gctggggatc gggccggcga gcatgtgttc atcgagatga acccgcggat ccaggtggag 900

cacacggtca ccgaggaggt gaccgatgtg gacctcgtgc aggcgcagct gcggatcgcg 960

gccggcgcga ccctgggcga cctgggattg gcgcaggaga cgatccactg taacgggacc 1020

gccgtgcaga cccggatcac caccgaggat cccgcccacg gctttcgccc ggacaccggc 1080

cggatcaccg cgtaccgctc gccgggtggg gcgggggtcc ggctggacgg cggtacggca 1140

cacgcgaacg ccgagatcag cgcgcacttc gactcgatgc tggtgaagct gacctgccgg 1200

ggccgcgacc tgcgcaccgc ggtgggccgg gtccggcggg cgctggcgga gttccggatc 1260

cgcggggtcg cgaccaacct cccgttcctg caggcggtgc tggacgagga cgacttcctg 1320

gccggccggc tcactacctc ctttatcgac gaacgtcccc acctgctgcg ggcgcgttcc 1380

agcgccgacc gcggcacccg gctgctgcgg tggctggcgg aggtcaccgt caaccgcccg 1440

tacggcgacg ccccggtcac ggtcgacccg gccgacaagt tgccggcggt cggccaagtc 1500

cggcgcctcc actccgaccc cgccgattgc tccggccccg gacaaggccg cgaggacccg 1560

accggagcgc cggccggcag ccggcagcgg ctgcgcgagc tcggcccgga agggttcgcc 1620

cgggcgctac gcgaccaggc cacggtcgcg gtcaccgaca cgaccttccg ggacgcccac 1680

cagtcgttgc tggcgacccg ggtccggacc aaggacctgc tggtggccgc accgtacgtc 1740

gcgcaccggc tggccgggtt gtggagcctg gaggcctggg gcggcgccac ctacgacgtc 1800

gcgctccggt tcctgggcga ggacccgtgg gagcggctgg cggcgctgcg cgaggcggtg 1860

ccgaacatcg cgctgcagat gctgctgcgc ggccgcaaca ccgtcggcta cacgccctac 1920

ccggagcagg tgacccgggc gttcgtggac gaggcggtcg ccaccggcat cgacgtgttc 1980

cggatcttcg atgcattgaa cgacgtggga cagatgaccc cggcgatcga ggcggtacgc 2040

gagtccggcc gggcggtcgc cgaggtggcg ctgtgctaca ccgctgacct gtccgacccg 2100

ggcgagccgc tctacaccct cgactactac ctggcgctgg ccgagcggat cgtggcggcg 2160

ggcgcgcacg tgctggcgat caaggacatg gccgggctgc tgcgcccgcc ggcggcacgg 2220

cggctggtcg ccgcgctgcg cgagcggttc gacctgccgg tccacctgca cacccacgac 2280

accgccggcg gccagctcgc caccctgctg gcggcggtcg acgccggggt ggacgcggtc 2340

gacgtggcct gcgcctcgat ggccgggacc accagccagc cgccgatgtc ggcactgctg 2400

gcggcgctgg cccataccgg gcgcgccccc gggctggacc tcgccgccgc ccaggagtac 2460

gagccgtact gggaggccgt ccgtcgggtg tatgcgccgt tcgagtccgg cctgcccggc 2520

ccgaccgggc gcgtctaccg gcacgagatc cccggcgggc agctcagcaa cctgcggcag 2580

caggcgatcg cgctcgggct gggggagaag ttcgagcaga tcgaggacac ctatgcggcg 2640

accgaccgga tcctcggccg gctggtcaag gtcaccccgt ccagcaaggt cgtcggcgac 2700

ctggcgctgc atctggtcgc gctgggcgcc gacgctgacg agttcgcccg cgacccggag 2760

cggttcgata tccccgactc ggtgatcggc ttcctcgccg gcgagctcgg taccccgccg 2820

ggcggctggc cggagccgct gcgcacccgc gccctggcgg gccgggagcc gacttccggc 2880

cgcgccgagc tcgacgcgac cgacgcgaag gcgctcgccg accccggacc gcagcggcgc 2940

gacaccctga accggctgct ctttccgggc ccgacccgcg agttcaccga ggtccggcag 3000

acctatggcg acctctcggt gctgggcacc gtcgactacc tgtacgggtt gcgccccggc 3060

gtggagtcga tcatcgagct ggaacggggc gtccacctga tcgtccggct ggaggcggtc 3120

ggcgacgccg acgagcgcgg cttccgtacc gtcatgtgca cgctcaacgg ccagctgcgc 3180

ccggtgtggg tgcgcgaccg gtcgatcgct gccgacgtac cggaggccga gaaggccgac 3240

cccgccaacc ctcggcatct ggccgtgccg ttcgccgggg tggtgaccgc ggtggtggcc 3300

gagggcgacg aggtcgaggc gggtcagacc gtcgccacca tcgaggccat gaagctggcg 3360

gcctccatca cggcaccggt cggcggccgg gtcgcccggc tggcgatcac cggcccgcgg 3420

caggccgagg ccggcgacct gatcgccgtc ctggagcagt aa 3462

<210> 251

<211> 1153

<212> PRT

<213> Unknown

<220>

<223> pyc_5 sequence from an unknown bacterial species from an environmental sample

<400> 251

Val Phe Glu Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Val Arg

1 5 10 15

Val Phe Arg Ala Ala Tyr Glu Leu Gly Ala Arg Thr Val Ala Val Phe

20 25 30

Pro His Glu Asp Arg Asp Ser Val His Arg Leu Lys Ala Asp Glu Ala

35 40 45

Tyr Leu Ile Gly Gln Pro Gly His Pro Val Arg Ala Tyr Leu Asp Val

50 55 60

Asp Glu Ile Val Arg Val Ala Ser Ala Cys Gly Ala Asp Ala Val His

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Glu Leu Ala Arg Ala Cys

85 90 95

Ala Ala Ala Gly Ile Ala Phe Val Gly Pro Pro Pro Glu Val Leu Glu

100 105 110

Leu Thr Gly Asn Lys Val Arg Ala Val Ala Ala Ala Arg Ala Ala Gly

115 120 125

Val Pro Val Leu Arg Ser Thr Pro Pro Ser Ser Glu Val Asp Glu Leu

130 135 140

Val Val Ala Ala Ala Glu Val Gly Phe Pro Ile Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Asp Ala Pro Gly Glu

165 170 175

Leu Pro Asp Ala Val Ala Ala Ala Val Arg Glu Ala Glu Ala Ala Phe

180 185 190

Gly Asp Pro Thr Val Phe Cys Glu Gln Ala Val Leu Arg Pro Arg His

195 200 205

Val Glu Val Gln Val Leu Ala Asp Ala Ala Gly Glu Met Ile His Leu

210 215 220

Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Ile Ala Pro Ala Pro Asn Leu Asp Glu Pro Ile Arg Arg Arg Leu His

245 250 255

Ala Asp Ala Leu Ala Phe Ala Arg Ala Val Gly Tyr Arg Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Gly Thr Ala Gly Asp Arg Ala Gly Glu His

275 280 285

Val Phe Ile Glu Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr

290 295 300

Glu Glu Val Thr Asp Val Asp Leu Val Gln Ala Gln Leu Arg Ile Ala

305 310 315 320

Ala Gly Ala Thr Leu Gly Asp Leu Gly Leu Ala Gln Glu Thr Ile His

325 330 335

Cys Asn Gly Thr Ala Val Gln Thr Arg Ile Thr Thr Glu Asp Pro Ala

340 345 350

His Gly Phe Arg Pro Asp Thr Gly Arg Ile Thr Ala Tyr Arg Ser Pro

355 360 365

Gly Gly Ala Gly Val Arg Leu Asp Gly Gly Thr Ala His Ala Asn Ala

370 375 380

Glu Ile Ser Ala His Phe Asp Ser Met Leu Val Lys Leu Thr Cys Arg

385 390 395 400

Gly Arg Asp Leu Arg Thr Ala Val Gly Arg Val Arg Arg Ala Leu Ala

405 410 415

Glu Phe Arg Ile Arg Gly Val Ala Thr Asn Leu Pro Phe Leu Gln Ala

420 425 430

Val Leu Asp Glu Asp Asp Phe Leu Ala Gly Arg Leu Thr Thr Ser Phe

435 440 445

Ile Asp Glu Arg Pro His Leu Leu Arg Ala Arg Ser Ser Ala Asp Arg

450 455 460

Gly Thr Arg Leu Leu Arg Trp Leu Ala Glu Val Thr Val Asn Arg Pro

465 470 475 480

Tyr Gly Asp Ala Pro Val Thr Val Asp Pro Ala Asp Lys Leu Pro Ala

485 490 495

Val Gly Gln Val Arg Arg Leu His Ser Asp Pro Ala Asp Cys Ser Gly

500 505 510

Pro Gly Gln Gly Arg Glu Asp Pro Thr Gly Ala Pro Ala Gly Ser Arg

515 520 525

Gln Arg Leu Arg Glu Leu Gly Pro Glu Gly Phe Ala Arg Ala Leu Arg

530 535 540

Asp Gln Ala Thr Val Ala Val Thr Asp Thr Thr Phe Arg Asp Ala His

545 550 555 560

Gln Ser Leu Leu Ala Thr Arg Val Arg Thr Lys Asp Leu Leu Val Ala

565 570 575

Ala Pro Tyr Val Ala His Arg Leu Ala Gly Leu Trp Ser Leu Glu Ala

580 585 590

Trp Gly Gly Ala Thr Tyr Asp Val Ala Leu Arg Phe Leu Gly Glu Asp

595 600 605

Pro Trp Glu Arg Leu Ala Ala Leu Arg Glu Ala Val Pro Asn Ile Ala

610 615 620

Leu Gln Met Leu Leu Arg Gly Arg Asn Thr Val Gly Tyr Thr Pro Tyr

625 630 635 640

Pro Glu Gln Val Thr Arg Ala Phe Val Asp Glu Ala Val Ala Thr Gly

645 650 655

Ile Asp Val Phe Arg Ile Phe Asp Ala Leu Asn Asp Val Gly Gln Met

660 665 670

Thr Pro Ala Ile Glu Ala Val Arg Glu Ser Gly Arg Ala Val Ala Glu

675 680 685

Val Ala Leu Cys Tyr Thr Ala Asp Leu Ser Asp Pro Gly Glu Pro Leu

690 695 700

Tyr Thr Leu Asp Tyr Tyr Leu Ala Leu Ala Glu Arg Ile Val Ala Ala

705 710 715 720

Gly Ala His Val Leu Ala Ile Lys Asp Met Ala Gly Leu Leu Arg Pro

725 730 735

Pro Ala Ala Arg Arg Leu Val Ala Ala Leu Arg Glu Arg Phe Asp Leu

740 745 750

Pro Val His Leu His Thr His Asp Thr Ala Gly Gly Gln Leu Ala Thr

755 760 765

Leu Leu Ala Ala Val Asp Ala Gly Val Asp Ala Val Asp Val Ala Cys

770 775 780

Ala Ser Met Ala Gly Thr Thr Ser Gln Pro Pro Met Ser Ala Leu Leu

785 790 795 800

Ala Ala Leu Ala His Thr Gly Arg Ala Pro Gly Leu Asp Leu Ala Ala

805 810 815

Ala Gln Glu Tyr Glu Pro Tyr Trp Glu Ala Val Arg Arg Val Tyr Ala

820 825 830

Pro Phe Glu Ser Gly Leu Pro Gly Pro Thr Gly Arg Val Tyr Arg His

835 840 845

Glu Ile Pro Gly Gly Gln Leu Ser Asn Leu Arg Gln Gln Ala Ile Ala

850 855 860

Leu Gly Leu Gly Glu Lys Phe Glu Gln Ile Glu Asp Thr Tyr Ala Ala

865 870 875 880

Thr Asp Arg Ile Leu Gly Arg Leu Val Lys Val Thr Pro Ser Ser Lys

885 890 895

Val Val Gly Asp Leu Ala Leu His Leu Val Ala Leu Gly Ala Asp Ala

900 905 910

Asp Glu Phe Ala Arg Asp Pro Glu Arg Phe Asp Ile Pro Asp Ser Val

915 920 925

Ile Gly Phe Leu Ala Gly Glu Leu Gly Thr Pro Pro Gly Gly Trp Pro

930 935 940

Glu Pro Leu Arg Thr Arg Ala Leu Ala Gly Arg Glu Pro Thr Ser Gly

945 950 955 960

Arg Ala Glu Leu Asp Ala Thr Asp Ala Lys Ala Leu Ala Asp Pro Gly

965 970 975

Pro Gln Arg Arg Asp Thr Leu Asn Arg Leu Leu Phe Pro Gly Pro Thr

980 985 990

Arg Glu Phe Thr Glu Val Arg Gln Thr Tyr Gly Asp Leu Ser Val Leu

995 1000 1005

Gly Thr Val Asp Tyr Leu Tyr Gly Leu Arg Pro Gly Val Glu Ser

1010 1015 1020

Ile Ile Glu Leu Glu Arg Gly Val His Leu Ile Val Arg Leu Glu

1025 1030 1035

Ala Val Gly Asp Ala Asp Glu Arg Gly Phe Arg Thr Val Met Cys

1040 1045 1050

Thr Leu Asn Gly Gln Leu Arg Pro Val Trp Val Arg Asp Arg Ser

1055 1060 1065

Ile Ala Ala Asp Val Pro Glu Ala Glu Lys Ala Asp Pro Ala Asn

1070 1075 1080

Pro Arg His Leu Ala Val Pro Phe Ala Gly Val Val Thr Ala Val

1085 1090 1095

Val Ala Glu Gly Asp Glu Val Glu Ala Gly Gln Thr Val Ala Thr

1100 1105 1110

Ile Glu Ala Met Lys Leu Ala Ala Ser Ile Thr Ala Pro Val Gly

1115 1120 1125

Gly Arg Val Ala Arg Leu Ala Ile Thr Gly Pro Arg Gln Ala Glu

1130 1135 1140

Ala Gly Asp Leu Ile Ala Val Leu Glu Gln

1145 1150

<210> 252

<211> 3450

<212> DNA

<213> Unknown

<220>

<223> pyc_6 sequence from an unknown bacterial species from an environmental sample

<400> 252

ttgatctcca aggtgctggt cgccaaccgc ggagagatcg ccatccgcgc gtttcgcgcc 60

gcctacgaga tggggctcgc caccgtcgcg gtttatccgg tcgaggaccg caactcggtg 120

caccgcctga aggccgatga ggcctatcag atcggccaac ccgggcaccc ggtgcgggcc 180

tatctctcgg ttgacgagat catgcgcgct gcggagatct ccggcgccga tgtgatctat 240

cctggttacg gattcctttc ggagaaccca gaattggccg cggggtgtga aactcgcggg 300

ttgacgtttg tcggtccgcc cgctcggata ctcgagctga cgggaaacaa ggcgagggcg 360

atcgccgcgg cgaaggcagc ggggctgcct gtcttgtccg cgaccgcgcc gtccgacgat 420

gtcgacgcgc tcgtcgaggc cgccggatcg atggccttcc cggtattcgt gaaggccgtt 480

gcgggcggtg gcgggcgcgg tatgcgccgg gtcagcgaat acgcgcagtt gcgcgagtcc 540

atcgaagcag cggctcggga ggccgagtcc gcgttcggtg accccaccgt gttcctcgaa 600

caagcggtga tcaacccgcg gcacatcgag gtccagatac tcgcggacaa ccagggcaac 660

gtcgtccatc tgtacgagcg tgactgttca gtgcaacgcc ggcaccagaa ggtcatcgag 720

ctcgcgcccg cccccaacct ggatccggcg ctgcgcgatc ggatctgcgc cgacgccgtg 780

gccttcgccc gggagatcgg ctactcgtgt gcagggacgg tggagttcct cgtcgacgag 840

aacggtcgac atgtattcat cgagatgaac ccgcgcatcc aggtcgagca cacggtcacc 900

gaagagatca cggacgtcga cctcgtgcag gcccagatgc gcatcgccgc cggggagtcg 960

ctgtcggagc tgggcctcag ccaggacacg gtccggatcc gcggcgcggc tctgcaatgc 1020

cgcattacca ccgaggatcc cgagaacgag ttccggccgg acaccggacg tatcagcgga 1080

taccgcactc ccggcggagc gggcgtgcga ctcgacggcg gaaccatgct cggcgcccag 1140

gtcggagcgc acttcgattc cttgctggtc aagttgacgt gccggggccg tgacttcgat 1200

gccgccgtcg ccagggcgcg gcgcggcgtg gccgaattcc gcatccgcgg tgtggccacc 1260

aatattccgt tcctgcaagc ggttctcgac aatgaggatt tccgagcggg cctggtcacg 1320

acgtcgttca tcgaaaccca tccggggttg ctcaatggct ataacccggc gaaccgcggt 1380

agcaagatct tggcgtacct cgccgacgtg actgtcaaca agcctcacgg cgaggccccg 1440

gatatcagcc acccgggcga caagctgccg cccgtggatc tgtcgaaacc gattcccgat 1500

ggctcacggc agcgacttat ggcgctgggc ccgcaggcct tcgcagatgc gctgcgccgc 1560

cagcctgccc tggcggtcac cgataccacc ttccgcgacg ctcaccaatc cttgctggcg 1620

acgcggcttc gcacccatga ccttgttgcc gtcgccgacc acatcgcgcg cactacaccg 1680

cagctgttct ccgtcgaggc ttggggcggc gcgacttacg atgtggcgct tcggcttctg 1740

cacgaggatc cgtggcaacg gttggctgaa ctacgagcgg cgattcccaa catctgcctg 1800

cagatgctgc tgcgaggacg caacaccgtt ggttatacgc cctaccccga ccaggtgaca 1860

gaggcattcg tcgccgaagc cgcggccacg ggtgtcgaca tcttccggat cttcgacgcg 1920

ctcaacaaca tcgaccagat gcgcccggct atcgacgccg tgcacacgac ggggacggcc 1980

gtcgctgagg tcgcgatgtc gtacaccggc gacctcagtg acccgaacga gcgcctctac 2040

acgctcgact actacctgcg tctggccgag cagttcgtcg aggcgggggc gcacatcctg 2100

gcgatcaagg acatggcggg gctgctgcgg gcaccggcgg ccgcgacctt ggtttccgcg 2160

ctacgcaaca actttgacct tcctgtgcat gtgcacaccc atgacacacc gggtggccaa 2220

ctcgcaacgt accttgccgc ctggcaggcc ggtgccgatg cggtcgatgg cgccgcagcg 2280

ccgctggcag gaaccacgag ccagcccgcg ctgtcgggga tcgttgcggc aacggcgaat 2340

acaggtcgcg acaccgggat cgaactgcag tcgctgtgcg acctggaacc gtactgggag 2400

tcggtgcgac gcatctatgc cccgttcgag gctggcctgc ccgccccgac cggccgggtg 2460

tacacccacg agattcccgg tgggcagctg tcgaacctgc ggacccaggc cgtggcgctg 2520

gggcttgggg agcggttcga ggacatcgag gccgcctatg cgggcgccga tcggctgctg 2580

ggccgtttgg tcaaggtcac gccgtcatcg aaggtggtcg gcgatctcgc gctggccctc 2640

gtcggcgccg gcgtcagtgc cgaacggttc gcggccgagc ctgcccgcta cgacataccg 2700

gactcggtga tcgggttcct gcgcggcgaa ctcggtgtgc ccgtcggtgg atggccagaa 2760

cccttgcgta ccaaggcact tgagggccgc ggggcggcca ggccggaaca ggtgctcact 2820

gccgaggacc gcacggcgct cggcgggtta ccgcaggagc gtcgcgcggc gttgaaccgg 2880

ttgctgtttc ccgggccgac ccgggaattc accgagcacc gggctcgcta tggcgacacc 2940

tcgctgctcg ccagtccgca gttcttctac ggcctgcgcc aggacgagga gacgcaggta 3000

acgctgagcc ccggggtgac gttgaacgtc ggcctggagg cgatcgcgga tgccgacgag 3060

cgcgggtacc gcacggtgat gtgtctgctg aacggccagc tacggccaat ccaagtgcga 3120

gacaactcga ttgcgacggc gcaccccgcc gccgagaagg ccgaccgcga cgatccgcgg 3180

catgtcgcgg cgccgttcgc tggcacggtc actttgtcgg tcggtgctgg cgatcaggta 3240

agcgcgggtg atccgatcgc caccatagaa gcgatgaaga tggaggccgc catcaccgcg 3300

cccgccgtcg gcagggtgtc gcgcgtcgca atcgacccga tcgcgcaggt tgagggtggt 3360

gacttgctgc tggtggtcga cgtcgaggac gctgtgaagg gtgagcagca ctcgaatcgc 3420

ggcgccgagg tcgtcggtgt cggcagctga 3450

<210> 253

<211> 1149

<212> PRT

<213> Unknown

<220>

<223> pyc_6 sequence from an unknown bacterial species from an environmental sample

<400> 253

Leu Ile Ser Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Met Gly Leu Ala Thr Val Ala Val Tyr

20 25 30

Pro Val Glu Asp Arg Asn Ser Val His Arg Leu Lys Ala Asp Glu Ala

35 40 45

Tyr Gln Ile Gly Gln Pro Gly His Pro Val Arg Ala Tyr Leu Ser Val

50 55 60

Asp Glu Ile Met Arg Ala Ala Glu Ile Ser Gly Ala Asp Val Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Glu Leu Ala Ala Gly Cys

85 90 95

Glu Thr Arg Gly Leu Thr Phe Val Gly Pro Pro Ala Arg Ile Leu Glu

100 105 110

Leu Thr Gly Asn Lys Ala Arg Ala Ile Ala Ala Ala Lys Ala Ala Gly

115 120 125

Leu Pro Val Leu Ser Ala Thr Ala Pro Ser Asp Asp Val Asp Ala Leu

130 135 140

Val Glu Ala Ala Gly Ser Met Ala Phe Pro Val Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Ser Glu Tyr Ala Gln

165 170 175

Leu Arg Glu Ser Ile Glu Ala Ala Ala Arg Glu Ala Glu Ser Ala Phe

180 185 190

Gly Asp Pro Thr Val Phe Leu Glu Gln Ala Val Ile Asn Pro Arg His

195 200 205

Ile Glu Val Gln Ile Leu Ala Asp Asn Gln Gly Asn Val Val His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Leu Ala Pro Ala Pro Asn Leu Asp Pro Ala Leu Arg Asp Arg Ile Cys

245 250 255

Ala Asp Ala Val Ala Phe Ala Arg Glu Ile Gly Tyr Ser Cys Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Asp Glu Asn Gly Arg His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Ile Thr

290 295 300

Asp Val Asp Leu Val Gln Ala Gln Met Arg Ile Ala Ala Gly Glu Ser

305 310 315 320

Leu Ser Glu Leu Gly Leu Ser Gln Asp Thr Val Arg Ile Arg Gly Ala

325 330 335

Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Glu Asn Glu Phe Arg

340 345 350

Pro Asp Thr Gly Arg Ile Ser Gly Tyr Arg Thr Pro Gly Gly Ala Gly

355 360 365

Val Arg Leu Asp Gly Gly Thr Met Leu Gly Ala Gln Val Gly Ala His

370 375 380

Phe Asp Ser Leu Leu Val Lys Leu Thr Cys Arg Gly Arg Asp Phe Asp

385 390 395 400

Ala Ala Val Ala Arg Ala Arg Arg Gly Val Ala Glu Phe Arg Ile Arg

405 410 415

Gly Val Ala Thr Asn Ile Pro Phe Leu Gln Ala Val Leu Asp Asn Glu

420 425 430

Asp Phe Arg Ala Gly Leu Val Thr Thr Ser Phe Ile Glu Thr His Pro

435 440 445

Gly Leu Leu Asn Gly Tyr Asn Pro Ala Asn Arg Gly Ser Lys Ile Leu

450 455 460

Ala Tyr Leu Ala Asp Val Thr Val Asn Lys Pro His Gly Glu Ala Pro

465 470 475 480

Asp Ile Ser His Pro Gly Asp Lys Leu Pro Pro Val Asp Leu Ser Lys

485 490 495

Pro Ile Pro Asp Gly Ser Arg Gln Arg Leu Met Ala Leu Gly Pro Gln

500 505 510

Ala Phe Ala Asp Ala Leu Arg Arg Gln Pro Ala Leu Ala Val Thr Asp

515 520 525

Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Leu Arg

530 535 540

Thr His Asp Leu Val Ala Val Ala Asp His Ile Ala Arg Thr Thr Pro

545 550 555 560

Gln Leu Phe Ser Val Glu Ala Trp Gly Gly Ala Thr Tyr Asp Val Ala

565 570 575

Leu Arg Leu Leu His Glu Asp Pro Trp Gln Arg Leu Ala Glu Leu Arg

580 585 590

Ala Ala Ile Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Arg Asn

595 600 605

Thr Val Gly Tyr Thr Pro Tyr Pro Asp Gln Val Thr Glu Ala Phe Val

610 615 620

Ala Glu Ala Ala Ala Thr Gly Val Asp Ile Phe Arg Ile Phe Asp Ala

625 630 635 640

Leu Asn Asn Ile Asp Gln Met Arg Pro Ala Ile Asp Ala Val His Thr

645 650 655

Thr Gly Thr Ala Val Ala Glu Val Ala Met Ser Tyr Thr Gly Asp Leu

660 665 670

Ser Asp Pro Asn Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg Leu

675 680 685

Ala Glu Gln Phe Val Glu Ala Gly Ala His Ile Leu Ala Ile Lys Asp

690 695 700

Met Ala Gly Leu Leu Arg Ala Pro Ala Ala Ala Thr Leu Val Ser Ala

705 710 715 720

Leu Arg Asn Asn Phe Asp Leu Pro Val His Val His Thr His Asp Thr

725 730 735

Pro Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Trp Gln Ala Gly Ala

740 745 750

Asp Ala Val Asp Gly Ala Ala Ala Pro Leu Ala Gly Thr Thr Ser Gln

755 760 765

Pro Ala Leu Ser Gly Ile Val Ala Ala Thr Ala Asn Thr Gly Arg Asp

770 775 780

Thr Gly Ile Glu Leu Gln Ser Leu Cys Asp Leu Glu Pro Tyr Trp Glu

785 790 795 800

Ser Val Arg Arg Ile Tyr Ala Pro Phe Glu Ala Gly Leu Pro Ala Pro

805 810 815

Thr Gly Arg Val Tyr Thr His Glu Ile Pro Gly Gly Gln Leu Ser Asn

820 825 830

Leu Arg Thr Gln Ala Val Ala Leu Gly Leu Gly Glu Arg Phe Glu Asp

835 840 845

Ile Glu Ala Ala Tyr Ala Gly Ala Asp Arg Leu Leu Gly Arg Leu Val

850 855 860

Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu Ala Leu

865 870 875 880

Val Gly Ala Gly Val Ser Ala Glu Arg Phe Ala Ala Glu Pro Ala Arg

885 890 895

Tyr Asp Ile Pro Asp Ser Val Ile Gly Phe Leu Arg Gly Glu Leu Gly

900 905 910

Val Pro Val Gly Gly Trp Pro Glu Pro Leu Arg Thr Lys Ala Leu Glu

915 920 925

Gly Arg Gly Ala Ala Arg Pro Glu Gln Val Leu Thr Ala Glu Asp Arg

930 935 940

Thr Ala Leu Gly Gly Leu Pro Gln Glu Arg Arg Ala Ala Leu Asn Arg

945 950 955 960

Leu Leu Phe Pro Gly Pro Thr Arg Glu Phe Thr Glu His Arg Ala Arg

965 970 975

Tyr Gly Asp Thr Ser Leu Leu Ala Ser Pro Gln Phe Phe Tyr Gly Leu

980 985 990

Arg Gln Asp Glu Glu Thr Gln Val Thr Leu Ser Pro Gly Val Thr Leu

995 1000 1005

Asn Val Gly Leu Glu Ala Ile Ala Asp Ala Asp Glu Arg Gly Tyr

1010 1015 1020

Arg Thr Val Met Cys Leu Leu Asn Gly Gln Leu Arg Pro Ile Gln

1025 1030 1035

Val Arg Asp Asn Ser Ile Ala Thr Ala His Pro Ala Ala Glu Lys

1040 1045 1050

Ala Asp Arg Asp Asp Pro Arg His Val Ala Ala Pro Phe Ala Gly

1055 1060 1065

Thr Val Thr Leu Ser Val Gly Ala Gly Asp Gln Val Ser Ala Gly

1070 1075 1080

Asp Pro Ile Ala Thr Ile Glu Ala Met Lys Met Glu Ala Ala Ile

1085 1090 1095

Thr Ala Pro Ala Val Gly Arg Val Ser Arg Val Ala Ile Asp Pro

1100 1105 1110

Ile Ala Gln Val Glu Gly Gly Asp Leu Leu Leu Val Val Asp Val

1115 1120 1125

Glu Asp Ala Val Lys Gly Glu Gln His Ser Asn Arg Gly Ala Glu

1130 1135 1140

Val Val Gly Val Gly Ser

1145

<210> 254

<211> 3414

<212> DNA

<213> Unknown

<220>

<223> pyc_7 sequence from an unknown bacterial species from an environmental sample

<400> 254

atgcgaaagt tgttggtcgc gaaccggggg gagatcgcga ttcgtgcgtt ccgcgctgcg 60

ttcgagctcg acctcgcgac ggtcgcggtg ttcgcgtggg aggaccgcgg gtcgctgcac 120

cgcctgaaag ccgatgaggc gtacttgatc ggtgagcgcg gccacccggt gcgggcgtac 180

ctcgacgtgg accagatcgt gacaaccgcg ttgtcgtgcg gagccgatgc gatctacccg 240

ggctacggat tcctatcgga aaaccccggg ttggccgagg catgcgagca tgccggtatc 300

gcattcgtcg gtccgaccgc agcggtacta gcgatggccg gcaacaaggt gcgcgcgatt 360

gaagtggcgc ggcgcgctgg cgttccgacg cttcgtagtg tgcatgcgca ggacaacgaa 420

ggtttggtcg ccggcgcaga gcagatcgat cttcctgtgt tcgtcaaagc gcaagctggc 480

ggcggcggcc gtggcatgcg tcgcgtcgac agtcgtgcgg atctgctgcc ctacatcgag 540

gcggcgcgcc gtgaagcgct gtcagcgttc ggtgacgcgt ccgtatacat cgaggaggcg 600

gtcgttcggc cgcgccacat tgagatccag attctcggcg acgccacggg cgcggtggtc 660

catctcttcg aacgcgactg ctcggtacag cgccggcacc agaaggtggt cgagatcgca 720

cccgcacctg gccttgaccc cgtgttgcgc gaccgtttgt gcgcagacgc ggttcgcttc 780

gcccagtcga tcggctacac caacgcaggc acggtcgagt ttctggttgc cgacgatggg 840

cgctacgcgt tcatcgagat caatccgcgg atccaggtcg agcacaccgt caccgaggag 900

gtgaccgacg tcgatctcgt ccacgcgcag atccggatcg cctcgggcgc aacactggcc 960

gagttgggtt tgttgcagga cgacatcgtt cagcgcggct gcgcgctcca gtgtcgcatc 1020

acgaccgagg atccacacaa cgacttccgt cctgacgcgg ggcgtatctc cgcgtaccgc 1080

gctcccggcg gcgcgggcgt acgcctcgac gcagcaagcg gatacgtcgg tgccgagatc 1140

tccgcgtact tcgactcgct cctcgtcaag ttgacatgtc gagggaatga tcgtcacagc 1200

gccgcagcgc gagcgcgacg cgccctcgcg gagtttcgca tccggggcgt agccacaaac 1260

ctcgcgttcc tacagagcct cctcgcggat cctgacttcg tggctggtcg actccatacg 1320

tcgtttatcg aggaccgccc gtatctgctc aaggcacgag cggcagcgga taggggcacg 1380

aaactgttaa cctatcttgc cgacgtgacg gtcaacaagc cgttcggtga cgcccccagc 1440

cgcatagacc cgcgctcgaa gctgcctcta gcgaccgact acgacgaagc agcgacgcca 1500

ggtagtcgtc aactccttgc agaactcggt ccacaagcat tcgcagctcg tctgcgcacc 1560

caggcagcaa ccgcagtgac cgacaccacc tttcgcgatg cgcaccagtc attgctcgcc 1620

acccgacttc gcacgtatga catgttggcg gccgccccga ccgtggcgag aacgcttcct 1680

cagctgctga gtctcgaagc ttggggcggc gcgacctatg acgtcgcgct gcggtttcta 1740

aaggaagacc cgtgggagcg actcgacgcc ctacgagaaa cggtgccgaa catctgcctg 1800

caaatgctgc tgcgcggcgc caacactgtc ggctacggac cgtcgccgcc ctcgacgacg 1860

aaacgattcg ttcaggaggc ggcgcgcagc ggaatcgata tcttccgcat cttcgacgct 1920

ctcaacgacg taaaccagat gcgagccgca atcgacgctg tcctcgaaac cgacgcgctc 1980

gtcgaggcat gcctgtgcta cacgggcgac ctcggcgagc ctaccgagaa cctctacacc 2040

ctcgactact acctcgcagt cgccgagcgg ctcgtcgcca ccggcgcgca cgtgatcgcc 2100

atcaaggaca tggcgggtct gttgcgccct ccggctgcac gtactctcgt atcggcacta 2160

cgcgcactgt tcgacgcacc aatccacgta cacacccatg acaccgcggg cgggcaactc 2220

gctacgtacc tcgcagctgt tgacgctgga gccgacgcca tcgacggcgc cattgctccc 2280

ttcgcaggca cgacgagcca gccatccctc gcggcgatcg tcgctgccac cgaccacacg 2340

ccacgtgcga ccggtctctc cctcgatgcg ctcatcgatc tcgagccata ttgggacgcc 2400

gttcgcgacc actaccgacc attcgacgaa gcccttcgcg ccccgaccgg cgcggtttac 2460

cgtcacgaga ttccaggcgg ccaactcacg aacctgcgcc agcaagccat cgcgctcgga 2520

ttcgggcacc gcttcgaaga cgtccagcgt tggtatacga cggtcaacca cctcctcggc 2580

aacatcatca aggtcacacc cacgagcaag gtcgttggcg atctcgccat cgccttgtgc 2640

gggaccggcc tcacccccga ggaattcgag accgatccga gccgcgtcga catacccgac 2700

agcgtggtcg cgtttctcca aggcgcgctc ggcgagccgc ccggaggatg gcgcgagccg 2760

ttcagaacgc gcgcactcgc tggtcgcacg cgcatcacag actcaactcc cgaggatgcg 2820

gaaacgctcc caccgccggg ccccgactgc cgcgtggcaa tcagccagga gttgttcccg 2880

gacccggcca ccgacttcga acaaacccga accctctacg gcgacctctc cgtgctccca 2940

agcagtctgt tcttctacgg tctgcgacca ggcgaagagt ttgacgtccg cctcggcccc 3000

ggcgtcgact tgatcatcgg cctcgaagca atcgccgaac ccgacgcgcg cggtatgcgc 3060

accgtgctct gccgcatcaa cggacaggtc cgacccatcg tggtccgcga ccaccaggcc 3120

aactccaccg ccgcaactgc tgaacgcgca aatccgcaaa ctccaggcca cgtcgcggcg 3180

ccctacgatg gcgtcgtcac cgtccgcgtc accgcgggtc aagtagtgac cgctggtgac 3240

cctgttgcca gcatcgaggc tatgaagatg gaaagcacca tcaccgcgcc catctcagga 3300

accgtcgaac gcatcgcaat cagcccaatc ggacacgtac aagcaggcga cctcatcctc 3360

acgatcaaac ccgcatcgct cgcctcccca catgttgaac cgaccgcgct ctaa 3414

<210> 255

<211> 1137

<212> PRT

<213> Unknown

<220>

<223> pyc_7 sequence from an unknown bacterial species from an environmental sample

<400> 255

Met Arg Lys Leu Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg Ala

1 5 10 15

Phe Arg Ala Ala Phe Glu Leu Asp Leu Ala Thr Val Ala Val Phe Ala

20 25 30

Trp Glu Asp Arg Gly Ser Leu His Arg Leu Lys Ala Asp Glu Ala Tyr

35 40 45

Leu Ile Gly Glu Arg Gly His Pro Val Arg Ala Tyr Leu Asp Val Asp

50 55 60

Gln Ile Val Thr Thr Ala Leu Ser Cys Gly Ala Asp Ala Ile Tyr Pro

65 70 75 80

Gly Tyr Gly Phe Leu Ser Glu Asn Pro Gly Leu Ala Glu Ala Cys Glu

85 90 95

His Ala Gly Ile Ala Phe Val Gly Pro Thr Ala Ala Val Leu Ala Met

100 105 110

Ala Gly Asn Lys Val Arg Ala Ile Glu Val Ala Arg Arg Ala Gly Val

115 120 125

Pro Thr Leu Arg Ser Val His Ala Gln Asp Asn Glu Gly Leu Val Ala

130 135 140

Gly Ala Glu Gln Ile Asp Leu Pro Val Phe Val Lys Ala Gln Ala Gly

145 150 155 160

Gly Gly Gly Arg Gly Met Arg Arg Val Asp Ser Arg Ala Asp Leu Leu

165 170 175

Pro Tyr Ile Glu Ala Ala Arg Arg Glu Ala Leu Ser Ala Phe Gly Asp

180 185 190

Ala Ser Val Tyr Ile Glu Glu Ala Val Val Arg Pro Arg His Ile Glu

195 200 205

Ile Gln Ile Leu Gly Asp Ala Thr Gly Ala Val Val His Leu Phe Glu

210 215 220

Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu Ile Ala

225 230 235 240

Pro Ala Pro Gly Leu Asp Pro Val Leu Arg Asp Arg Leu Cys Ala Asp

245 250 255

Ala Val Arg Phe Ala Gln Ser Ile Gly Tyr Thr Asn Ala Gly Thr Val

260 265 270

Glu Phe Leu Val Ala Asp Asp Gly Arg Tyr Ala Phe Ile Glu Ile Asn

275 280 285

Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr Asp Val

290 295 300

Asp Leu Val His Ala Gln Ile Arg Ile Ala Ser Gly Ala Thr Leu Ala

305 310 315 320

Glu Leu Gly Leu Leu Gln Asp Asp Ile Val Gln Arg Gly Cys Ala Leu

325 330 335

Gln Cys Arg Ile Thr Thr Glu Asp Pro His Asn Asp Phe Arg Pro Asp

340 345 350

Ala Gly Arg Ile Ser Ala Tyr Arg Ala Pro Gly Gly Ala Gly Val Arg

355 360 365

Leu Asp Ala Ala Ser Gly Tyr Val Gly Ala Glu Ile Ser Ala Tyr Phe

370 375 380

Asp Ser Leu Leu Val Lys Leu Thr Cys Arg Gly Asn Asp Arg His Ser

385 390 395 400

Ala Ala Ala Arg Ala Arg Arg Ala Leu Ala Glu Phe Arg Ile Arg Gly

405 410 415

Val Ala Thr Asn Leu Ala Phe Leu Gln Ser Leu Leu Ala Asp Pro Asp

420 425 430

Phe Val Ala Gly Arg Leu His Thr Ser Phe Ile Glu Asp Arg Pro Tyr

435 440 445

Leu Leu Lys Ala Arg Ala Ala Ala Asp Arg Gly Thr Lys Leu Leu Thr

450 455 460

Tyr Leu Ala Asp Val Thr Val Asn Lys Pro Phe Gly Asp Ala Pro Ser

465 470 475 480

Arg Ile Asp Pro Arg Ser Lys Leu Pro Leu Ala Thr Asp Tyr Asp Glu

485 490 495

Ala Ala Thr Pro Gly Ser Arg Gln Leu Leu Ala Glu Leu Gly Pro Gln

500 505 510

Ala Phe Ala Ala Arg Leu Arg Thr Gln Ala Ala Thr Ala Val Thr Asp

515 520 525

Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Leu Arg

530 535 540

Thr Tyr Asp Met Leu Ala Ala Ala Pro Thr Val Ala Arg Thr Leu Pro

545 550 555 560

Gln Leu Leu Ser Leu Glu Ala Trp Gly Gly Ala Thr Tyr Asp Val Ala

565 570 575

Leu Arg Phe Leu Lys Glu Asp Pro Trp Glu Arg Leu Asp Ala Leu Arg

580 585 590

Glu Thr Val Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Ala Asn

595 600 605

Thr Val Gly Tyr Gly Pro Ser Pro Pro Ser Thr Thr Lys Arg Phe Val

610 615 620

Gln Glu Ala Ala Arg Ser Gly Ile Asp Ile Phe Arg Ile Phe Asp Ala

625 630 635 640

Leu Asn Asp Val Asn Gln Met Arg Ala Ala Ile Asp Ala Val Leu Glu

645 650 655

Thr Asp Ala Leu Val Glu Ala Cys Leu Cys Tyr Thr Gly Asp Leu Gly

660 665 670

Glu Pro Thr Glu Asn Leu Tyr Thr Leu Asp Tyr Tyr Leu Ala Val Ala

675 680 685

Glu Arg Leu Val Ala Thr Gly Ala His Val Ile Ala Ile Lys Asp Met

690 695 700

Ala Gly Leu Leu Arg Pro Pro Ala Ala Arg Thr Leu Val Ser Ala Leu

705 710 715 720

Arg Ala Leu Phe Asp Ala Pro Ile His Val His Thr His Asp Thr Ala

725 730 735

Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Val Asp Ala Gly Ala Asp

740 745 750

Ala Ile Asp Gly Ala Ile Ala Pro Phe Ala Gly Thr Thr Ser Gln Pro

755 760 765

Ser Leu Ala Ala Ile Val Ala Ala Thr Asp His Thr Pro Arg Ala Thr

770 775 780

Gly Leu Ser Leu Asp Ala Leu Ile Asp Leu Glu Pro Tyr Trp Asp Ala

785 790 795 800

Val Arg Asp His Tyr Arg Pro Phe Asp Glu Ala Leu Arg Ala Pro Thr

805 810 815

Gly Ala Val Tyr Arg His Glu Ile Pro Gly Gly Gln Leu Thr Asn Leu

820 825 830

Arg Gln Gln Ala Ile Ala Leu Gly Phe Gly His Arg Phe Glu Asp Val

835 840 845

Gln Arg Trp Tyr Thr Thr Val Asn His Leu Leu Gly Asn Ile Ile Lys

850 855 860

Val Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Ile Ala Leu Cys

865 870 875 880

Gly Thr Gly Leu Thr Pro Glu Glu Phe Glu Thr Asp Pro Ser Arg Val

885 890 895

Asp Ile Pro Asp Ser Val Val Ala Phe Leu Gln Gly Ala Leu Gly Glu

900 905 910

Pro Pro Gly Gly Trp Arg Glu Pro Phe Arg Thr Arg Ala Leu Ala Gly

915 920 925

Arg Thr Arg Ile Thr Asp Ser Thr Pro Glu Asp Ala Glu Thr Leu Pro

930 935 940

Pro Pro Gly Pro Asp Cys Arg Val Ala Ile Ser Gln Glu Leu Phe Pro

945 950 955 960

Asp Pro Ala Thr Asp Phe Glu Gln Thr Arg Thr Leu Tyr Gly Asp Leu

965 970 975

Ser Val Leu Pro Ser Ser Leu Phe Phe Tyr Gly Leu Arg Pro Gly Glu

980 985 990

Glu Phe Asp Val Arg Leu Gly Pro Gly Val Asp Leu Ile Ile Gly Leu

995 1000 1005

Glu Ala Ile Ala Glu Pro Asp Ala Arg Gly Met Arg Thr Val Leu

1010 1015 1020

Cys Arg Ile Asn Gly Gln Val Arg Pro Ile Val Val Arg Asp His

1025 1030 1035

Gln Ala Asn Ser Thr Ala Ala Thr Ala Glu Arg Ala Asn Pro Gln

1040 1045 1050

Thr Pro Gly His Val Ala Ala Pro Tyr Asp Gly Val Val Thr Val

1055 1060 1065

Arg Val Thr Ala Gly Gln Val Val Thr Ala Gly Asp Pro Val Ala

1070 1075 1080

Ser Ile Glu Ala Met Lys Met Glu Ser Thr Ile Thr Ala Pro Ile

1085 1090 1095

Ser Gly Thr Val Glu Arg Ile Ala Ile Ser Pro Ile Gly His Val

1100 1105 1110

Gln Ala Gly Asp Leu Ile Leu Thr Ile Lys Pro Ala Ser Leu Ala

1115 1120 1125

Ser Pro His Val Glu Pro Thr Ala Leu

1130 1135

<210> 256

<211> 3387

<212> DNA

<213> Unknown

<220>

<223> pyc_8 sequence from an unknown bacterial species from an environmental sample

<400> 256

gtgcgtaccg tgctggtcgc caaccgcggc gagatcgcca tccgggcgtt ccgggccgcg 60

gtcgagctgg gtctgcagac ggtcgcgatc tacacccacc tcgaccgcgg gtcggtgcac 120

cgcatcaagg ccgaccgggc ctacgaggtc ggcccgcccg agcgcccgct ggcgggctac 180

ctcgatgtcg gcgctattgt cgaggcggcg gtcgccagcg gcgccgacgc cgtctacccg 240

gggtacgggt tcctctccga gagtgcggcc ttcgcggccg catgccgcga cgccggcctg 300

acatggatcg gcccacctcc cgaggtgctg gcgctcaccg gcgacaaggt gcgcgcgcgg 360

gaggcggccg tcgccgccgg gctgccggtg cttgccgcgt cgccaccggt cgacgagcac 420

gacgcccccg agcaggccga aggcctgggc tacccggtgt tcgtcaaggc cgccgctggg 480

ggtggcggac gcgggctgcg cgtcgtgcgg cgcgcgcagg acctcgtcgc cgcagtctcg 540

acggcccgcc gcgaggccga ggcggcgttc ggcgacccga ccgtgttcct ggagcgcgcc 600

ctcgagcggc cgcgccacgt cgaggtgcag attctcggcg acgccaccgg cggtctcgtc 660

cacctgggcg agcgcgactg ctcgatccag cgccgccacc agaaagtcgt cgagcttgcg 720

ccggcgccga acctcgaccc tggcgtccgc gaccagctgc atgccgatgc cgtcgcgctc 780

gggcaggccg ttggttacgt caacgccggc accgtcgagt tcctcgtcgc cgaggacggc 840

agtcacatct tcctggaggt caacccgcgg atccaggtcg agcacacggt gaccgaggag 900

gtcaccggcg tcgacctcgt cgcggcgcag ctgcgcatcg ccgacggcgc gtcgctcgcg 960

gacctgggta tcgcgcagga gtcggtgcgc tggcagggcg tcgcgatcca atgccggatc 1020

acgaccgagg acccggcgac ggggttccgg cccgacaccg gcacggtgat ggcctaccgc 1080

tcgcctggcg gcgctggcgt gcggctggac ggcggtgcga tcgatctcgg cagcgagatc 1140

acgccgtggt tcgactcgct gctggtgaag ctgacctgcc gcgggcccga cctcgacacg 1200

gccgcgcgtc gcgcgcgccg ggcgctcgcg gagttccgcg tgcgcggcct cgccaccaac 1260

atcgcgttcc tgcaggcgct gctgtcggag ccggacctgc tcgaggggcg cctgtcgacc 1320

gcgttcctcg acgagcaccc tcacctgctg cacgccccgg gcgggcacga ccgggtgtcg 1380

aagctgctct cctacgtcgc ggacgtgacc gtgaaccggc cccatgggcg ggcgcctgcg 1440

tccgtcgacc cggtgtcgct cctgcccgca ccgccggagt ggccgccgcc ggagggctcg 1500

aagcaactgc tggaccgtct gggcgcggag gggttcgcaa ggtggaccgc ggagcagccg 1560

tcgaccggcg tcaccgatac caccatgcgc gacgcgcacc agtcgctgat cgcgacccgg 1620

atgcgcaccg ccgacatggt cgcggcggca cgccacgtcg ccgcgatgct gccgcagctg 1680

tggagcatgg aggtgtgggg cggtgcgatc cacgacgtat cgcttcgctt cctgctggag 1740

gatccgtggc agcggctggc agcgctgcgt gaggcgatcc ccaacatctg cctacagatg 1800

ctgctgcgcg ggcggaacct ggtcggctac ggcagtgtcg acgacgcggt cgtgcgcgcg 1860

ttcgtggacg aggcggcgaa gaccggcatc gacgtgttcc gcatcttcga cgccttcaac 1920

gacgtcgagc ggatgcgccc cgccatcgat gcggtccgta cgacgcacgc ggtcgccgag 1980

gcggtcgtct gctacacggc tcacgccgtc gacccgcgcg agcggctgta cacggtgtcc 2040

tactacgccg acatcgcggc gcggctcgcg gccgcgggcg cacacaccct cgcgatcaag 2100

gacatggcgg ggctgctgcg cgcgggcgcg gccaccgcgc tggtacgagc ggtgcgcgac 2160

gccaccgggc tgccggtgca catccacacg cacgacacgg cgggcgggca gctcgccacc 2220

tatctcgccg cggtcggggc gggcgcatcg gtggtggacg ccgcggccgc gccatggtcg 2280

ggcggcacca gccagccgtc gctgagcgcg ctgatcgcgg cgctggatgc caccgactcc 2340

ccgacggcgc tgtcgctgga cgcggcgttg gacctggagc cgtactggga ggcggtccgg 2400

cggctctacg cgccgttcga ccaggggatc cccgcgccga gcggcgcggt ctaccgccat 2460

gagatcccgg gcggccagct gtcgaacctg cgccagcagg ccgcggcgct gggcctggcg 2520

gagcggttcg acgagatcgg tcgggtgtac cagcgggtcg atcggatgct cgggcggctc 2580

gtcaaggtga cgccgtcgag caaggtggtc ggcgacctcg cgctgtacct gatctcggcc 2640

ggcatcgacc cggacgcctt ggaggccgac cccggtgcgt acgacgtgcc cgcgtcggtg 2700

atccggttcc tgcaaggtga tctcggcacc ccgccaggtg ggtgggcgga gccgttccgc 2760

agcctggcgc tagcgcggca cggcgcggcc caggcgccgt ccgatgcagg ccccgccgtc 2820

gaccacgccg cgctggaggc cacctccgcc aatcgccgcg acgccctcaa cgccagtcag 2880

ttcccggcgg aggcgcgcga gcgcaaggag gcggtcgagc gctacgccga cgtgtcggtg 2940

ctgccgacgc gcacgttctt ctacggcctc gacccgctcg aggagatcgt cgtggaactg 3000

gaacccggcg tgcgcgtctt cctcgacctg gacgccgtcg gcgaggccga cgacaagggt 3060

cgccgcaccg tggtgatgcg ggtcaacggc caactgcgcg ccgtgaccgc gcacgaccgt 3120

tcggtcgccc ccgccgacgc gcccgccgag cgtgcggacc cgagcagccc gggagatatc 3180

gccgccccgt tgaccggcat cgtcaccgtg ctcgtcgccg acggcgaaca ggtgcaggcg 3240

ggcgcgcgcc tgtgcgcgct ggaggcgatg aagatggagt cgacggtcac tgcgccgttc 3300

gccggccgcg ttgctcgtgt ggtggcgagc aacggcgccc gcgtcgagcc cggcgacctg 3360

ctcgtcgtcc tcgagcccga cgagtga 3387

<210> 257

<211> 1128

<212> PRT

<213> Unknown

<220>

<223> pyc_8 sequence from an unknown bacterial species from an environmental sample

<400> 257

Val Arg Thr Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg Ala

1 5 10 15

Phe Arg Ala Ala Val Glu Leu Gly Leu Gln Thr Val Ala Ile Tyr Thr

20 25 30

His Leu Asp Arg Gly Ser Val His Arg Ile Lys Ala Asp Arg Ala Tyr

35 40 45

Glu Val Gly Pro Pro Glu Arg Pro Leu Ala Gly Tyr Leu Asp Val Gly

50 55 60

Ala Ile Val Glu Ala Ala Val Ala Ser Gly Ala Asp Ala Val Tyr Pro

65 70 75 80

Gly Tyr Gly Phe Leu Ser Glu Ser Ala Ala Phe Ala Ala Ala Cys Arg

85 90 95

Asp Ala Gly Leu Thr Trp Ile Gly Pro Pro Pro Glu Val Leu Ala Leu

100 105 110

Thr Gly Asp Lys Val Arg Ala Arg Glu Ala Ala Val Ala Ala Gly Leu

115 120 125

Pro Val Leu Ala Ala Ser Pro Pro Val Asp Glu His Asp Ala Pro Glu

130 135 140

Gln Ala Glu Gly Leu Gly Tyr Pro Val Phe Val Lys Ala Ala Ala Gly

145 150 155 160

Gly Gly Gly Arg Gly Leu Arg Val Val Arg Arg Ala Gln Asp Leu Val

165 170 175

Ala Ala Val Ser Thr Ala Arg Arg Glu Ala Glu Ala Ala Phe Gly Asp

180 185 190

Pro Thr Val Phe Leu Glu Arg Ala Leu Glu Arg Pro Arg His Val Glu

195 200 205

Val Gln Ile Leu Gly Asp Ala Thr Gly Gly Leu Val His Leu Gly Glu

210 215 220

Arg Asp Cys Ser Ile Gln Arg Arg His Gln Lys Val Val Glu Leu Ala

225 230 235 240

Pro Ala Pro Asn Leu Asp Pro Gly Val Arg Asp Gln Leu His Ala Asp

245 250 255

Ala Val Ala Leu Gly Gln Ala Val Gly Tyr Val Asn Ala Gly Thr Val

260 265 270

Glu Phe Leu Val Ala Glu Asp Gly Ser His Ile Phe Leu Glu Val Asn

275 280 285

Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr Gly Val

290 295 300

Asp Leu Val Ala Ala Gln Leu Arg Ile Ala Asp Gly Ala Ser Leu Ala

305 310 315 320

Asp Leu Gly Ile Ala Gln Glu Ser Val Arg Trp Gln Gly Val Ala Ile

325 330 335

Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala Thr Gly Phe Arg Pro Asp

340 345 350

Thr Gly Thr Val Met Ala Tyr Arg Ser Pro Gly Gly Ala Gly Val Arg

355 360 365

Leu Asp Gly Gly Ala Ile Asp Leu Gly Ser Glu Ile Thr Pro Trp Phe

370 375 380

Asp Ser Leu Leu Val Lys Leu Thr Cys Arg Gly Pro Asp Leu Asp Thr

385 390 395 400

Ala Ala Arg Arg Ala Arg Arg Ala Leu Ala Glu Phe Arg Val Arg Gly

405 410 415

Leu Ala Thr Asn Ile Ala Phe Leu Gln Ala Leu Leu Ser Glu Pro Asp

420 425 430

Leu Leu Glu Gly Arg Leu Ser Thr Ala Phe Leu Asp Glu His Pro His

435 440 445

Leu Leu His Ala Pro Gly Gly His Asp Arg Val Ser Lys Leu Leu Ser

450 455 460

Tyr Val Ala Asp Val Thr Val Asn Arg Pro His Gly Arg Ala Pro Ala

465 470 475 480

Ser Val Asp Pro Val Ser Leu Leu Pro Ala Pro Pro Glu Trp Pro Pro

485 490 495

Pro Glu Gly Ser Lys Gln Leu Leu Asp Arg Leu Gly Ala Glu Gly Phe

500 505 510

Ala Arg Trp Thr Ala Glu Gln Pro Ser Thr Gly Val Thr Asp Thr Thr

515 520 525

Met Arg Asp Ala His Gln Ser Leu Ile Ala Thr Arg Met Arg Thr Ala

530 535 540

Asp Met Val Ala Ala Ala Arg His Val Ala Ala Met Leu Pro Gln Leu

545 550 555 560

Trp Ser Met Glu Val Trp Gly Gly Ala Ile His Asp Val Ser Leu Arg

565 570 575

Phe Leu Leu Glu Asp Pro Trp Gln Arg Leu Ala Ala Leu Arg Glu Ala

580 585 590

Ile Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Arg Asn Leu Val

595 600 605

Gly Tyr Gly Ser Val Asp Asp Ala Val Val Arg Ala Phe Val Asp Glu

610 615 620

Ala Ala Lys Thr Gly Ile Asp Val Phe Arg Ile Phe Asp Ala Phe Asn

625 630 635 640

Asp Val Glu Arg Met Arg Pro Ala Ile Asp Ala Val Arg Thr Thr His

645 650 655

Ala Val Ala Glu Ala Val Val Cys Tyr Thr Ala His Ala Val Asp Pro

660 665 670

Arg Glu Arg Leu Tyr Thr Val Ser Tyr Tyr Ala Asp Ile Ala Ala Arg

675 680 685

Leu Ala Ala Ala Gly Ala His Thr Leu Ala Ile Lys Asp Met Ala Gly

690 695 700

Leu Leu Arg Ala Gly Ala Ala Thr Ala Leu Val Arg Ala Val Arg Asp

705 710 715 720

Ala Thr Gly Leu Pro Val His Ile His Thr His Asp Thr Ala Gly Gly

725 730 735

Gln Leu Ala Thr Tyr Leu Ala Ala Val Gly Ala Gly Ala Ser Val Val

740 745 750

Asp Ala Ala Ala Ala Pro Trp Ser Gly Gly Thr Ser Gln Pro Ser Leu

755 760 765

Ser Ala Leu Ile Ala Ala Leu Asp Ala Thr Asp Ser Pro Thr Ala Leu

770 775 780

Ser Leu Asp Ala Ala Leu Asp Leu Glu Pro Tyr Trp Glu Ala Val Arg

785 790 795 800

Arg Leu Tyr Ala Pro Phe Asp Gln Gly Ile Pro Ala Pro Ser Gly Ala

805 810 815

Val Tyr Arg His Glu Ile Pro Gly Gly Gln Leu Ser Asn Leu Arg Gln

820 825 830

Gln Ala Ala Ala Leu Gly Leu Ala Glu Arg Phe Asp Glu Ile Gly Arg

835 840 845

Val Tyr Gln Arg Val Asp Arg Met Leu Gly Arg Leu Val Lys Val Thr

850 855 860

Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu Tyr Leu Ile Ser Ala

865 870 875 880

Gly Ile Asp Pro Asp Ala Leu Glu Ala Asp Pro Gly Ala Tyr Asp Val

885 890 895

Pro Ala Ser Val Ile Arg Phe Leu Gln Gly Asp Leu Gly Thr Pro Pro

900 905 910

Gly Gly Trp Ala Glu Pro Phe Arg Ser Leu Ala Leu Ala Arg His Gly

915 920 925

Ala Ala Gln Ala Pro Ser Asp Ala Gly Pro Ala Val Asp His Ala Ala

930 935 940

Leu Glu Ala Thr Ser Ala Asn Arg Arg Asp Ala Leu Asn Ala Ser Gln

945 950 955 960

Phe Pro Ala Glu Ala Arg Glu Arg Lys Glu Ala Val Glu Arg Tyr Ala

965 970 975

Asp Val Ser Val Leu Pro Thr Arg Thr Phe Phe Tyr Gly Leu Asp Pro

980 985 990

Leu Glu Glu Ile Val Val Glu Leu Glu Pro Gly Val Arg Val Phe Leu

995 1000 1005

Asp Leu Asp Ala Val Gly Glu Ala Asp Asp Lys Gly Arg Arg Thr

1010 1015 1020

Val Val Met Arg Val Asn Gly Gln Leu Arg Ala Val Thr Ala His

1025 1030 1035

Asp Arg Ser Val Ala Pro Ala Asp Ala Pro Ala Glu Arg Ala Asp

1040 1045 1050

Pro Ser Ser Pro Gly Asp Ile Ala Ala Pro Leu Thr Gly Ile Val

1055 1060 1065

Thr Val Leu Val Ala Asp Gly Glu Gln Val Gln Ala Gly Ala Arg

1070 1075 1080

Leu Cys Ala Leu Glu Ala Met Lys Met Glu Ser Thr Val Thr Ala

1085 1090 1095

Pro Phe Ala Gly Arg Val Ala Arg Val Val Ala Ser Asn Gly Ala

1100 1105 1110

Arg Val Glu Pro Gly Asp Leu Leu Val Val Leu Glu Pro Asp Glu

1115 1120 1125

<210> 258

<211> 3456

<212> DNA

<213> Unknown

<220>

<223> pyc_9 sequence from an unknown bacterial species from an environmental sample

<400> 258

atgcgcaaac tgctcgtcgc caaccgctct gagattgcga tccgctgttt cagggcggcg 60

accgaactcg gcctccggac cgtcgccatc tacagccacg aggatcgatt ctcactacat 120

cgcttcaagg cagacgaagc attcctgatc ggtccgccgg gcggcggcga gccggtacgc 180

tcgtatctga acatccccgc gatcatcgcc attgctcatc agcagagcgt cgatgcgatt 240

catccgggct acggattcct cgccgagagc gccgacctcg cgcgcgcgtg cgaggcggcc 300

ggcattcaat tcgtcgggcc gacgcccgag catctcgata tgtttggcga caagaccgcc 360

gccaagcgtc tggcggtcgc cgccggcgtg ccgactgtgc ccggctccga gggcgctctc 420

caggaccttg gcgacatctc ggcggcggcc gccaaagtga gttatcccct gatgatcaag 480

gcgagcttcg gcggcggcgg ccgaggcatg cgcatcgtca ggacgcccga cgagctcgcc 540

aacaagctcg aggaggcgca gcgggaagcg ggcgcggcgt tcgggcgacc ggacgtgttt 600

ctcgagcggt acataccgcg cgccaagcac atcgaggtgc agatcctcgg tgatgcccac 660

ggcgctctgg ttcatctctg ggagcgcgac tgctccgtac agcgccggca tcagaaagtc 720

gtcgagctcg cgccgagcat caatgtggcc gagagcctgc gtcagcagat ctgcgacgct 780

gccgtgcgct tgtgccggtc agtcaactat cgcaatgccg gcaccgtcga gttcctgctg 840

gatgtcgagc gcggtgagtt ctacttcatc gaggtaaacc cgcgcattca ggtcgagcac 900

acggtcaccg aggtcgtcac ggggatcgat ctggtccgca gtcagatcct gatcgccgac 960

ggccatcgcc tgcatgaggc cccgctcaac gtccccgccc aggaggagat ccgcacgcgc 1020

ggtgtcgcca tgcagtgccg gatcacaacc gaagatccgg accgccattt catccccgac 1080

tacgggcgca tcaccacgta tcgatcggcc ggcggcttcg cggtgcggct cgatgggggc 1140

aacgggttcg gcggctccgt catcacgccc ttcttcgatt cgctgctcgt gaaagtgacg 1200

acgtggggcg gcacgctcga ggaatcggcg cagcgcgctg accgcgcgct ccgcgagttc 1260

cgtattcgcg gcgtcaagac gaacatcgcg tttctgttga atctgattgg tcacccgacg 1320

ttcaggtcgg gcgccgccac cacgaccttc atcgacgaga cgccggagct cttccggatt 1380

caggccccac gggatcgcgc cacgaaaatg ctcgggtact tgggcgacgt gatcgtgaac 1440

ggccgaccag acgtcaagaa cgcgtacgac ccccaacgca agctgccgac ccagaagcca 1500

ccggttgtgc cgccaatggc cgccccccca gctggaatgc gccagaagct tcaagagctt 1560

gggccggagc gcttcgcctc gtgggtgcgg ggtgaacgtc gtctgctcat gaccgacacg 1620

acgttccgcg atgcgcacca gtcgttgctc gcgacacgcg tacggacgta cgacatgctg 1680

gcgatcgccg acgccgtggc gcggttgatg ccggatctgt tcagcctgga gatgtgggga 1740

ggtgcgacgt tcgacacctc gatgcgcttc ctccaggaag acccgtgggc caggctcatt 1800

cagctgcgcg agcgcatccc gaacatcctg tttcagatgc tgctccgcgg cagcaacgcg 1860

gtcggctaca ccacctatcc cgacaacgtg gttcgcgcct tcgtgaagcg gagcgcggag 1920

gccggcatcg acgtattccg gatcttcgac gcgctgaact ccaccgacaa catgcgtgtc 1980

gcgatcgagg ccgtccgcga ggacacgacg gcgatctgcg aggccgccat ctgctacacg 2040

ggcgatatcc tcgatccgaa cagaaccaag tactccctgg attactacct gcggatcgcg 2100

aagaagctcg tggccatggg cacgcacatc ctctgcatca aggacatggc cgggttgtgc 2160

aaaccctacg cggcccacgc gctgattcag gcgctgcgcg aggaagtgga cgtcccgatc 2220

catttccaca cgcatgacac gagcggagtg aacgccggca gcattctgcg cgcatccgac 2280

gccggagtgg acattgccga cgctgcgatc gcctcgatga gcgggatgac gagccagccg 2340

agtctgaacg gcgtcgttgc agcgctccgc catacggagc gcgacaccgc actgaatcag 2400

gaagcgctcg acgagctgag ccgctactgg gccgacgtgc gggaactcta ttacccgttc 2460

gaagaagggt tgaaagcgcc tcaggccgat gtctatcaac acgagatgcc cgggggtcag 2520

ttcacaaacc ttcgtcagca ggcccgcaac cttggatttg gcgatcgctg gcccgagatc 2580

tccgcggcct atgcagaagc caatcggctg gccggcgata tcgtgaaggt gaccccttcc 2640

agcaaggtca tcggcgatct ggcgctcttc atggtcacga acaacctgac cgccaacgac 2700

atcctgacgt ccggcgcgcc cctgagtttc ccgcgcagcg tcgtcggcat gatgcaggga 2760

ctgctcggcc agcccgaggg cggctggccg aaagactttc aggagatcgt gctgcgttcg 2820

gcgcacgcaa cgcctatcac gggccgcccc gcggatactc tgccgccagc tgacttcgag 2880

gccaccgcgc aggagctcaa ggctaagaca ggacgcgatg tcagcgagca cgacgtgctc 2940

tcgtacttgc tgtatccgca ggtgtatgtc gagtacatcg agcactggca gaagtacggc 3000

gacacctcga cgattccgac agcgaatttc ttctacggat tgcagccagg ggaggagacc 3060

gcgatcgaga tcgagcgcgg caagacgttg ttcgtccgct ttctgacggc tggagaggtg 3120

cgcgaggacg gcacgcgaac ggtgttcttc gagctgaacg gacagccgcg agaggtgcgt 3180

gtcatcgatc gttcagtgac cgccctccgc aagagccatc ccaaggccga cgttgaaaat 3240

cccgatcatg tcgctgcgcc gatgccggga aagatctcgt cggtcgcggt acggccaggc 3300

caaagggtgc gagcgggtga ccgtctgctt tcgatcgaag cgatgaaaat ggagacagcc 3360

gtatatagtc cacgcgatgg ggccgtggca gaggtgctgg tcgtcacggg ccaggttgtc 3420

gaaacccggg atctgctgct tgtcctgact gagtga 3456

<210> 259

<211> 1151

<212> PRT

<213> Unknown

<220>

<223> pyc_9 sequence from an unknown bacterial species from an environmental sample

<400> 259

Met Arg Lys Leu Leu Val Ala Asn Arg Ser Glu Ile Ala Ile Arg Cys

1 5 10 15

Phe Arg Ala Ala Thr Glu Leu Gly Leu Arg Thr Val Ala Ile Tyr Ser

20 25 30

His Glu Asp Arg Phe Ser Leu His Arg Phe Lys Ala Asp Glu Ala Phe

35 40 45

Leu Ile Gly Pro Pro Gly Gly Gly Glu Pro Val Arg Ser Tyr Leu Asn

50 55 60

Ile Pro Ala Ile Ile Ala Ile Ala His Gln Gln Ser Val Asp Ala Ile

65 70 75 80

His Pro Gly Tyr Gly Phe Leu Ala Glu Ser Ala Asp Leu Ala Arg Ala

85 90 95

Cys Glu Ala Ala Gly Ile Gln Phe Val Gly Pro Thr Pro Glu His Leu

100 105 110

Asp Met Phe Gly Asp Lys Thr Ala Ala Lys Arg Leu Ala Val Ala Ala

115 120 125

Gly Val Pro Thr Val Pro Gly Ser Glu Gly Ala Leu Gln Asp Leu Gly

130 135 140

Asp Ile Ser Ala Ala Ala Ala Lys Val Ser Tyr Pro Leu Met Ile Lys

145 150 155 160

Ala Ser Phe Gly Gly Gly Gly Arg Gly Met Arg Ile Val Arg Thr Pro

165 170 175

Asp Glu Leu Ala Asn Lys Leu Glu Glu Ala Gln Arg Glu Ala Gly Ala

180 185 190

Ala Phe Gly Arg Pro Asp Val Phe Leu Glu Arg Tyr Ile Pro Arg Ala

195 200 205

Lys His Ile Glu Val Gln Ile Leu Gly Asp Ala His Gly Ala Leu Val

210 215 220

His Leu Trp Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val

225 230 235 240

Val Glu Leu Ala Pro Ser Ile Asn Val Ala Glu Ser Leu Arg Gln Gln

245 250 255

Ile Cys Asp Ala Ala Val Arg Leu Cys Arg Ser Val Asn Tyr Arg Asn

260 265 270

Ala Gly Thr Val Glu Phe Leu Leu Asp Val Glu Arg Gly Glu Phe Tyr

275 280 285

Phe Ile Glu Val Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu

290 295 300

Val Val Thr Gly Ile Asp Leu Val Arg Ser Gln Ile Leu Ile Ala Asp

305 310 315 320

Gly His Arg Leu His Glu Ala Pro Leu Asn Val Pro Ala Gln Glu Glu

325 330 335

Ile Arg Thr Arg Gly Val Ala Met Gln Cys Arg Ile Thr Thr Glu Asp

340 345 350

Pro Asp Arg His Phe Ile Pro Asp Tyr Gly Arg Ile Thr Thr Tyr Arg

355 360 365

Ser Ala Gly Gly Phe Ala Val Arg Leu Asp Gly Gly Asn Gly Phe Gly

370 375 380

Gly Ser Val Ile Thr Pro Phe Phe Asp Ser Leu Leu Val Lys Val Thr

385 390 395 400

Thr Trp Gly Gly Thr Leu Glu Glu Ser Ala Gln Arg Ala Asp Arg Ala

405 410 415

Leu Arg Glu Phe Arg Ile Arg Gly Val Lys Thr Asn Ile Ala Phe Leu

420 425 430

Leu Asn Leu Ile Gly His Pro Thr Phe Arg Ser Gly Ala Ala Thr Thr

435 440 445

Thr Phe Ile Asp Glu Thr Pro Glu Leu Phe Arg Ile Gln Ala Pro Arg

450 455 460

Asp Arg Ala Thr Lys Met Leu Gly Tyr Leu Gly Asp Val Ile Val Asn

465 470 475 480

Gly Arg Pro Asp Val Lys Asn Ala Tyr Asp Pro Gln Arg Lys Leu Pro

485 490 495

Thr Gln Lys Pro Pro Val Val Pro Pro Met Ala Ala Pro Pro Ala Gly

500 505 510

Met Arg Gln Lys Leu Gln Glu Leu Gly Pro Glu Arg Phe Ala Ser Trp

515 520 525

Val Arg Gly Glu Arg Arg Leu Leu Met Thr Asp Thr Thr Phe Arg Asp

530 535 540

Ala His Gln Ser Leu Leu Ala Thr Arg Val Arg Thr Tyr Asp Met Leu

545 550 555 560

Ala Ile Ala Asp Ala Val Ala Arg Leu Met Pro Asp Leu Phe Ser Leu

565 570 575

Glu Met Trp Gly Gly Ala Thr Phe Asp Thr Ser Met Arg Phe Leu Gln

580 585 590

Glu Asp Pro Trp Ala Arg Leu Ile Gln Leu Arg Glu Arg Ile Pro Asn

595 600 605

Ile Leu Phe Gln Met Leu Leu Arg Gly Ser Asn Ala Val Gly Tyr Thr

610 615 620

Thr Tyr Pro Asp Asn Val Val Arg Ala Phe Val Lys Arg Ser Ala Glu

625 630 635 640

Ala Gly Ile Asp Val Phe Arg Ile Phe Asp Ala Leu Asn Ser Thr Asp

645 650 655

Asn Met Arg Val Ala Ile Glu Ala Val Arg Glu Asp Thr Thr Ala Ile

660 665 670

Cys Glu Ala Ala Ile Cys Tyr Thr Gly Asp Ile Leu Asp Pro Asn Arg

675 680 685

Thr Lys Tyr Ser Leu Asp Tyr Tyr Leu Arg Ile Ala Lys Lys Leu Val

690 695 700

Ala Met Gly Thr His Ile Leu Cys Ile Lys Asp Met Ala Gly Leu Cys

705 710 715 720

Lys Pro Tyr Ala Ala His Ala Leu Ile Gln Ala Leu Arg Glu Glu Val

725 730 735

Asp Val Pro Ile His Phe His Thr His Asp Thr Ser Gly Val Asn Ala

740 745 750

Gly Ser Ile Leu Arg Ala Ser Asp Ala Gly Val Asp Ile Ala Asp Ala

755 760 765

Ala Ile Ala Ser Met Ser Gly Met Thr Ser Gln Pro Ser Leu Asn Gly

770 775 780

Val Val Ala Ala Leu Arg His Thr Glu Arg Asp Thr Ala Leu Asn Gln

785 790 795 800

Glu Ala Leu Asp Glu Leu Ser Arg Tyr Trp Ala Asp Val Arg Glu Leu

805 810 815

Tyr Tyr Pro Phe Glu Glu Gly Leu Lys Ala Pro Gln Ala Asp Val Tyr

820 825 830

Gln His Glu Met Pro Gly Gly Gln Phe Thr Asn Leu Arg Gln Gln Ala

835 840 845

Arg Asn Leu Gly Phe Gly Asp Arg Trp Pro Glu Ile Ser Ala Ala Tyr

850 855 860

Ala Glu Ala Asn Arg Leu Ala Gly Asp Ile Val Lys Val Thr Pro Ser

865 870 875 880

Ser Lys Val Ile Gly Asp Leu Ala Leu Phe Met Val Thr Asn Asn Leu

885 890 895

Thr Ala Asn Asp Ile Leu Thr Ser Gly Ala Pro Leu Ser Phe Pro Arg

900 905 910

Ser Val Val Gly Met Met Gln Gly Leu Leu Gly Gln Pro Glu Gly Gly

915 920 925

Trp Pro Lys Asp Phe Gln Glu Ile Val Leu Arg Ser Ala His Ala Thr

930 935 940

Pro Ile Thr Gly Arg Pro Ala Asp Thr Leu Pro Pro Ala Asp Phe Glu

945 950 955 960

Ala Thr Ala Gln Glu Leu Lys Ala Lys Thr Gly Arg Asp Val Ser Glu

965 970 975

His Asp Val Leu Ser Tyr Leu Leu Tyr Pro Gln Val Tyr Val Glu Tyr

980 985 990

Ile Glu His Trp Gln Lys Tyr Gly Asp Thr Ser Thr Ile Pro Thr Ala

995 1000 1005

Asn Phe Phe Tyr Gly Leu Gln Pro Gly Glu Glu Thr Ala Ile Glu

1010 1015 1020

Ile Glu Arg Gly Lys Thr Leu Phe Val Arg Phe Leu Thr Ala Gly

1025 1030 1035

Glu Val Arg Glu Asp Gly Thr Arg Thr Val Phe Phe Glu Leu Asn

1040 1045 1050

Gly Gln Pro Arg Glu Val Arg Val Ile Asp Arg Ser Val Thr Ala

1055 1060 1065

Leu Arg Lys Ser His Pro Lys Ala Asp Val Glu Asn Pro Asp His

1070 1075 1080

Val Ala Ala Pro Met Pro Gly Lys Ile Ser Ser Val Ala Val Arg

1085 1090 1095

Pro Gly Gln Arg Val Arg Ala Gly Asp Arg Leu Leu Ser Ile Glu

1100 1105 1110

Ala Met Lys Met Glu Thr Ala Val Tyr Ser Pro Arg Asp Gly Ala

1115 1120 1125

Val Ala Glu Val Leu Val Val Thr Gly Gln Val Val Glu Thr Arg

1130 1135 1140

Asp Leu Leu Leu Val Leu Thr Glu

1145 1150

<210> 260

<211> 3459

<212> DNA

<213> Unknown

<220>

<223> pyc_10 sequence from an unknown bacterial species from an environmental sample

<400> 260

atggccatgt ccatccgccc cccccctgcc ttcaagcgca tcctggtcgc caaccgcagc 60

gagatcgcca tccgggtctt tcgcgcctgc accgagctcg gcatccgcac cctcggcatc 120

ttcagcaagg aagaccgcac ggctctgcat cgctacaagg cggacgagac ctacgcgctg 180

gacgagcggc tcgagcccat caaggcctac ctcgacatcc ccggcatcat caacatcgcc 240

aaacgccacg gcgccgacgc catccacccc ggctacggct tcctctccga gaacgcggcc 300

ttcgcgcgcg cctgcgagga agcgggcatc gtcttcattg gcccgccccc cgcgctgctc 360

gacatgatgg gcgacaaaac cgccgcccgg aaacaagccc aagaggtggg cctgccggtg 420

gtgcccggca ccgacgcccc ggtgcccagc cccgaggacg cggtcaccat cgccggccgc 480

atcggctacc cggtgatcct caaagcctcc tacggcggcg gcggccgcgg catgcgcgtg 540

gcgcgcaccg acgccgagct gcgcgagttt ttcacccagg ccgaacgcga agccaccgcc 600

gctttcggcc gcggcgagat cttcctcgag aagttcatcg agagccccaa acacatcgag 660

gtgcagatcc tggccgacca gcacggccac accgtgcacc tttacgagcg cgactgctcg 720

gtccagcgcc gccaccagaa ggtggtggag atcgccccct ccccgcacct cgacgacaag 780

ctgcgcgcca ccctgtgcga cgaggccgtg cgcctgtgcc aggcggtggg ctacgtcaac 840

gccggcacgg tggagttcct ggtcgacaaa cacggcgccc attacttcat cgagatgaac 900

ccccgggtgc aggtggaaca caccgtcacc gagatggtca ccggcatcga catcgtcaaa 960

tcgcagatcc gcatcgccga gggccacccc ctggaaagcc ccctcatcgg catccccgcc 1020

caaagcgcgg tcagcctgcg cggctacgcc atccagtgcc gcataaccac cgaggatccc 1080

gccaacaact tcatccccga ctacggccgc atctcgcact accgctcggc ggccggcttc 1140

ggcatccgcc tcgacgccgg caccgccttt tccggtgccc tcatcacccc gttttacgac 1200

tcgctgctgg tcaaggtctg cgcctcgggc ctcaccttcg acgaggcctg cagcaagatg 1260

gaccgcgccc tggccgagtg gcgcgtgcgc ggcgtgcgca ccaacctgcc cttcctgcgc 1320

aacgtggtca atcacccgcg ctttcgcgcc ggcgacgcca ccaccacctt catcgccgac 1380

acacccgagc tgctggtgtt ccaagagcgc ttcgatcgcg ccaccaagat cctgcagttc 1440

atcggcgacg tcagcgtgaa cggcaacccc gaggtcaagg gcgcgcggcc cgagaagctg 1500

cgcaagccgg tggtgcccga acacgagccc caccaggccc ccccgcccgg cacccgcgat 1560

ctgtggaaga agctcggcac cgcggacttc tgcgcctggg tgcgcgatca gaagaagctg 1620

ctgctcaccg acaccacctt ccgcgacgcc caccagtcgc tcttcgccac ccgcctgcgc 1680

acgctggaga tgacccgggt ggcgcccgcg gtggcccagc acctctccgg cctgttctcg 1740

ctggagatgt ggggcggcgc caccttcgac gtggccatgc gctttttgca cgaggatccc 1800

tgggaccgcc tggccacctt gcgcaagcag atccccaacg tgctcttcca gatgctcctg 1860

cgcggggcga acgcggtcgg ctacaccaac taccccgaca atgtcgtgcg ccgcttcgtc 1920

gaggaagccg cccgcaccgg catggacatc ttccgcatct tcgattcgct caactggctg 1980

cccggcatcc tgcccgccat cgagatggtg gccagcgccg gcggcatcgc cgaagcctcc 2040

ctctgttaca ccggcaacat cgacgacccc aaacgttcga agtacgacct caagtactac 2100

gtcgatctgg ccaaggagct ggagaaacac ggcgcccaca tgctgggcat caaggacatg 2160

tcgggcctct tgcgcccctt cgccgcccgc cgcctgatcc gggccctgcg cgaggaagtg 2220

ggcctgccca tccacctgca cacccacgac accgccggca tccaagccgg ctcctatctg 2280

ttcgccgccg aggccggggt caacgtggtc gactgcgcgt tcggcgccat gtccagcctc 2340

acctcgcagc ccaacctcga gagcatcgtg gccgccctcg agcaccagga gcgcgacacc 2400

ggcctcgact tcacccggct gctggacttc acctactact gggaagaggt ccgcaactac 2460

tacgccgcct tcgaaagcgg catgaagtca ccttccgccg acgtctacgt gcacgagatc 2520

cccggcggtc agtacagcaa cctgcgcccg caggccgaat cagtgggggt gggggatcgc 2580

atccccgagc tcaagcgcat gtacgccgtg gtcaacgaga tgctggggga catcgtcaag 2640

gtgacgccca gctcgaagat ggtgggcgat ctggccctct tcatgttgac caacaatctc 2700

actccgccgg acctgatcga gcgcggacgc gagctgacct tccccgagtc ggtgatcggc 2760

tacttcgccg gcgaaatcgg ccagccgccc ggcggttttc ccccggcgct gtccgcggcc 2820

atcctcaagg gccgcacgcc cttcgccggc cgcccgggcg acaccctgcc gccggtggac 2880

ttcgacaaga cccggcgcga ggtggaaacc aaagtgggcc gtcccgccag cgagcaggac 2940

gtgctgtcct atctgatgta ccccaaggtc ttcaccgact ttgccagcta tgtgaaaaag 3000

tacggtgacg tctcggcggt gcccaccgat gtgatgttct atggcatgcg caagggcgac 3060

gagaccgagg tggagatcga acggggcaag accctgttca tccgcctcgg cgccatcagc 3120

gagcccaacg aacgcggcat gcgcaccttg ttcttcgagc tgaacgggca cccccgcgag 3180

gtcgaggtgc tggacaagaa gctgggcaag gcggtggcgg cccgccccaa ggccgacaag 3240

gacaacctgc accacctcgg ctcacccatg cccggcaccg tcatcgaggt gaaggccaag 3300

gcgggcgacg aggtcaagga aggcgacaag ctggtggtgc tggaagcgat gaagatggag 3360

atgacgctgg cctcgccgct ggccggcgtc atcaaggaaa tcaccgtcac cccgaaggac 3420

cgtgtcgata ccggggatct gctggtggta ttcaagtag 3459

<210> 261

<211> 1152

<212> PRT

<213> Unknown

<220>

<223> pyc_10 sequence from an unknown bacterial species from an environmental sample

<400> 261

Met Ala Met Ser Ile Arg Pro Pro Pro Ala Phe Lys Arg Ile Leu Val

1 5 10 15

Ala Asn Arg Ser Glu Ile Ala Ile Arg Val Phe Arg Ala Cys Thr Glu

20 25 30

Leu Gly Ile Arg Thr Leu Gly Ile Phe Ser Lys Glu Asp Arg Thr Ala

35 40 45

Leu His Arg Tyr Lys Ala Asp Glu Thr Tyr Ala Leu Asp Glu Arg Leu

50 55 60

Glu Pro Ile Lys Ala Tyr Leu Asp Ile Pro Gly Ile Ile Asn Ile Ala

65 70 75 80

Lys Arg His Gly Ala Asp Ala Ile His Pro Gly Tyr Gly Phe Leu Ser

85 90 95

Glu Asn Ala Ala Phe Ala Arg Ala Cys Glu Glu Ala Gly Ile Val Phe

100 105 110

Ile Gly Pro Pro Pro Ala Leu Leu Asp Met Met Gly Asp Lys Thr Ala

115 120 125

Ala Arg Lys Gln Ala Gln Glu Val Gly Leu Pro Val Val Pro Gly Thr

130 135 140

Asp Ala Pro Val Pro Ser Pro Glu Asp Ala Val Thr Ile Ala Gly Arg

145 150 155 160

Ile Gly Tyr Pro Val Ile Leu Lys Ala Ser Tyr Gly Gly Gly Gly Arg

165 170 175

Gly Met Arg Val Ala Arg Thr Asp Ala Glu Leu Arg Glu Phe Phe Thr

180 185 190

Gln Ala Glu Arg Glu Ala Thr Ala Ala Phe Gly Arg Gly Glu Ile Phe

195 200 205

Leu Glu Lys Phe Ile Glu Ser Pro Lys His Ile Glu Val Gln Ile Leu

210 215 220

Ala Asp Gln His Gly His Thr Val His Leu Tyr Glu Arg Asp Cys Ser

225 230 235 240

Val Gln Arg Arg His Gln Lys Val Val Glu Ile Ala Pro Ser Pro His

245 250 255

Leu Asp Asp Lys Leu Arg Ala Thr Leu Cys Asp Glu Ala Val Arg Leu

260 265 270

Cys Gln Ala Val Gly Tyr Val Asn Ala Gly Thr Val Glu Phe Leu Val

275 280 285

Asp Lys His Gly Ala His Tyr Phe Ile Glu Met Asn Pro Arg Val Gln

290 295 300

Val Glu His Thr Val Thr Glu Met Val Thr Gly Ile Asp Ile Val Lys

305 310 315 320

Ser Gln Ile Arg Ile Ala Glu Gly His Pro Leu Glu Ser Pro Leu Ile

325 330 335

Gly Ile Pro Ala Gln Ser Ala Val Ser Leu Arg Gly Tyr Ala Ile Gln

340 345 350

Cys Arg Ile Thr Thr Glu Asp Pro Ala Asn Asn Phe Ile Pro Asp Tyr

355 360 365

Gly Arg Ile Ser His Tyr Arg Ser Ala Ala Gly Phe Gly Ile Arg Leu

370 375 380

Asp Ala Gly Thr Ala Phe Ser Gly Ala Leu Ile Thr Pro Phe Tyr Asp

385 390 395 400

Ser Leu Leu Val Lys Val Cys Ala Ser Gly Leu Thr Phe Asp Glu Ala

405 410 415

Cys Ser Lys Met Asp Arg Ala Leu Ala Glu Trp Arg Val Arg Gly Val

420 425 430

Arg Thr Asn Leu Pro Phe Leu Arg Asn Val Val Asn His Pro Arg Phe

435 440 445

Arg Ala Gly Asp Ala Thr Thr Thr Phe Ile Ala Asp Thr Pro Glu Leu

450 455 460

Leu Val Phe Gln Glu Arg Phe Asp Arg Ala Thr Lys Ile Leu Gln Phe

465 470 475 480

Ile Gly Asp Val Ser Val Asn Gly Asn Pro Glu Val Lys Gly Ala Arg

485 490 495

Pro Glu Lys Leu Arg Lys Pro Val Val Pro Glu His Glu Pro His Gln

500 505 510

Ala Pro Pro Pro Gly Thr Arg Asp Leu Trp Lys Lys Leu Gly Thr Ala

515 520 525

Asp Phe Cys Ala Trp Val Arg Asp Gln Lys Lys Leu Leu Leu Thr Asp

530 535 540

Thr Thr Phe Arg Asp Ala His Gln Ser Leu Phe Ala Thr Arg Leu Arg

545 550 555 560

Thr Leu Glu Met Thr Arg Val Ala Pro Ala Val Ala Gln His Leu Ser

565 570 575

Gly Leu Phe Ser Leu Glu Met Trp Gly Gly Ala Thr Phe Asp Val Ala

580 585 590

Met Arg Phe Leu His Glu Asp Pro Trp Asp Arg Leu Ala Thr Leu Arg

595 600 605

Lys Gln Ile Pro Asn Val Leu Phe Gln Met Leu Leu Arg Gly Ala Asn

610 615 620

Ala Val Gly Tyr Thr Asn Tyr Pro Asp Asn Val Val Arg Arg Phe Val

625 630 635 640

Glu Glu Ala Ala Arg Thr Gly Met Asp Ile Phe Arg Ile Phe Asp Ser

645 650 655

Leu Asn Trp Leu Pro Gly Ile Leu Pro Ala Ile Glu Met Val Ala Ser

660 665 670

Ala Gly Gly Ile Ala Glu Ala Ser Leu Cys Tyr Thr Gly Asn Ile Asp

675 680 685

Asp Pro Lys Arg Ser Lys Tyr Asp Leu Lys Tyr Tyr Val Asp Leu Ala

690 695 700

Lys Glu Leu Glu Lys His Gly Ala His Met Leu Gly Ile Lys Asp Met

705 710 715 720

Ser Gly Leu Leu Arg Pro Phe Ala Ala Arg Arg Leu Ile Arg Ala Leu

725 730 735

Arg Glu Glu Val Gly Leu Pro Ile His Leu His Thr His Asp Thr Ala

740 745 750

Gly Ile Gln Ala Gly Ser Tyr Leu Phe Ala Ala Glu Ala Gly Val Asn

755 760 765

Val Val Asp Cys Ala Phe Gly Ala Met Ser Ser Leu Thr Ser Gln Pro

770 775 780

Asn Leu Glu Ser Ile Val Ala Ala Leu Glu His Gln Glu Arg Asp Thr

785 790 795 800

Gly Leu Asp Phe Thr Arg Leu Leu Asp Phe Thr Tyr Tyr Trp Glu Glu

805 810 815

Val Arg Asn Tyr Tyr Ala Ala Phe Glu Ser Gly Met Lys Ser Pro Ser

820 825 830

Ala Asp Val Tyr Val His Glu Ile Pro Gly Gly Gln Tyr Ser Asn Leu

835 840 845

Arg Pro Gln Ala Glu Ser Val Gly Val Gly Asp Arg Ile Pro Glu Leu

850 855 860

Lys Arg Met Tyr Ala Val Val Asn Glu Met Leu Gly Asp Ile Val Lys

865 870 875 880

Val Thr Pro Ser Ser Lys Met Val Gly Asp Leu Ala Leu Phe Met Leu

885 890 895

Thr Asn Asn Leu Thr Pro Pro Asp Leu Ile Glu Arg Gly Arg Glu Leu

900 905 910

Thr Phe Pro Glu Ser Val Ile Gly Tyr Phe Ala Gly Glu Ile Gly Gln

915 920 925

Pro Pro Gly Gly Phe Pro Pro Ala Leu Ser Ala Ala Ile Leu Lys Gly

930 935 940

Arg Thr Pro Phe Ala Gly Arg Pro Gly Asp Thr Leu Pro Pro Val Asp

945 950 955 960

Phe Asp Lys Thr Arg Arg Glu Val Glu Thr Lys Val Gly Arg Pro Ala

965 970 975

Ser Glu Gln Asp Val Leu Ser Tyr Leu Met Tyr Pro Lys Val Phe Thr

980 985 990

Asp Phe Ala Ser Tyr Val Lys Lys Tyr Gly Asp Val Ser Ala Val Pro

995 1000 1005

Thr Asp Val Met Phe Tyr Gly Met Arg Lys Gly Asp Glu Thr Glu

1010 1015 1020

Val Glu Ile Glu Arg Gly Lys Thr Leu Phe Ile Arg Leu Gly Ala

1025 1030 1035

Ile Ser Glu Pro Asn Glu Arg Gly Met Arg Thr Leu Phe Phe Glu

1040 1045 1050

Leu Asn Gly His Pro Arg Glu Val Glu Val Leu Asp Lys Lys Leu

1055 1060 1065

Gly Lys Ala Val Ala Ala Arg Pro Lys Ala Asp Lys Asp Asn Leu

1070 1075 1080

His His Leu Gly Ser Pro Met Pro Gly Thr Val Ile Glu Val Lys

1085 1090 1095

Ala Lys Ala Gly Asp Glu Val Lys Glu Gly Asp Lys Leu Val Val

1100 1105 1110

Leu Glu Ala Met Lys Met Glu Met Thr Leu Ala Ser Pro Leu Ala

1115 1120 1125

Gly Val Ile Lys Glu Ile Thr Val Thr Pro Lys Asp Arg Val Asp

1130 1135 1140

Thr Gly Asp Leu Leu Val Val Phe Lys

1145 1150

<210> 262

<211> 3393

<212> DNA

<213> Unknown

<220>

<223> pyc_11 sequence from an unknown bacterial species from an environmental sample

<400> 262

atgccggtca gggagtcgct cgtgcgcaag gtcctcgtcg ccaaccgcag tgagatcgcg 60

gtccgcgtca tgcgcgcggc ccacgagatg gacctgctga ccgtcggggt ctacacgccc 120

gaggaccgcg gggcgctgca ccgcaccaag gcgggcgagg cctaccagct gggcgagccg 180

ggccaccccg tgcgcggcta cctcgacgtc gaggcgctgc tcgaggtcgc ccgccgctcc 240

ggcgccgacg cgctgcaccc cggctacggc ttcctgtccg agagcgccgc gctcgccgac 300

gcctgcgcct cggctggcat caccttcgtc gggccgccgg cggacgtgct gcgcctgacc 360

ggtgacaagg tcaccgcccg ccaggcggcg gtggccgcgg gcctgccggt gctgcgcgcc 420

tccgacccgc tgccggacgg gtccggggcg atcgaggcgg ccgaggcggt gggcttcccg 480

ctgttcgtca aggcggcggc cggcggcggc ggccgcggcc tgcggctcgt gcggacgccc 540

gaggagctcg ccgacgcggc gctgtcggcg tcgagggagg cggccgcggc cttcggcgac 600

gggacgatct tcctcgagca ggcggtcgag cggccgcgcc acatcgaggt gcaggtgctc 660

ggcgacacgc acggctcggt ggtccacctg ttcgagcgcg actgctcggt gcagcggcgc 720

caccagaagg tcgtcgagct cgcgccggcg cccgacctgc cggaggccac gcgcacgggc 780

ctgcacgagg cggccctcgc gttcgcccgt tcggtcggct acgtcaacgc cgggacggtg 840

gagttcctcg tgggcgccga cggggcgttc acgttcatgg agatgaaccc ccgcatccag 900

gtcgagcaca ccgtcaccga ggaggtcacc ggcgtcgacc tcgtcggcgc ccagctgcgg 960

gtggccgcgg gggagtcgct cgccgacatc ggcatcgtgc aggagcgcct ggcggtccgc 1020

ggctgcgccg tccagtgccg catcaccacc gaggacccgg ccaacggctt ccgcccggac 1080

accggcacga tcgcgaccta ccagtcgccc ggcggcccgg gcgtgcgcct cgacggcgcg 1140

gtctacgccg gcgccgaggt cacgccgtac ttcgactccc tgctcgtcaa gctcacgacc 1200

cgcgcccccg acctgcgcac cgccgccaac cgcacccggc gggcgctgcg ggagttccgc 1260

gtgcgcgggg tgaagaccaa cgtcgagttc ctctaccgcc tcatggagga cgaggacttc 1320

ctgtccggcg cggtgccgac gtcgttcctc gccgagcacc cgcacctcac ggacgcgccc 1380

gcggtcaccg accggacgac ccggatgctc ggcgcgctgg ccgacgcgac ggtgaacggg 1440

ctgcagcgcc cgtcgcgccc gctgctcgac ccggtcagca agctccccga cctgcccgcc 1500

gccccgccgg tgcagggctc gcggcgcctg ctcgacgagg tcggcccgga gcgctgggcg 1560

caggccctgc gcgaccgcac gtccctcgcg gtgaccgaca cgacgctgcg cgacgcccac 1620

cagtcgctgc tggccacccg gctgcggacc accgacgtcc tcggcgccgc gccgaggacg 1680

gcggagctgc tgcccggcct gctgtcgctc gaggcgtggg gcggcgcgac gtacgacgtg 1740

gcgctgcgct tcctgcacga ggacccctgg cagcggctcg ccgcgctgcg cgaggccgcg 1800

cccgacgtct gcctgcagat gctgctgcgc ggccgcaacg ccgtcggcta cacgccctac 1860

ccggaccggg tcgtgcaggt cttcgtcgcc gaggcggcgg ccaccggcgt cgacgtcttc 1920

cgcgtgttcg acgccctgaa cgacctcgag cagatgcgtc ccgcgctcga cgccgtccgc 1980

gaggccggca aggtcgcgga gggcacgctc tgctacaccg gcgacctgac gaacccgggc 2040

gagcggctct acacgctcga ctactacctg cgcctcgccg aggggctcgt cgaggcgggc 2100

gcgcacgtgc tggccgtcaa ggacatggcc gggctgctgc gcccgcgcgc cgccgacacg 2160

ctggtccggg cgctgcgcag ccgcttcgag ctgcccgtgc acctgcacac ccacgacacg 2220

accggcgggc agctcgcgac gctgctcgcc gccagcgacg cgggggtcga cgccgtcgac 2280

gccgccatgg cgccgatgtc gggcggcacc agccaggtca acctgtcggc cctggtcgcc 2340

gcgaccgacc acaccgagcg gtccacgggc ctgtcgctgg ccgcgctgtc ggcgctcgag 2400

ccgtactggg aggcggtgcg cgacctctac gcgccgttcg aggcgggcct gcgggcgccg 2460

accggcaccg tctaccgcca cgagatcccg ggcggccagc tcaccaacct gcgccagcag 2520

gcgatcgcgc tcggcctcgg cgaccgctgg gaggacgtcc aggagctgta cgccgtcgcc 2580

aacgagctgc tcggcaagcc gatcaaggtg acgccgacga gcaaggtcgt cggcgacctg 2640

gcgatcttcc tggccagcgg cgacgtcgac gtcgagcggc tgcgcgagga cccgggggcg 2700

tacgacctgc cggccagcgt gctcggctac ctcgccggcg agctgggcac gccgcccgcc 2760

ggcttccccg agccgttccg ctccaaggcg gtcgcgggcc gtgcggagga gctgccggag 2820

gtcgtcctcg acccggccga cgacgcggcc ctcgacggcg agcagcggcg cgacgtgctg 2880

tcgcggctgc tgttctccgg cccgtggaag gactaccagt cggcgctcgc cgagcacggc 2940

gacgtctcga tgatccccac ggaggcgttc ttctacggcc tgcagcccgg cgggacggtc 3000

accgtctgcc tcgaggccgg ggtcgaggtg ctcgtcgagc tgcagacggt cggcgagctg 3060

tccaaggacg gcctgcggac cctccacgtg cgcgtgaacg gccagccccg gccggtgcag 3120

gtgcgcgacc gctcggtcaa ggtcgccgac acggccgcgc gccgcgccga ccccggcgac 3180

ccccgccacg tcggcgcggc cctgcccggg ctcgtcctgc cgaaggtcgc cgtcggcgac 3240

cgggtgacca agggacaggc gctggccgtc gtcgaggcga tgaagatgga gtcgaccgtc 3300

tcgagccccg ccgacgggac cgtggccgag gtcgccgtga cggccggcac caacgtcgag 3360

gtgggcgacc tgctcgtggt cctgggcgac tga 3393

<210> 263

<211> 1130

<212> PRT

<213> Unknown

<220>

<223> pyc_11 sequence from an unknown bacterial species from an environmental sample

<400> 263

Met Pro Val Arg Glu Ser Leu Val Arg Lys Val Leu Val Ala Asn Arg

1 5 10 15

Ser Glu Ile Ala Val Arg Val Met Arg Ala Ala His Glu Met Asp Leu

20 25 30

Leu Thr Val Gly Val Tyr Thr Pro Glu Asp Arg Gly Ala Leu His Arg

35 40 45

Thr Lys Ala Gly Glu Ala Tyr Gln Leu Gly Glu Pro Gly His Pro Val

50 55 60

Arg Gly Tyr Leu Asp Val Glu Ala Leu Leu Glu Val Ala Arg Arg Ser

65 70 75 80

Gly Ala Asp Ala Leu His Pro Gly Tyr Gly Phe Leu Ser Glu Ser Ala

85 90 95

Ala Leu Ala Asp Ala Cys Ala Ser Ala Gly Ile Thr Phe Val Gly Pro

100 105 110

Pro Ala Asp Val Leu Arg Leu Thr Gly Asp Lys Val Thr Ala Arg Gln

115 120 125

Ala Ala Val Ala Ala Gly Leu Pro Val Leu Arg Ala Ser Asp Pro Leu

130 135 140

Pro Asp Gly Ser Gly Ala Ile Glu Ala Ala Glu Ala Val Gly Phe Pro

145 150 155 160

Leu Phe Val Lys Ala Ala Ala Gly Gly Gly Gly Arg Gly Leu Arg Leu

165 170 175

Val Arg Thr Pro Glu Glu Leu Ala Asp Ala Ala Leu Ser Ala Ser Arg

180 185 190

Glu Ala Ala Ala Ala Phe Gly Asp Gly Thr Ile Phe Leu Glu Gln Ala

195 200 205

Val Glu Arg Pro Arg His Ile Glu Val Gln Val Leu Gly Asp Thr His

210 215 220

Gly Ser Val Val His Leu Phe Glu Arg Asp Cys Ser Val Gln Arg Arg

225 230 235 240

His Gln Lys Val Val Glu Leu Ala Pro Ala Pro Asp Leu Pro Glu Ala

245 250 255

Thr Arg Thr Gly Leu His Glu Ala Ala Leu Ala Phe Ala Arg Ser Val

260 265 270

Gly Tyr Val Asn Ala Gly Thr Val Glu Phe Leu Val Gly Ala Asp Gly

275 280 285

Ala Phe Thr Phe Met Glu Met Asn Pro Arg Ile Gln Val Glu His Thr

290 295 300

Val Thr Glu Glu Val Thr Gly Val Asp Leu Val Gly Ala Gln Leu Arg

305 310 315 320

Val Ala Ala Gly Glu Ser Leu Ala Asp Ile Gly Ile Val Gln Glu Arg

325 330 335

Leu Ala Val Arg Gly Cys Ala Val Gln Cys Arg Ile Thr Thr Glu Asp

340 345 350

Pro Ala Asn Gly Phe Arg Pro Asp Thr Gly Thr Ile Ala Thr Tyr Gln

355 360 365

Ser Pro Gly Gly Pro Gly Val Arg Leu Asp Gly Ala Val Tyr Ala Gly

370 375 380

Ala Glu Val Thr Pro Tyr Phe Asp Ser Leu Leu Val Lys Leu Thr Thr

385 390 395 400

Arg Ala Pro Asp Leu Arg Thr Ala Ala Asn Arg Thr Arg Arg Ala Leu

405 410 415

Arg Glu Phe Arg Val Arg Gly Val Lys Thr Asn Val Glu Phe Leu Tyr

420 425 430

Arg Leu Met Glu Asp Glu Asp Phe Leu Ser Gly Ala Val Pro Thr Ser

435 440 445

Phe Leu Ala Glu His Pro His Leu Thr Asp Ala Pro Ala Val Thr Asp

450 455 460

Arg Thr Thr Arg Met Leu Gly Ala Leu Ala Asp Ala Thr Val Asn Gly

465 470 475 480

Leu Gln Arg Pro Ser Arg Pro Leu Leu Asp Pro Val Ser Lys Leu Pro

485 490 495

Asp Leu Pro Ala Ala Pro Pro Val Gln Gly Ser Arg Arg Leu Leu Asp

500 505 510

Glu Val Gly Pro Glu Arg Trp Ala Gln Ala Leu Arg Asp Arg Thr Ser

515 520 525

Leu Ala Val Thr Asp Thr Thr Leu Arg Asp Ala His Gln Ser Leu Leu

530 535 540

Ala Thr Arg Leu Arg Thr Thr Asp Val Leu Gly Ala Ala Pro Arg Thr

545 550 555 560

Ala Glu Leu Leu Pro Gly Leu Leu Ser Leu Glu Ala Trp Gly Gly Ala

565 570 575

Thr Tyr Asp Val Ala Leu Arg Phe Leu His Glu Asp Pro Trp Gln Arg

580 585 590

Leu Ala Ala Leu Arg Glu Ala Ala Pro Asp Val Cys Leu Gln Met Leu

595 600 605

Leu Arg Gly Arg Asn Ala Val Gly Tyr Thr Pro Tyr Pro Asp Arg Val

610 615 620

Val Gln Val Phe Val Ala Glu Ala Ala Ala Thr Gly Val Asp Val Phe

625 630 635 640

Arg Val Phe Asp Ala Leu Asn Asp Leu Glu Gln Met Arg Pro Ala Leu

645 650 655

Asp Ala Val Arg Glu Ala Gly Lys Val Ala Glu Gly Thr Leu Cys Tyr

660 665 670

Thr Gly Asp Leu Thr Asn Pro Gly Glu Arg Leu Tyr Thr Leu Asp Tyr

675 680 685

Tyr Leu Arg Leu Ala Glu Gly Leu Val Glu Ala Gly Ala His Val Leu

690 695 700

Ala Val Lys Asp Met Ala Gly Leu Leu Arg Pro Arg Ala Ala Asp Thr

705 710 715 720

Leu Val Arg Ala Leu Arg Ser Arg Phe Glu Leu Pro Val His Leu His

725 730 735

Thr His Asp Thr Thr Gly Gly Gln Leu Ala Thr Leu Leu Ala Ala Ser

740 745 750

Asp Ala Gly Val Asp Ala Val Asp Ala Ala Met Ala Pro Met Ser Gly

755 760 765

Gly Thr Ser Gln Val Asn Leu Ser Ala Leu Val Ala Ala Thr Asp His

770 775 780

Thr Glu Arg Ser Thr Gly Leu Ser Leu Ala Ala Leu Ser Ala Leu Glu

785 790 795 800

Pro Tyr Trp Glu Ala Val Arg Asp Leu Tyr Ala Pro Phe Glu Ala Gly

805 810 815

Leu Arg Ala Pro Thr Gly Thr Val Tyr Arg His Glu Ile Pro Gly Gly

820 825 830

Gln Leu Thr Asn Leu Arg Gln Gln Ala Ile Ala Leu Gly Leu Gly Asp

835 840 845

Arg Trp Glu Asp Val Gln Glu Leu Tyr Ala Val Ala Asn Glu Leu Leu

850 855 860

Gly Lys Pro Ile Lys Val Thr Pro Thr Ser Lys Val Val Gly Asp Leu

865 870 875 880

Ala Ile Phe Leu Ala Ser Gly Asp Val Asp Val Glu Arg Leu Arg Glu

885 890 895

Asp Pro Gly Ala Tyr Asp Leu Pro Ala Ser Val Leu Gly Tyr Leu Ala

900 905 910

Gly Glu Leu Gly Thr Pro Pro Ala Gly Phe Pro Glu Pro Phe Arg Ser

915 920 925

Lys Ala Val Ala Gly Arg Ala Glu Glu Leu Pro Glu Val Val Leu Asp

930 935 940

Pro Ala Asp Asp Ala Ala Leu Asp Gly Glu Gln Arg Arg Asp Val Leu

945 950 955 960

Ser Arg Leu Leu Phe Ser Gly Pro Trp Lys Asp Tyr Gln Ser Ala Leu

965 970 975

Ala Glu His Gly Asp Val Ser Met Ile Pro Thr Glu Ala Phe Phe Tyr

980 985 990

Gly Leu Gln Pro Gly Gly Thr Val Thr Val Cys Leu Glu Ala Gly Val

995 1000 1005

Glu Val Leu Val Glu Leu Gln Thr Val Gly Glu Leu Ser Lys Asp

1010 1015 1020

Gly Leu Arg Thr Leu His Val Arg Val Asn Gly Gln Pro Arg Pro

1025 1030 1035

Val Gln Val Arg Asp Arg Ser Val Lys Val Ala Asp Thr Ala Ala

1040 1045 1050

Arg Arg Ala Asp Pro Gly Asp Pro Arg His Val Gly Ala Ala Leu

1055 1060 1065

Pro Gly Leu Val Leu Pro Lys Val Ala Val Gly Asp Arg Val Thr

1070 1075 1080

Lys Gly Gln Ala Leu Ala Val Val Glu Ala Met Lys Met Glu Ser

1085 1090 1095

Thr Val Ser Ser Pro Ala Asp Gly Thr Val Ala Glu Val Ala Val

1100 1105 1110

Thr Ala Gly Thr Asn Val Glu Val Gly Asp Leu Leu Val Val Leu

1115 1120 1125

Gly Asp

1130

<210> 264

<211> 3384

<212> DNA

<213> Unknown

<220>

<223> pyc_12 sequence from an unknown bacterial species from an environmental sample

<400> 264

atgatcgaga aggtgctggt cgccaatcgc ggcgagatcg cgacccgcgc cttccgagcg 60

gcgaatgagc ttcggatccg cagcgtggcg ttgtacgcgc cggaggatcg cgactcggtc 120

catcgcgtaa aggccgacga ggcgtacgag atcggtgcgc cgggtcatcc ggtcagcacc 180

tacctggacc ctgacatcgc ggtcgcgctc gcgctgcggg tcggcgccga cgcgatctac 240

ccgggctacg gcttcatgtc cgaaaacccg gagctcgctc gagcctgcgt cgctgccgga 300

ttggtgttcg tcgggccgcc accggaggtg ctcggtctcg ccggcgacaa gacgcgcgcg 360

cgaacggcgg cgcgcgaggc gggcgtcccg gtgctcgacg cttcagagcc ggtcgagaac 420

gccgaagctg cgctggcggc agccgagaag atcggcttcc cggtgttcgt gaaggcgtcg 480

cacggcggcg gcgggcgcgg catgcgcctc gtgaccgatc cggcgcgcct cgcggcgtcg 540

ctggaggagg cgcgcaacga ggcggaggcg gcgttcggcg acggcacggt ctacctcgag 600

caggcgctcg tgcgcccgcg ccacatcgag gttcagctgc tggccgacgc gaccggcgac 660

gtcgtgcatc tctacgagcg cgactgctca ttgcagcgcc ggcatcagaa ggtgatcgag 720

atcacaccgg caccgaacct cgagccggag ctgcgcgacc gcatctgcgc cgacgccgtc 780

cgcttcgccc gccacgtggg gtacgtcaac gcgggcacgg tcgagttcct gctcgacgag 840

gccaacgggc gctacgcgtt catcgagatg aaccctcgca ttcaggtcga gcacacggtc 900

accgaggaga cgaccgacat cgacctcgtg cgcgcacaac tgcagatagc cggcggcgag 960

acgctcgccg gactcggcgt gcgccaggac gacatccgcc agcgcggctt cgcgctgcag 1020

tgccgggtga cgacggagga ccccgccaac gggttccgcc ccgactccgg ccgcatcacc 1080

gcgtaccgat cccccggagg ggcgggcgtg cggctcgacg agggctcagc cttcgtcggc 1140

gccgaggtct cgccgttctt cgacccgctg ctggtgaaga tctccgcgcg cgggcgtgat 1200

ctgcacagcg cggtctcacg cgcgcggcgc gccgtcgccg agctgcgagt acgcggtgtc 1260

aagaccaacc agggcttcct gctcgcgctg ctcaacgacc ccgacgtcct cgctgggcgc 1320

acgcacacca cgttcgtcga cgagcgtccc gacctctcga ccgccggccc cggcggcgac 1380

cgcgccagcc gactgctcaa acgcctcgcc gaggtcacgg tcaaccacga gcctgccagc 1440

tccgccctcg ccggcgatcc gcgcgcgaag ctcccagcgc ccccgacggg cgcgccgccc 1500

gccgggtcgc gccagaaact gctcgacctc ggcccgtcca cgttcgccgc ggcgctgcgc 1560

ggacagcagg cgatcgcgct caccgacacc acgctccgtg acgcccacca gtcactgttc 1620

gccacgcgta tgcgcacgcg cgacatgctc cccgtagcac cgcacctcgc gcacgaactg 1680

ccgcagctgc tgtcgcttga ggtgtggggc ggcgcaacct tcgatgtcgc gctgcgcttc 1740

ctgcacgagg acccgtggga ccggctcgtg cagctacgcg aactggtccc caacgtgtgc 1800

ctgcagatgc tcctgcgcgg ccagaacctg ctcgcctact cccgctttcc caccagggtg 1860

gtgcgtgcat tcgtcgccga ggcggtcgag gccggcatcg acgtcttccg catcttcgac 1920

gcgctcaacg acatcgaagg catgcgctcc gcgatcgagg caacgctcga gacgcccgcg 1980

ctagccgaag gaaccctgtg ttacacgggc gacctgagcg acccgcgcga gcggctctac 2040

accctcgact attacctgcg cctcgcccag cagctggtcg acgccggtgt acacatgctc 2100

gccatcaagg acatggccgg gctgctgagg gcacccgccg cacacacgct cgtgaccgcg 2160

ctgcaccgcg agttcgaact gccggtgcac ctgcacacac acgacaccgc cggcgggcaa 2220

ctcgccacct acctcgccgc catcgaggcc ggcgtcgacg ccgtcgatgg cgccgccgcg 2280

ccgatggcgg gcatgaccag ccagccctcc ctggcggcga tcgtcgccgc caccgcgacg 2340

accgagcgcg actcgggcat cgcgctcgac gcgctcctgg accaagagcc ctactgggag 2400

tcggtgcgca cgctctacgg cgcgttcgag accggcctga aggcgccgac tggtcgcgtc 2460

taccgccacg agatccccgg tggccagctc tccaacctgc gccaacaagc ggacgcggtc 2520

ggcctcacgg gccgcttcga cgagatcgaa cgcgcctacg agcgagccaa ccgactgctc 2580

ggcaacgtgg tcaaggtcac gccctcgagc aaggtcgtcg gcgacctcgc cctgtttgcg 2640

gtctcagccg gcatcgactt cgacgagctc gaacgccgac ccggctcctt cgacctcccc 2700

gactccgtca tcgacttcct gcgcggcggg atcggcaccc cacccggcgg cttcccacaa 2760

cccttcaccg acctggcact cgccggtcgc cccgcgccgc cggcacccac ggagctcgac 2820

cccgagctcg ccgaccggct acagcaaccc ggcgcacctc gtcgcggggc gctcgccgag 2880

atcctcttcc ccgggccggc gtccgacttc gccgccgccc gcgccacgtt cggcgacgtc 2940

tcgctgatcc ccacgcccgc gttcttccgc ggcctgcacg aggacgaaga actggcgatc 3000

gacctcgcac ccggcgtacg cctgctcttc gaactcgaag ccatcggcga acccgacaag 3060

cgcggcatgc ggaccgtcct ggtacgcgtc aacggccagc tgcgccccgt cgaagtgcgc 3120

gaccactccg tcaagaccac cggtgtgcag atcgaacgcg cggaccccaa acgcccaggc 3180

cacgtcccgg cgccagtgac cgggatcgtg tccctgctcg tcgccgcggg cgacaccgtg 3240

tccgagggcg acccgatcgc aacgctcgaa gccatgaaga tggagtccac gatctccgcg 3300

ccgctcgccg gccgcgtgca acgcctcgcc gtcaccacgg gtgcgcgcct ggaacagggg 3360

gacctcctgc tcgtcatcga ctag 3384

<210> 265

<211> 1127

<212> PRT

<213> Unknown

<220>

<223> pyc_12 sequence from an unknown bacterial species from an environmental sample

<400> 265

Met Ile Glu Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Thr Arg

1 5 10 15

Ala Phe Arg Ala Ala Asn Glu Leu Arg Ile Arg Ser Val Ala Leu Tyr

20 25 30

Ala Pro Glu Asp Arg Asp Ser Val His Arg Val Lys Ala Asp Glu Ala

35 40 45

Tyr Glu Ile Gly Ala Pro Gly His Pro Val Ser Thr Tyr Leu Asp Pro

50 55 60

Asp Ile Ala Val Ala Leu Ala Leu Arg Val Gly Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Met Ser Glu Asn Pro Glu Leu Ala Arg Ala Cys

85 90 95

Val Ala Ala Gly Leu Val Phe Val Gly Pro Pro Pro Glu Val Leu Gly

100 105 110

Leu Ala Gly Asp Lys Thr Arg Ala Arg Thr Ala Ala Arg Glu Ala Gly

115 120 125

Val Pro Val Leu Asp Ala Ser Glu Pro Val Glu Asn Ala Glu Ala Ala

130 135 140

Leu Ala Ala Ala Glu Lys Ile Gly Phe Pro Val Phe Val Lys Ala Ser

145 150 155 160

His Gly Gly Gly Gly Arg Gly Met Arg Leu Val Thr Asp Pro Ala Arg

165 170 175

Leu Ala Ala Ser Leu Glu Glu Ala Arg Asn Glu Ala Glu Ala Ala Phe

180 185 190

Gly Asp Gly Thr Val Tyr Leu Glu Gln Ala Leu Val Arg Pro Arg His

195 200 205

Ile Glu Val Gln Leu Leu Ala Asp Ala Thr Gly Asp Val Val His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Leu Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Ile Thr Pro Ala Pro Asn Leu Glu Pro Glu Leu Arg Asp Arg Ile Cys

245 250 255

Ala Asp Ala Val Arg Phe Ala Arg His Val Gly Tyr Val Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Leu Asp Glu Ala Asn Gly Arg Tyr Ala Phe Ile

275 280 285

Glu Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Thr

290 295 300

Thr Asp Ile Asp Leu Val Arg Ala Gln Leu Gln Ile Ala Gly Gly Glu

305 310 315 320

Thr Leu Ala Gly Leu Gly Val Arg Gln Asp Asp Ile Arg Gln Arg Gly

325 330 335

Phe Ala Leu Gln Cys Arg Val Thr Thr Glu Asp Pro Ala Asn Gly Phe

340 345 350

Arg Pro Asp Ser Gly Arg Ile Thr Ala Tyr Arg Ser Pro Gly Gly Ala

355 360 365

Gly Val Arg Leu Asp Glu Gly Ser Ala Phe Val Gly Ala Glu Val Ser

370 375 380

Pro Phe Phe Asp Pro Leu Leu Val Lys Ile Ser Ala Arg Gly Arg Asp

385 390 395 400

Leu His Ser Ala Val Ser Arg Ala Arg Arg Ala Val Ala Glu Leu Arg

405 410 415

Val Arg Gly Val Lys Thr Asn Gln Gly Phe Leu Leu Ala Leu Leu Asn

420 425 430

Asp Pro Asp Val Leu Ala Gly Arg Thr His Thr Thr Phe Val Asp Glu

435 440 445

Arg Pro Asp Leu Ser Thr Ala Gly Pro Gly Gly Asp Arg Ala Ser Arg

450 455 460

Leu Leu Lys Arg Leu Ala Glu Val Thr Val Asn His Glu Pro Ala Ser

465 470 475 480

Ser Ala Leu Ala Gly Asp Pro Arg Ala Lys Leu Pro Ala Pro Pro Thr

485 490 495

Gly Ala Pro Pro Ala Gly Ser Arg Gln Lys Leu Leu Asp Leu Gly Pro

500 505 510

Ser Thr Phe Ala Ala Ala Leu Arg Gly Gln Gln Ala Ile Ala Leu Thr

515 520 525

Asp Thr Thr Leu Arg Asp Ala His Gln Ser Leu Phe Ala Thr Arg Met

530 535 540

Arg Thr Arg Asp Met Leu Pro Val Ala Pro His Leu Ala His Glu Leu

545 550 555 560

Pro Gln Leu Leu Ser Leu Glu Val Trp Gly Gly Ala Thr Phe Asp Val

565 570 575

Ala Leu Arg Phe Leu His Glu Asp Pro Trp Asp Arg Leu Val Gln Leu

580 585 590

Arg Glu Leu Val Pro Asn Val Cys Leu Gln Met Leu Leu Arg Gly Gln

595 600 605

Asn Leu Leu Ala Tyr Ser Arg Phe Pro Thr Arg Val Val Arg Ala Phe

610 615 620

Val Ala Glu Ala Val Glu Ala Gly Ile Asp Val Phe Arg Ile Phe Asp

625 630 635 640

Ala Leu Asn Asp Ile Glu Gly Met Arg Ser Ala Ile Glu Ala Thr Leu

645 650 655

Glu Thr Pro Ala Leu Ala Glu Gly Thr Leu Cys Tyr Thr Gly Asp Leu

660 665 670

Ser Asp Pro Arg Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg Leu

675 680 685

Ala Gln Gln Leu Val Asp Ala Gly Val His Met Leu Ala Ile Lys Asp

690 695 700

Met Ala Gly Leu Leu Arg Ala Pro Ala Ala His Thr Leu Val Thr Ala

705 710 715 720

Leu His Arg Glu Phe Glu Leu Pro Val His Leu His Thr His Asp Thr

725 730 735

Ala Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Ile Glu Ala Gly Val

740 745 750

Asp Ala Val Asp Gly Ala Ala Ala Pro Met Ala Gly Met Thr Ser Gln

755 760 765

Pro Ser Leu Ala Ala Ile Val Ala Ala Thr Ala Thr Thr Glu Arg Asp

770 775 780

Ser Gly Ile Ala Leu Asp Ala Leu Leu Asp Gln Glu Pro Tyr Trp Glu

785 790 795 800

Ser Val Arg Thr Leu Tyr Gly Ala Phe Glu Thr Gly Leu Lys Ala Pro

805 810 815

Thr Gly Arg Val Tyr Arg His Glu Ile Pro Gly Gly Gln Leu Ser Asn

820 825 830

Leu Arg Gln Gln Ala Asp Ala Val Gly Leu Thr Gly Arg Phe Asp Glu

835 840 845

Ile Glu Arg Ala Tyr Glu Arg Ala Asn Arg Leu Leu Gly Asn Val Val

850 855 860

Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu Phe Ala

865 870 875 880

Val Ser Ala Gly Ile Asp Phe Asp Glu Leu Glu Arg Arg Pro Gly Ser

885 890 895

Phe Asp Leu Pro Asp Ser Val Ile Asp Phe Leu Arg Gly Gly Ile Gly

900 905 910

Thr Pro Pro Gly Gly Phe Pro Gln Pro Phe Thr Asp Leu Ala Leu Ala

915 920 925

Gly Arg Pro Ala Pro Pro Ala Pro Thr Glu Leu Asp Pro Glu Leu Ala

930 935 940

Asp Arg Leu Gln Gln Pro Gly Ala Pro Arg Arg Gly Ala Leu Ala Glu

945 950 955 960

Ile Leu Phe Pro Gly Pro Ala Ser Asp Phe Ala Ala Ala Arg Ala Thr

965 970 975

Phe Gly Asp Val Ser Leu Ile Pro Thr Pro Ala Phe Phe Arg Gly Leu

980 985 990

His Glu Asp Glu Glu Leu Ala Ile Asp Leu Ala Pro Gly Val Arg Leu

995 1000 1005

Leu Phe Glu Leu Glu Ala Ile Gly Glu Pro Asp Lys Arg Gly Met

1010 1015 1020

Arg Thr Val Leu Val Arg Val Asn Gly Gln Leu Arg Pro Val Glu

1025 1030 1035

Val Arg Asp His Ser Val Lys Thr Thr Gly Val Gln Ile Glu Arg

1040 1045 1050

Ala Asp Pro Lys Arg Pro Gly His Val Pro Ala Pro Val Thr Gly

1055 1060 1065

Ile Val Ser Leu Leu Val Ala Ala Gly Asp Thr Val Ser Glu Gly

1070 1075 1080

Asp Pro Ile Ala Thr Leu Glu Ala Met Lys Met Glu Ser Thr Ile

1085 1090 1095

Ser Ala Pro Leu Ala Gly Arg Val Gln Arg Leu Ala Val Thr Thr

1100 1105 1110

Gly Ala Arg Leu Glu Gln Gly Asp Leu Leu Leu Val Ile Asp

1115 1120 1125

<210> 266

<211> 3429

<212> DNA

<213> Unknown

<220>

<223> pyc_13 sequence from an unknown bacterial species from an environmental sample

<400> 266

gtgctgaaga aggttctggt tgccaaccgg ggagagatag ccatacgcgc cttccgggcc 60

gcttacgagc taggtatccg caccgtcgcg gtctacacgc cagaggacag ggactcgttg 120

caccggcaga aggccgacga ggcgtacggg atcggcgagg cggggcaccc ggtgagagct 180

tacctggacg tcgagacgct ggtcgataag gccctggagg tcggggccga ctcgatctac 240

ccgggctacg gctttctctc cgagagcgca aagctcgcgt cggcgtgcga ggaggccggc 300

cttaccttcg tgggaccccc gagcgaggtc ctctccctta ccggagacaa gatcgaggcg 360

cgagaggcgg cggagtccgc cggaatctcg atcacccaag cgtcggggct gatctctgac 420

cccaacgagg cttcggaggc ggccgaggag gttgggtatc cgctcttcgt gaaggcagcc 480

ggtgggggag ggggcagagg tatgcgcttg gtcagggatg ccggcgacct gcaagaggcg 540

gtggatgcgg cgacgagcga ggcggagtcc gcgttcggtg acccgtcggt ctttctggag 600

caggccctcg tgagaccccg gcacatagag atccaggtgc tcgccgacgc ggaaggggag 660

gttatacacc tctacgagcg cgattgctcg gtgcagcggc gtcaccagaa ggtcctggag 720

atggcgccgg cgccgaacct ggatccggac ctcagggacc gattgtgcga ggatgcggtc 780

cgcttcgccc gcgaggtcgg ctacctcaac gccggtacgg tcgagttttt ggtcggggaa 840

gacggtgagt acgctttcat cgagatgaac ccccgcatcc aggtcgaaca caccgtcacc 900

gaggagacca ccgacgtaga cttggtcagc gcccagctga ggatcgccgg gggggagacg 960

ctggaggatc tgggtctctc ccaggaaggt atagagcagc gcggtgttgc tttgcagtgc 1020

cgggtgacca ccgaagaccc ggcgcagaac tttcagccgg atacgggccg gatctcggcc 1080

taccgctcgc cgggcgggtt gggtatccgc acggacggcg gcaccgtcta ctccggcgcc 1140

gaggtcagcc cgtacttcga cccgctcctc gtcaaggtca ccgctcgtgg tcccgacctg 1200

ctcaccgcgg cgaggagggc gagtagggca ctcgccgaga tacgggtgcg cggcctggca 1260

acgaacgtgg ccttcctccg ggccgtcctc aacgacgacg actttctggc cgggcggacg 1320

aacacctcct ttatagacga gaggccgcac cttacccagg cctacgcggg tagggaccgg 1380

gcgacccgcc tgctctcgct gctcgccgac gtcaccgtca atcggccaaa cggcccgcca 1440

cccgaagcgc ccgacccgcg caccaagctg ccgtcgctac cggagggtga cgcgccggcc 1500

ggcactaagc agaagctaga cgagctcggc ccggagggct tcgcccgctg gatgcgcgag 1560

tccgaggccc tgctcgtcac tgataccacc atgcgcgacg cccaccagtc cctcttcgcg 1620

acccggatgc gaaccttcga catgctcgcc gtcgcccctc acctggcacg gatgcttccg 1680

cagatcttct ccgccgaggt gtggggtggg gcgaccttcg acgtggctct gcgctttctg 1740

cgcgaggatc cctgggaacg gctgggccgt ctgcgggagg cgctcccgaa cacgtgcctg 1800

cagatgctcc tccggggcca gaatgccgtc ggctacacga cctacccgga cgacgtgcta 1860

aaggccttcg tcgccgaaac cgccgagacc ggcctcgaca tcttccgcgt cttcgacgcc 1920

aacaacgaca tccgcaggat gcgaccggcc atagaggcgg tgctcgagac cgacgccgtc 1980

gcggagggtg cgatctccta caccggcgac ctctcgaacc cggacgagga gctctacacc 2040

ctcgactact acctgcggct cgccgaggag ttggtcgagg ccggctcgca cgtcctgtgc 2100

ataaaggaca tggccggcct cctgcgcgcc cccgccgccg agaagctcat aagttctctg 2160

cggagcgagt tcgacctacc ggtccacctg cacacccacg acaccgccgg cggccaactc 2220

gccacctatc tcgcggcctt acgagcgggg gtagacgccg tagacggcgc cgccgccccg 2280

atgtccggga tgacgagcca gccgtccctg gcggccatag tggcgacgac cgagcacacg 2340

gagcgcgaaa ccggcctctc gctcgacgcc ctgggcgatc tggagcccta ctgggaggcg 2400

gtgagggacc tctacgcgcc cttcgagtct gggctgcgct caccgacggg taccgtttat 2460

cagcacgaga taccgggcgg ccagctctcg aacctgcgcg tgcaggcgac ggccctgggg 2520

ctcggggaac gcttcgagga gatagagtac gcctacgccc gctgcgacga gctattgggg 2580

cacctggtca aggttacgcc cacgagcaag gtcgtcggcg acctagctct gtatctcgtc 2640

tcctccaaca tagatcccgg tgagttcgag gaggaccccg ccgactacga cctgccggag 2700

agcgtgatcg gctttctgcg cggggagatc ggcgagccgc cgggtggctg gcccgaaccc 2760

ttgcgctccg aggtgctctc gcgacaggat gagaacggct cgtcctcggc cgggccctcc 2820

gaggacgggt cctccgggga tgagcaactc cccgaggagg accgggaggc gctcgcgcag 2880

gccgagcgtg gctcggagcg acgggccgcg ctgaaccgat tgcttctgcc cgacccggcc 2940

gccggtaagg aggaggccga ggagagatac ggcgacgtct ccgtgatacc cacgaagccc 3000

tttttctacg ggctggagac cgggcaggag ctcgacctgg acctcgagcc cggcgtcagg 3060

ttgcacgtgg ggctggaggc gatctcggag gccgaccagc gcggcatccg ggccctcatc 3120

gtcacggtca acggccaatc gaggagcgta gacgctcagg accgctcgct ggagccggag 3180

acgccgagca cggagaaggc cgacccgaac gaggaaggac acgtcgcagc cccgatgacg 3240

ggcgcggtga cgctcgccgt cgaggagggc gaggaggtcg aggagggcca gcagctcggc 3300

acgatggagg cgatgaagat ggagtccgcc ataagcgccc ccgtctcggg aaccatcgag 3360

cgcatcgccg tcccctccgg caccaacgtc gagtccggcg acctcctgct cgtactggag 3420

acctcttga 3429

<210> 267

<211> 1142

<212> PRT

<213> Unknown

<220>

<223> pyc_13 sequence from an unknown bacterial species from an environmental sample

<400> 267

Val Leu Lys Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Ile Arg Thr Val Ala Val Tyr

20 25 30

Thr Pro Glu Asp Arg Asp Ser Leu His Arg Gln Lys Ala Asp Glu Ala

35 40 45

Tyr Gly Ile Gly Glu Ala Gly His Pro Val Arg Ala Tyr Leu Asp Val

50 55 60

Glu Thr Leu Val Asp Lys Ala Leu Glu Val Gly Ala Asp Ser Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Ser Ala Lys Leu Ala Ser Ala Cys

85 90 95

Glu Glu Ala Gly Leu Thr Phe Val Gly Pro Pro Ser Glu Val Leu Ser

100 105 110

Leu Thr Gly Asp Lys Ile Glu Ala Arg Glu Ala Ala Glu Ser Ala Gly

115 120 125

Ile Ser Ile Thr Gln Ala Ser Gly Leu Ile Ser Asp Pro Asn Glu Ala

130 135 140

Ser Glu Ala Ala Glu Glu Val Gly Tyr Pro Leu Phe Val Lys Ala Ala

145 150 155 160

Gly Gly Gly Gly Gly Arg Gly Met Arg Leu Val Arg Asp Ala Gly Asp

165 170 175

Leu Gln Glu Ala Val Asp Ala Ala Thr Ser Glu Ala Glu Ser Ala Phe

180 185 190

Gly Asp Pro Ser Val Phe Leu Glu Gln Ala Leu Val Arg Pro Arg His

195 200 205

Ile Glu Ile Gln Val Leu Ala Asp Ala Glu Gly Glu Val Ile His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Leu Glu

225 230 235 240

Met Ala Pro Ala Pro Asn Leu Asp Pro Asp Leu Arg Asp Arg Leu Cys

245 250 255

Glu Asp Ala Val Arg Phe Ala Arg Glu Val Gly Tyr Leu Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Gly Glu Asp Gly Glu Tyr Ala Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Thr Thr

290 295 300

Asp Val Asp Leu Val Ser Ala Gln Leu Arg Ile Ala Gly Gly Glu Thr

305 310 315 320

Leu Glu Asp Leu Gly Leu Ser Gln Glu Gly Ile Glu Gln Arg Gly Val

325 330 335

Ala Leu Gln Cys Arg Val Thr Thr Glu Asp Pro Ala Gln Asn Phe Gln

340 345 350

Pro Asp Thr Gly Arg Ile Ser Ala Tyr Arg Ser Pro Gly Gly Leu Gly

355 360 365

Ile Arg Thr Asp Gly Gly Thr Val Tyr Ser Gly Ala Glu Val Ser Pro

370 375 380

Tyr Phe Asp Pro Leu Leu Val Lys Val Thr Ala Arg Gly Pro Asp Leu

385 390 395 400

Leu Thr Ala Ala Arg Arg Ala Ser Arg Ala Leu Ala Glu Ile Arg Val

405 410 415

Arg Gly Leu Ala Thr Asn Val Ala Phe Leu Arg Ala Val Leu Asn Asp

420 425 430

Asp Asp Phe Leu Ala Gly Arg Thr Asn Thr Ser Phe Ile Asp Glu Arg

435 440 445

Pro His Leu Thr Gln Ala Tyr Ala Gly Arg Asp Arg Ala Thr Arg Leu

450 455 460

Leu Ser Leu Leu Ala Asp Val Thr Val Asn Arg Pro Asn Gly Pro Pro

465 470 475 480

Pro Glu Ala Pro Asp Pro Arg Thr Lys Leu Pro Ser Leu Pro Glu Gly

485 490 495

Asp Ala Pro Ala Gly Thr Lys Gln Lys Leu Asp Glu Leu Gly Pro Glu

500 505 510

Gly Phe Ala Arg Trp Met Arg Glu Ser Glu Ala Leu Leu Val Thr Asp

515 520 525

Thr Thr Met Arg Asp Ala His Gln Ser Leu Phe Ala Thr Arg Met Arg

530 535 540

Thr Phe Asp Met Leu Ala Val Ala Pro His Leu Ala Arg Met Leu Pro

545 550 555 560

Gln Ile Phe Ser Ala Glu Val Trp Gly Gly Ala Thr Phe Asp Val Ala

565 570 575

Leu Arg Phe Leu Arg Glu Asp Pro Trp Glu Arg Leu Gly Arg Leu Arg

580 585 590

Glu Ala Leu Pro Asn Thr Cys Leu Gln Met Leu Leu Arg Gly Gln Asn

595 600 605

Ala Val Gly Tyr Thr Thr Tyr Pro Asp Asp Val Leu Lys Ala Phe Val

610 615 620

Ala Glu Thr Ala Glu Thr Gly Leu Asp Ile Phe Arg Val Phe Asp Ala

625 630 635 640

Asn Asn Asp Ile Arg Arg Met Arg Pro Ala Ile Glu Ala Val Leu Glu

645 650 655

Thr Asp Ala Val Ala Glu Gly Ala Ile Ser Tyr Thr Gly Asp Leu Ser

660 665 670

Asn Pro Asp Glu Glu Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg Leu Ala

675 680 685

Glu Glu Leu Val Glu Ala Gly Ser His Val Leu Cys Ile Lys Asp Met

690 695 700

Ala Gly Leu Leu Arg Ala Pro Ala Ala Glu Lys Leu Ile Ser Ser Leu

705 710 715 720

Arg Ser Glu Phe Asp Leu Pro Val His Leu His Thr His Asp Thr Ala

725 730 735

Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Leu Arg Ala Gly Val Asp

740 745 750

Ala Val Asp Gly Ala Ala Ala Pro Met Ser Gly Met Thr Ser Gln Pro

755 760 765

Ser Leu Ala Ala Ile Val Ala Thr Thr Glu His Thr Glu Arg Glu Thr

770 775 780

Gly Leu Ser Leu Asp Ala Leu Gly Asp Leu Glu Pro Tyr Trp Glu Ala

785 790 795 800

Val Arg Asp Leu Tyr Ala Pro Phe Glu Ser Gly Leu Arg Ser Pro Thr

805 810 815

Gly Thr Val Tyr Gln His Glu Ile Pro Gly Gly Gln Leu Ser Asn Leu

820 825 830

Arg Val Gln Ala Thr Ala Leu Gly Leu Gly Glu Arg Phe Glu Glu Ile

835 840 845

Glu Tyr Ala Tyr Ala Arg Cys Asp Glu Leu Leu Gly His Leu Val Lys

850 855 860

Val Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Leu Tyr Leu Val

865 870 875 880

Ser Ser Asn Ile Asp Pro Gly Glu Phe Glu Glu Asp Pro Ala Asp Tyr

885 890 895

Asp Leu Pro Glu Ser Val Ile Gly Phe Leu Arg Gly Glu Ile Gly Glu

900 905 910

Pro Pro Gly Gly Trp Pro Glu Pro Leu Arg Ser Glu Val Leu Ser Arg

915 920 925

Gln Asp Glu Asn Gly Ser Ser Ser Ala Gly Pro Ser Glu Asp Gly Ser

930 935 940

Ser Gly Asp Glu Gln Leu Pro Glu Glu Asp Arg Glu Ala Leu Ala Gln

945 950 955 960

Ala Glu Arg Gly Ser Glu Arg Arg Ala Ala Leu Asn Arg Leu Leu Leu

965 970 975

Pro Asp Pro Ala Ala Gly Lys Glu Glu Ala Glu Glu Arg Tyr Gly Asp

980 985 990

Val Ser Val Ile Pro Thr Lys Pro Phe Phe Tyr Gly Leu Glu Thr Gly

995 1000 1005

Gln Glu Leu Asp Leu Asp Leu Glu Pro Gly Val Arg Leu His Val

1010 1015 1020

Gly Leu Glu Ala Ile Ser Glu Ala Asp Gln Arg Gly Ile Arg Ala

1025 1030 1035

Leu Ile Val Thr Val Asn Gly Gln Ser Arg Ser Val Asp Ala Gln

1040 1045 1050

Asp Arg Ser Leu Glu Pro Glu Thr Pro Ser Thr Glu Lys Ala Asp

1055 1060 1065

Pro Asn Glu Glu Gly His Val Ala Ala Pro Met Thr Gly Ala Val

1070 1075 1080

Thr Leu Ala Val Glu Glu Gly Glu Glu Val Glu Glu Gly Gln Gln

1085 1090 1095

Leu Gly Thr Met Glu Ala Met Lys Met Glu Ser Ala Ile Ser Ala

1100 1105 1110

Pro Val Ser Gly Thr Ile Glu Arg Ile Ala Val Pro Ser Gly Thr

1115 1120 1125

Asn Val Glu Ser Gly Asp Leu Leu Leu Val Leu Glu Thr Ser

1130 1135 1140

<210> 268

<211> 3387

<212> DNA

<213> Unknown

<220>

<223> pyc_14 sequence from an unknown bacterial species from an environmental sample

<400> 268

atgctgcgca agctcttggt cgccaaccgc ggcgagatcg ccatccgggc gttccgcgcg 60

gcctacgagc tcggcatcgc caccgtcgcc gtctacaccc accccgaccg cgtgtcgctc 120

caccgatcca aggcggacga ggcgtacgag atcggcgacc ccgcctcggc gctgcgcggc 180

tacctcgacc ccgcgctcat cgtcgacacg gcggtgcggg tgggcgcgga cgcgatctat 240

cccggctacg ggttcctctc cgagtcggag ctgttcgcgc aggcctgcgt cgatgccggg 300

gtgatcttcg tcgggccgcc tcccgaggtc ctgcgcctga ccggcgacaa gctgcgggcg 360

cgcgacgccg ctcgcagggc ggggttgccc gtgctcgagg ccagcccggc ggtcgccgac 420

gccgacgcgg cccgcgaggc cgcctcctcg ctcgggtatc cggtgttcgt caaggccgcg 480

ggcggcgggg gcggtcgcgg cctgcgccgg gtcgagcggc ccgaggacct tccgggcgcc 540

gtcgagaccg cgatgcgcga ggcccagggg gcattcgggg atcggaccat cttcctcgag 600

caggcggtca tccggccccg ccacatcgag gtgcagctgc tcgccgatgc cgacggcgag 660

gtcgtccatc tctacgagcg cgactgctcg atccagcggc ggcaccagaa ggcgctggag 720

ctggcgccgg ccccgggcat cacccccgag ctgcgccagc gcctctgcgc cgacgccgtg 780

tccttcgccc gcgcggtggg ctaccgcaac gcgggtaccg ccgagttcct ggtcggccag 840

gacgggcgcc acgtcttcat cgagatgaac ccccgcatcc aggtcgagca caccgtgacc 900

gaggagacca ccgacgtcga catcgtcgga tcacagctgc gcatcgcagg gggcgcgacg 960

ctggctgacc tcgggctgtc ccaggaccgg atcgtgcagc ggggcgcggc ggtgcagtgc 1020

cgcatcacga cggaggaccc cagcaacggc ttccgccccg acaccggccg catcgtggcc 1080

taccgctcgc cgggcggagc cggcatccgc ctcgacgccg gcagcgccta cgtgggcgcg 1140

gaggtctccc cctacttcga ctcgatgctg gtcaagctga ccgcccgcgg ctccgacctg 1200

caggtcgccg ccacccgcgc ccgccgcgcc ctcgcggagt tccgcatccg cggcgtcagc 1260

acgaacaccc gcttcctgtc cgcggtgctg gccgatcccg acttcctggc gggcaagctg 1320

tccacctcgt tcctggacga gcgcccctgg ctggtctcga cgaccaccgg cgaggaccgc 1380

gccacccggc tgctgcgccg cctggccgac gtcaccgtga accggccaca cggtgcggcg 1440

cccacgacgg tcgatccggt ctccaagctg ccgccgctgc ccggcggcga gcccccgccc 1500

ggctcccgcc agcggctggc cgagttcggt ccgcgggcct tcgcgcgggc gctgcgcgag 1560

caggcggccc tcgcggtcac cgacacgacc ctgcgtgacg cgcaccagtc gctgctggcc 1620

acgcgcatgc ggacacgcga catgctggcc gcggcgccgc atgtcgccca cggtatgggc 1680

ggcctgctga gcttcgaggt ctggggcggc gcgaccttcg acgcggcgct gctgttcctc 1740

ggcgaggacc cgtgggagcg gctggcccac ctgcgcaccg tgctgcccaa cgtgtgcctg 1800

cagatgctgc tgcgcggcga gaacctggtc ggctacacga cctacccggc gccggtggtc 1860

cgctccttcg tcgccgaggc gcgcgcgtgc gggatcgaca tcttccgcgt cttcgacgcc 1920

aacaacgacg tcgagcggat gcgcccggcc atcgaggccg tcgccgagga gggcgggctg 1980

gccgagggca ctctgtgcta caccggcgac ctgtctaccc cgggcgagcg gtacgacctc 2040

gaccactacc tgaccgtcgc caagggctgc ctcgaggccg gcgcgcacat cctgtgcatc 2100

aaggacatgg cggggctgct gcgggcaccg gccgcgcgca cgctggtcac cgccctgcgc 2160

gacagcttcg acgcgccggt gcacatgcat actcacgaca gcgccggcgg gcagctcgcc 2220

acctacctcg ccgccatcgc ggccggcgtc gacgcggtcg acggcgccgc ggcgccgctg 2280

tcgggcggca ccagccagcc ctcgctcgcg gcgatcgtgg ccgcgaccga tcacacggag 2340

cgtgcgaccg gcctgtcgct tgacgcgctg gccgacctcg agccctactg ggaggccgtg 2400

cgcaccctct acgccccctt cgaatccggg ctgcgcgccc cgaccggcgc cgtgtaccgc 2460

cacgagatcc ccggcggcca gctgtccaac ctgcgccagc aggcggtggc gctggggctc 2520

ggcacccggt tcgaggaggt cgagcgcgtc tacgcccgct gtgacgacct gctcggcggg 2580

ctcatcaagg tgacccccac cagcaaggtc gtcggcgacc tcgcgctcta cctcgtgtcg 2640

gctggcatcg atcccgggga gctggaggcc gatcccgcca ggtacgacct gcccggctcg 2700

gtgatcggct tcctgcaggg cgagctgggc gagccaccct tcgggtggcc cgagccgttt 2760

cgctccaagg cgctggcggg caagcccgac cacgtcgacc cgccggccct cgacgccgac 2820

cagcagcgcg acctcgacgc cgccgatcca gagcagcgcc gccgcgcgct gagcaccctg 2880

ctgctgccgg ctgcggcccg cgagcacctc aagtcgatcg agctgtacgg cgacgtctcc 2940

gtgctgccca cccgcgccta cctgtacggc ctggagcccg gcgaggaggt ggccgtagac 3000

ctcgaaccgg gcgtgcggct gttcctgcag ctcgaagcgg tcggcgaggt cgacaagcgc 3060

ggcgtgcgca ccgtgctggt caacgtcaac ggccaggccc gccccatcga ggtccaggac 3120

cgctccgccg aggtcaccgc caaggtcacc gagaaggccg atcccgccag accgggtcac 3180

atccccgcgc ccttgaccgg cgtcgtcgcc atgcgcgtcg ccgaaggcga ccaggtcgag 3240

gccggcgccc agctcgccac gatcgaggcc atgaagatgg aaagctctat cagcgccccc 3300

ttcgccgccc gcatcgaccg cctcgtcgtc accgacggca cggccgtcga acccggcgat 3360

ctcctcctcg tcctctccca ggcgtga 3387

<210> 269

<211> 1128

<212> PRT

<213> Unknown

<220>

<223> pyc_14 sequence from an unknown bacterial species from an environmental sample

<400> 269

Met Leu Arg Lys Leu Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Ile Ala Thr Val Ala Val Tyr

20 25 30

Thr His Pro Asp Arg Val Ser Leu His Arg Ser Lys Ala Asp Glu Ala

35 40 45

Tyr Glu Ile Gly Asp Pro Ala Ser Ala Leu Arg Gly Tyr Leu Asp Pro

50 55 60

Ala Leu Ile Val Asp Thr Ala Val Arg Val Gly Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Ser Glu Leu Phe Ala Gln Ala Cys

85 90 95

Val Asp Ala Gly Val Ile Phe Val Gly Pro Pro Pro Glu Val Leu Arg

100 105 110

Leu Thr Gly Asp Lys Leu Arg Ala Arg Asp Ala Ala Arg Arg Ala Gly

115 120 125

Leu Pro Val Leu Glu Ala Ser Pro Ala Val Ala Asp Ala Asp Ala Ala

130 135 140

Arg Glu Ala Ala Ser Ser Leu Gly Tyr Pro Val Phe Val Lys Ala Ala

145 150 155 160

Gly Gly Gly Gly Gly Arg Gly Leu Arg Arg Val Glu Arg Pro Glu Asp

165 170 175

Leu Pro Gly Ala Val Glu Thr Ala Met Arg Glu Ala Gln Gly Ala Phe

180 185 190

Gly Asp Arg Thr Ile Phe Leu Glu Gln Ala Val Ile Arg Pro Arg His

195 200 205

Ile Glu Val Gln Leu Leu Ala Asp Ala Asp Gly Glu Val Val His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Ile Gln Arg Arg His Gln Lys Ala Leu Glu

225 230 235 240

Leu Ala Pro Ala Pro Gly Ile Thr Pro Glu Leu Arg Gln Arg Leu Cys

245 250 255

Ala Asp Ala Val Ser Phe Ala Arg Ala Val Gly Tyr Arg Asn Ala Gly

260 265 270

Thr Ala Glu Phe Leu Val Gly Gln Asp Gly Arg His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Thr Thr

290 295 300

Asp Val Asp Ile Val Gly Ser Gln Leu Arg Ile Ala Gly Gly Ala Thr

305 310 315 320

Leu Ala Asp Leu Gly Leu Ser Gln Asp Arg Ile Val Gln Arg Gly Ala

325 330 335

Ala Val Gln Cys Arg Ile Thr Thr Glu Asp Pro Ser Asn Gly Phe Arg

340 345 350

Pro Asp Thr Gly Arg Ile Val Ala Tyr Arg Ser Pro Gly Gly Ala Gly

355 360 365

Ile Arg Leu Asp Ala Gly Ser Ala Tyr Val Gly Ala Glu Val Ser Pro

370 375 380

Tyr Phe Asp Ser Met Leu Val Lys Leu Thr Ala Arg Gly Ser Asp Leu

385 390 395 400

Gln Val Ala Ala Thr Arg Ala Arg Arg Ala Leu Ala Glu Phe Arg Ile

405 410 415

Arg Gly Val Ser Thr Asn Thr Arg Phe Leu Ser Ala Val Leu Ala Asp

420 425 430

Pro Asp Phe Leu Ala Gly Lys Leu Ser Thr Ser Phe Leu Asp Glu Arg

435 440 445

Pro Trp Leu Val Ser Thr Thr Thr Gly Glu Asp Arg Ala Thr Arg Leu

450 455 460

Leu Arg Arg Leu Ala Asp Val Thr Val Asn Arg Pro His Gly Ala Ala

465 470 475 480

Pro Thr Thr Val Asp Pro Val Ser Lys Leu Pro Pro Leu Pro Gly Gly

485 490 495

Glu Pro Pro Pro Gly Ser Arg Gln Arg Leu Ala Glu Phe Gly Pro Arg

500 505 510

Ala Phe Ala Arg Ala Leu Arg Glu Gln Ala Ala Leu Ala Val Thr Asp

515 520 525

Thr Thr Leu Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Met Arg

530 535 540

Thr Arg Asp Met Leu Ala Ala Ala Pro His Val Ala His Gly Met Gly

545 550 555 560

Gly Leu Leu Ser Phe Glu Val Trp Gly Gly Ala Thr Phe Asp Ala Ala

565 570 575

Leu Leu Phe Leu Gly Glu Asp Pro Trp Glu Arg Leu Ala His Leu Arg

580 585 590

Thr Val Leu Pro Asn Val Cys Leu Gln Met Leu Leu Arg Gly Glu Asn

595 600 605

Leu Val Gly Tyr Thr Thr Tyr Pro Ala Pro Val Val Arg Ser Phe Val

610 615 620

Ala Glu Ala Arg Ala Cys Gly Ile Asp Ile Phe Arg Val Phe Asp Ala

625 630 635 640

Asn Asn Asp Val Glu Arg Met Arg Pro Ala Ile Glu Ala Val Ala Glu

645 650 655

Glu Gly Gly Leu Ala Glu Gly Thr Leu Cys Tyr Thr Gly Asp Leu Ser

660 665 670

Thr Pro Gly Glu Arg Tyr Asp Leu Asp His Tyr Leu Thr Val Ala Lys

675 680 685

Gly Cys Leu Glu Ala Gly Ala His Ile Leu Cys Ile Lys Asp Met Ala

690 695 700

Gly Leu Leu Arg Ala Pro Ala Ala Arg Thr Leu Val Thr Ala Leu Arg

705 710 715 720

Asp Ser Phe Asp Ala Pro Val His Met His Thr His Asp Ser Ala Gly

725 730 735

Gly Gln Leu Ala Thr Tyr Leu Ala Ala Ile Ala Ala Gly Val Asp Ala

740 745 750

Val Asp Gly Ala Ala Ala Pro Leu Ser Gly Gly Thr Ser Gln Pro Ser

755 760 765

Leu Ala Ala Ile Val Ala Ala Thr Asp His Thr Glu Arg Ala Thr Gly

770 775 780

Leu Ser Leu Asp Ala Leu Ala Asp Leu Glu Pro Tyr Trp Glu Ala Val

785 790 795 800

Arg Thr Leu Tyr Ala Pro Phe Glu Ser Gly Leu Arg Ala Pro Thr Gly

805 810 815

Ala Val Tyr Arg His Glu Ile Pro Gly Gly Gln Leu Ser Asn Leu Arg

820 825 830

Gln Gln Ala Val Ala Leu Gly Leu Gly Thr Arg Phe Glu Glu Val Glu

835 840 845

Arg Val Tyr Ala Arg Cys Asp Asp Leu Leu Gly Gly Leu Ile Lys Val

850 855 860

Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Leu Tyr Leu Val Ser

865 870 875 880

Ala Gly Ile Asp Pro Gly Glu Leu Glu Ala Asp Pro Ala Arg Tyr Asp

885 890 895

Leu Pro Gly Ser Val Ile Gly Phe Leu Gln Gly Glu Leu Gly Glu Pro

900 905 910

Pro Phe Gly Trp Pro Glu Pro Phe Arg Ser Lys Ala Leu Ala Gly Lys

915 920 925

Pro Asp His Val Asp Pro Pro Ala Leu Asp Ala Asp Gln Gln Arg Asp

930 935 940

Leu Asp Ala Ala Asp Pro Glu Gln Arg Arg Arg Ala Leu Ser Thr Leu

945 950 955 960

Leu Leu Pro Ala Ala Ala Arg Glu His Leu Lys Ser Ile Glu Leu Tyr

965 970 975

Gly Asp Val Ser Val Leu Pro Thr Arg Ala Tyr Leu Tyr Gly Leu Glu

980 985 990

Pro Gly Glu Glu Val Ala Val Asp Leu Glu Pro Gly Val Arg Leu Phe

995 1000 1005

Leu Gln Leu Glu Ala Val Gly Glu Val Asp Lys Arg Gly Val Arg

1010 1015 1020

Thr Val Leu Val Asn Val Asn Gly Gln Ala Arg Pro Ile Glu Val

1025 1030 1035

Gln Asp Arg Ser Ala Glu Val Thr Ala Lys Val Thr Glu Lys Ala

1040 1045 1050

Asp Pro Ala Arg Pro Gly His Ile Pro Ala Pro Leu Thr Gly Val

1055 1060 1065

Val Ala Met Arg Val Ala Glu Gly Asp Gln Val Glu Ala Gly Ala

1070 1075 1080

Gln Leu Ala Thr Ile Glu Ala Met Lys Met Glu Ser Ser Ile Ser

1085 1090 1095

Ala Pro Phe Ala Ala Arg Ile Asp Arg Leu Val Val Thr Asp Gly

1100 1105 1110

Thr Ala Val Glu Pro Gly Asp Leu Leu Leu Val Leu Ser Gln Ala

1115 1120 1125

<210> 270

<211> 3438

<212> DNA

<213> Unknown

<220>

<223> pyc_15 sequence from an unknown bacterial species from an environmental sample

<400> 270

atggtccgca agctgctcgt tgccaaccgc ggcgagatcg ccatccgcgc gtttcgtgct 60

gcgaccgagc tcggcatcgc cacggtggcg gtgtacaccc aggaagaccg tgactccctc 120

caccgcctca aggccgacga ggcatatcag atcggcgagc ccggccatcc cgtccgcgcg 180

tacctcgacg gggctgcgct catcgcgctt gccgggcgca tcggcgcgga tgcggtttac 240

ccgggctatg ggttcctgtc cgagaacgcg gagttcgccg ccgactgtgc ggctgccggc 300

ctgacgttcg tgggtccgcc gcccagcgtg ttgcggttga ccggcgacaa gaccgaggcg 360

cgacggctgg cgcgcgacgc tgggctgcgc gtgctcgagg cgagcgacat ccttgcggat 420

cccgcggacg ctgcggcggc cgccgagcag ctgggatatc ccgtgttcgt gaaggccgct 480

gccggtggtg gcggtcgcgg tctgcgcagg gttgagtcgc ccgccgacct ggcgggcgcg 540

gttgagacgg ccattcgcga ggcgcatggc gccttcggcg atgagcgggt gttcctcgag 600

cacgccgtaa cacggccgcg gcacatcgaa gtgcaggtgc ttgccgatga ccgcggcgag 660

gtcatccacc tgtgggagcg tgattgctcg gtccagcgac gccatcagaa ggtgatggag 720

atcgcgccgg cgccaaagct ggatccggcg ctgcgtgagg cgatctgtgc tgacggggtc 780

cggttcgctc gcgcggctgg ctaccgcaac gccggcacgg tggagttcct gctcgatcgc 840

aacggcacgc acgtgttcat cgagatgaac ccgcgcatcc aggtcgagca caccgtgacc 900

gaggaggtca ccgacgtcga tctggtgcaa gcgcagctgc ggatcgcctc cggcgagacg 960

ctggccgacc tcggactcac ccaggacacg gtggccacgc gtggatccgc gatccagtgc 1020

cggatcacca cggaggactc agccaacggg tttcgccccg ataccggtcg catcacggcc 1080

taccggtcgg cggccggggc aggcatccgc ctcgacgccg gctcggcctt cgttggctca 1140

caggtcagtc cgttcttcga ctcgttgctg gtcaagctga cggccagagg tcgcgacttc 1200

cccgccgccg cgcaacgcgc gcgccgggcg ctggcggagt tccgcgtgcg cggcgtggcg 1260

accaacatgg cgttcctggg cgccgtgctc gccgatcccg acttcctggc cggcaacctg 1320

tcgaccacgt tcatcgacga gcgaccgcac ctggtcagca ccagcgcggg tcgcgaccgc 1380

gggacgcgcc tgctggagta cctggccaac gtcacggtca accggccgca tggcacggtg 1440

cgcgatgtgg ttgacgcgcg cgacaagctg ccgccgctcc ccgccgagcc gacggtgagg 1500

gcgaccgcca gcgagccggc gggcggggac ctcatggtcc cgccgggcgc tcgccagcaa 1560

ctgcaagcgc tgggcccgga gcgcttcgcg cgctggctgc gggagcgcga cgccgtgggg 1620

ctgaccgaca ccacgttccg cgacgcgcac cagtcgctgc tcgccacgcg gatgcggaca 1680

ttcgacatgc tcgcggtggc gccgcacatc gccgccggcc tgccggaact gttcagcctg 1740

gagatgtggg gcggcgccac ctatgacgtg gcgctgcggt tcctgcacga ggatccctgg 1800

cagcggctgg cggccatgcg ggaggccgtg cccaacatct gcctgcagat gctgctgcgc 1860

ggccagaacg cggtcggcta ctcggcctat ccgagcgatg tggtgcgcgc cttcgtggcc 1920

gaagccagcc tcaccggcat cgatatcttc cggatcttcg atgcccttaa caacgtgacg 1980

gcaatgcgcg ccgcgatcag cgcgacgctc gaggcgggtg ccgtggccga aggcgccatc 2040

tgctacaccg gcgacctcca cgatcccgcc gaacgcgtgt acacgctcga ctactacctg 2100

ggtgtcgccg aggagctcgt cgaagcgggc gtgcacatcc tgtgcatcaa agacatggcg 2160

ggcctgctgc gcccacctgc tgcgcgcacg ctgatagccg cgctgcgcga gcgcttcgac 2220

cagccggtgc acctgcatac gcacgacacc gccggcggcc agctcgggac cgtggtggcc 2280

gccatcgacg cgggcgtgga tgccgtcgac ggtgcggccg cgccgctgtc gggcatgacc 2340

agccagccca acctcgccgc gatcgtggca gccaccgacc acaccccacg cgcaacggga 2400

gtgtcgctgg acacgctgac cgcgctggag ccctactggg aggcggtgcg caaccagtac 2460

gcgccgctgg aggcgggtct gcgctcgccg accggagccg tctacgacca cgagatcccc 2520

ggtggccagc tgtcgaacct gcgtcagcag gcgatcgcgt tggggctcgg tgaccggttc 2580

gagcacctga cccggctcta ccggcagtgc aacgatctgc tcgggaacat gatcaaggtc 2640

acgccgacca gcaaggtcgt gggcgacctg gcgctgtacc tcgacagcgc cggcatcacg 2700

cccgagcagc tggtggccga cccagcgcgc tacgatctgc ccgacagcgt catcggctac 2760

ctgcacggtg agcttggaac cccgcccggc ggctggcccg agccgttgcg cggcaaggtg 2820

gtggccgccc gccccaaccg tccgccgccc ccgacgttga ccgaccacca gcgcaccgcc 2880

ctgcgcgagc gctcggggcg gtaccgtcag gagctgttga acgagctgct gttccccggt 2940

ccggcggcgg cgcgagccga gatgcggcag cggttcggcg acgtctccat cctgccgacg 3000

tgggcgttcc tctacggcct cacgccacgc cgggagctgc atgttgatct ggcccccggc 3060

gtgcggttgt tcatccacct ggaggccgtc agcgaacccg acgaacaggg catccgtacc 3120

gtgctgtgca cgctgaacgg gcaggtgcga ccggtcgata cgcgcgaccg ttctatcgaa 3180

gccgccacgg agcctgccga gcgcgctgat ctcaacgatc ccagccacgt ggcagcgccg 3240

ctgaccgggg tggtgaccat cgtcgtgtcg cccggcgagc acgtggccgc cggcgccaag 3300

ctcggttcca tcgaagccat gaagatggag tccaacatca gcgcacccca cgccggcacg 3360

gtctcccgcg tgctcaccgg cagcggcacg gcagtcgagc cgggtgacct cctcctggtc 3420

ctcgaccccg acgactga 3438

<210> 271

<211> 1145

<212> PRT

<213> Unknown

<220>

<223> pyc_15 sequence from an unknown bacterial species from an environmental sample

<400> 271

Met Val Arg Lys Leu Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Thr Glu Leu Gly Ile Ala Thr Val Ala Val Tyr

20 25 30

Thr Gln Glu Asp Arg Asp Ser Leu His Arg Leu Lys Ala Asp Glu Ala

35 40 45

Tyr Gln Ile Gly Glu Pro Gly His Pro Val Arg Ala Tyr Leu Asp Gly

50 55 60

Ala Ala Leu Ile Ala Leu Ala Gly Arg Ile Gly Ala Asp Ala Val Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Ala Glu Phe Ala Ala Asp Cys

85 90 95

Ala Ala Ala Gly Leu Thr Phe Val Gly Pro Pro Pro Ser Val Leu Arg

100 105 110

Leu Thr Gly Asp Lys Thr Glu Ala Arg Arg Leu Ala Arg Asp Ala Gly

115 120 125

Leu Arg Val Leu Glu Ala Ser Asp Ile Leu Ala Asp Pro Ala Asp Ala

130 135 140

Ala Ala Ala Ala Glu Gln Leu Gly Tyr Pro Val Phe Val Lys Ala Ala

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Leu Arg Arg Val Glu Ser Pro Ala Asp

165 170 175

Leu Ala Gly Ala Val Glu Thr Ala Ile Arg Glu Ala His Gly Ala Phe

180 185 190

Gly Asp Glu Arg Val Phe Leu Glu His Ala Val Thr Arg Pro Arg His

195 200 205

Ile Glu Val Gln Val Leu Ala Asp Asp Arg Gly Glu Val Ile His Leu

210 215 220

Trp Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Met Glu

225 230 235 240

Ile Ala Pro Ala Pro Lys Leu Asp Pro Ala Leu Arg Glu Ala Ile Cys

245 250 255

Ala Asp Gly Val Arg Phe Ala Arg Ala Ala Gly Tyr Arg Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Leu Asp Arg Asn Gly Thr His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr

290 295 300

Asp Val Asp Leu Val Gln Ala Gln Leu Arg Ile Ala Ser Gly Glu Thr

305 310 315 320

Leu Ala Asp Leu Gly Leu Thr Gln Asp Thr Val Ala Thr Arg Gly Ser

325 330 335

Ala Ile Gln Cys Arg Ile Thr Thr Glu Asp Ser Ala Asn Gly Phe Arg

340 345 350

Pro Asp Thr Gly Arg Ile Thr Ala Tyr Arg Ser Ala Ala Gly Ala Gly

355 360 365

Ile Arg Leu Asp Ala Gly Ser Ala Phe Val Gly Ser Gln Val Ser Pro

370 375 380

Phe Phe Asp Ser Leu Leu Val Lys Leu Thr Ala Arg Gly Arg Asp Phe

385 390 395 400

Pro Ala Ala Ala Gln Arg Ala Arg Arg Ala Leu Ala Glu Phe Arg Val

405 410 415

Arg Gly Val Ala Thr Asn Met Ala Phe Leu Gly Ala Val Leu Ala Asp

420 425 430

Pro Asp Phe Leu Ala Gly Asn Leu Ser Thr Thr Phe Ile Asp Glu Arg

435 440 445

Pro His Leu Val Ser Thr Ser Ala Gly Arg Asp Arg Gly Thr Arg Leu

450 455 460

Leu Glu Tyr Leu Ala Asn Val Thr Val Asn Arg Pro His Gly Thr Val

465 470 475 480

Arg Asp Val Val Asp Ala Arg Asp Lys Leu Pro Pro Leu Pro Ala Glu

485 490 495

Pro Thr Val Arg Ala Thr Ala Ser Glu Pro Ala Gly Gly Asp Leu Met

500 505 510

Val Pro Pro Gly Ala Arg Gln Gln Leu Gln Ala Leu Gly Pro Glu Arg

515 520 525

Phe Ala Arg Trp Leu Arg Glu Arg Asp Ala Val Gly Leu Thr Asp Thr

530 535 540

Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Met Arg Thr

545 550 555 560

Phe Asp Met Leu Ala Val Ala Pro His Ile Ala Ala Gly Leu Pro Glu

565 570 575

Leu Phe Ser Leu Glu Met Trp Gly Gly Ala Thr Tyr Asp Val Ala Leu

580 585 590

Arg Phe Leu His Glu Asp Pro Trp Gln Arg Leu Ala Ala Met Arg Glu

595 600 605

Ala Val Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Gln Asn Ala

610 615 620

Val Gly Tyr Ser Ala Tyr Pro Ser Asp Val Val Arg Ala Phe Val Ala

625 630 635 640

Glu Ala Ser Leu Thr Gly Ile Asp Ile Phe Arg Ile Phe Asp Ala Leu

645 650 655

Asn Asn Val Thr Ala Met Arg Ala Ala Ile Ser Ala Thr Leu Glu Ala

660 665 670

Gly Ala Val Ala Glu Gly Ala Ile Cys Tyr Thr Gly Asp Leu His Asp

675 680 685

Pro Ala Glu Arg Val Tyr Thr Leu Asp Tyr Tyr Leu Gly Val Ala Glu

690 695 700

Glu Leu Val Glu Ala Gly Val His Ile Leu Cys Ile Lys Asp Met Ala

705 710 715 720

Gly Leu Leu Arg Pro Pro Ala Ala Arg Thr Leu Ile Ala Ala Leu Arg

725 730 735

Glu Arg Phe Asp Gln Pro Val His Leu His Thr His Asp Thr Ala Gly

740 745 750

Gly Gln Leu Gly Thr Val Val Ala Ala Ile Asp Ala Gly Val Asp Ala

755 760 765

Val Asp Gly Ala Ala Ala Pro Leu Ser Gly Met Thr Ser Gln Pro Asn

770 775 780

Leu Ala Ala Ile Val Ala Ala Thr Asp His Thr Pro Arg Ala Thr Gly

785 790 795 800

Val Ser Leu Asp Thr Leu Thr Ala Leu Glu Pro Tyr Trp Glu Ala Val

805 810 815

Arg Asn Gln Tyr Ala Pro Leu Glu Ala Gly Leu Arg Ser Pro Thr Gly

820 825 830

Ala Val Tyr Asp His Glu Ile Pro Gly Gly Gln Leu Ser Asn Leu Arg

835 840 845

Gln Gln Ala Ile Ala Leu Gly Leu Gly Asp Arg Phe Glu His Leu Thr

850 855 860

Arg Leu Tyr Arg Gln Cys Asn Asp Leu Leu Gly Asn Met Ile Lys Val

865 870 875 880

Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Leu Tyr Leu Asp Ser

885 890 895

Ala Gly Ile Thr Pro Glu Gln Leu Val Ala Asp Pro Ala Arg Tyr Asp

900 905 910

Leu Pro Asp Ser Val Ile Gly Tyr Leu His Gly Glu Leu Gly Thr Pro

915 920 925

Pro Gly Gly Trp Pro Glu Pro Leu Arg Gly Lys Val Val Ala Ala Arg

930 935 940

Pro Asn Arg Pro Pro Pro Pro Thr Leu Thr Asp His Gln Arg Thr Ala

945 950 955 960

Leu Arg Glu Arg Ser Gly Arg Tyr Arg Gln Glu Leu Leu Asn Glu Leu

965 970 975

Leu Phe Pro Gly Pro Ala Ala Ala Arg Ala Glu Met Arg Gln Arg Phe

980 985 990

Gly Asp Val Ser Ile Leu Pro Thr Trp Ala Phe Leu Tyr Gly Leu Thr

995 1000 1005

Pro Arg Arg Glu Leu His Val Asp Leu Ala Pro Gly Val Arg Leu

1010 1015 1020

Phe Ile His Leu Glu Ala Val Ser Glu Pro Asp Glu Gln Gly Ile

1025 1030 1035

Arg Thr Val Leu Cys Thr Leu Asn Gly Gln Val Arg Pro Val Asp

1040 1045 1050

Thr Arg Asp Arg Ser Ile Glu Ala Ala Thr Glu Pro Ala Glu Arg

1055 1060 1065

Ala Asp Leu Asn Asp Pro Ser His Val Ala Ala Pro Leu Thr Gly

1070 1075 1080

Val Val Thr Ile Val Val Ser Pro Gly Glu His Val Ala Ala Gly

1085 1090 1095

Ala Lys Leu Gly Ser Ile Glu Ala Met Lys Met Glu Ser Asn Ile

1100 1105 1110

Ser Ala Pro His Ala Gly Thr Val Ser Arg Val Leu Thr Gly Ser

1115 1120 1125

Gly Thr Ala Val Glu Pro Gly Asp Leu Leu Leu Val Leu Asp Pro

1130 1135 1140

Asp Asp

1145

<210> 272

<211> 3384

<212> DNA

<213> Unknown

<220>

<223> pyc_16 sequence from an unknown bacterial species from an environmental sample

<400> 272

atgatcgaga aggtgctggt cgccaatcgc ggcgagatcg cgacccgcgc cttccgagcg 60

gcgaatgagc ttcggatccg cagcgtggcg ttgtacgcgc cggaggatcg cgactcggtc 120

catcgcgtaa aggccgacga ggcgtacgag atcggtgcgc cgggtcatcc ggtcagcacc 180

tacctggacc ctgacatcgc ggtcgcgctc gcgctgcggg tcggcgccga cgcgatctac 240

ccgggctacg gcttcatgtc cgaaaacccg gagctcgctc gagcctgcgt cgctgccgga 300

ttggtgttcg tcgggccgcc accggaggtg ctcggtctcg ccggcgacaa gacgcgcgcg 360

cgaacggcgg cgcgcgaggc gggcgtcccg gtgctcgacg cttcagagcc ggtcgagaac 420

gccgaagctg cgctggcggc agccgagaag atcggcttcc cggtgttcgt gaaggcgtcg 480

cacggcggcg gcgggcgcgg catgcgcctc gtgaccgatc cggcgcgcct cgcggcgtcg 540

ctggaggagg cgcgcaacga ggcggaggcg gcgttcggcg acggcacggt ctacctcgag 600

caggcgctcg tgcgcccgcg ccacatcgag gttcagctgc tggccgacgc gaccggcgac 660

gtcgtgcatc tctacgagcg cgactgctca ttgcagcgcc ggcatcagaa ggtgatcgag 720

atcacaccgg caccgaacct cgagccggag ctgcgcgacc gcatctgcgc cgacgccgtc 780

cgcttcgccc gccacgtggg gtacgtcaac gcgggcacgg tcgagttcct gctcgacgag 840

gccaacgggc gctacgcgtt catcgagatg aaccctcgca ttcaggtcga gcacacggtc 900

accgaggaga cgaccgacat cgacctcgtg cgcgcacaac tgcagatagc cggcggcgag 960

acgctcgccg gactcggcgt gcgccaggac gacatccgcc agcgcggctt cgcgctgcag 1020

tgccgggtga cgacggagga ccccgccaac gggttccgcc ccgactccgg ccgcatcacc 1080

gcgtaccgat cccccggagg ggcgggcgtg cggctcgacg agggctcagc cttcgtcggc 1140

gccgaggtct cgccgttctt cgacccgctg ctggtgaaga tctccgcgcg cgggcgtgat 1200

ctgcacagcg cggtctcacg cgcgcggcgc gccgtcgccg agctgcgagt acgcggtgtc 1260

aagaccaacc agggcttcct gctcgcgctg ctcaacgacc ccgacgtcct cgctgggcgc 1320

acgcacacca cgttcgtcga cgagcgtccc gacctctcga ccgccggccc cggcggcgac 1380

cgcgccagcc gactgctcaa acgcctcgcc gaggtcacgg tcaaccacga gcctgccagc 1440

tccgccctcg ccggcgatcc gcgcgcgaag ctcccagcgc ccccgacggg cgcgccgccc 1500

gccgggtcgc gccagaaact gctcgacctc ggcccgtcca cgttcgccgc ggcgctgcgc 1560

ggacagcagg cgatcgcgct caccgacacc acgctccgtg acgcccacca gtcactgttc 1620

gccacgcgta tgcgcacgcg cgacatgctc cccgtagcac cgcacctcgc gcacgaactg 1680

ccgcagctgc tgtcgcttga ggtgtggggc ggcgcaacct tcgatgtcgc gctgcgcttc 1740

ctgcacgagg acccgtggga ccggctcgtg cagctacgcg aactggtccc caacgtgtgc 1800

ctgcagatgc tcctgcgcgg ccagaacctg ctcgcctact cccgctttcc caccagggtg 1860

gtgcgtgcat tcgtcgccga ggcggtcgag gccggcatcg acgtcttccg catcttcgac 1920

gcgctcaacg acatcgaagg catgcgctcc gcgatcgagg caacgctcga gacgcccgcg 1980

ctagccgaag gaaccctgtg ttacacgggc gacctgagcg acccgcgcga gcggctctac 2040

accctcgact attacctgcg cctcgcccag cagctggtcg acgccggtgt acacatgctc 2100

gccatcaagg acatggccgg gctgctgagg gcacccgccg cacacacgct cgtgaccgcg 2160

ctgcaccgcg agttcgaact gccggtgcac ctgcacacac acgacaccgc cggcgggcaa 2220

ctcgccacct acctcgccgc catcgaggcc ggcgtcgacg ccgtcgatgg cgccgccgcg 2280

ccgatggcgg gcatgaccag ccagccctcc ctggcggcga tcgtcgccgc caccgcgacg 2340

accgagcgcg actcgggcat cgcgctcgac gcgctcctgg accaagagcc ctactgggag 2400

tcggtgcgca cgctctacgg cgcgttcgag accggcctga aggcgccgac tggtcgcgtc 2460

taccgccacg agatccccgg tggccagctc tccaacctgc gccaacaagc ggacgcggtc 2520

ggcctcacgg gccgcttcga cgagatcgaa cgcgcctacg agcgagccaa ccgactgctc 2580

ggcaacgtgg tcaaggtcac gccctcgagc aaggtcgtcg gcgacctcgc cctgtttgcg 2640

gtctcagccg gcatcgactt cgacgagctc gaacgccgac ccggctcctt cgacctcccc 2700

gactccgtca tcgacttcct gcgcggcggg atcggcaccc cacccggcgg cttcccacaa 2760

cccttcaccg acctggcact cgccggtcgc cccgcgccgc cggcacccac ggagctcgac 2820

cccgagctcg ccgaccggct acagcaaccc ggcgcacctc gtcgcggggc gctcgccgag 2880

atcctcttcc ccgggccggc gtccgacttc gccgccgccc gcgccacgtt cggcgacgtc 2940

tcgctgatcc ccacgcccgc gttcttccgc ggcctgcacg aggacgaaga actggcgatc 3000

gacctcgcac ccggcgtacg cctgctcttc gaactcgaag ccatcggcga acccgacaag 3060

cgcggcatgc ggaccgtcct ggtacgcgtc aacggccagc tgcgccccgt cgaagtgcgc 3120

gaccactccg tcaagaccac cggtgtgcag atcgaacgcg cggaccccaa acgcccaggc 3180

cacgtcccgg cgccagtgac cgggatcgtg tccctgctcg tcgccgcggg cgacaccgtg 3240

tccgagggcg acccgatcgc aacgctcgaa gccatgaaga tggagtccac gatctccgcg 3300

ccgctcgccg gccgcgtgca acgcctcgcc gtcaccacgg gtgcgcgcct ggaacagggg 3360

gacctcctgc tcgtcatcga ctag 3384

<210> 273

<211> 1127

<212> PRT

<213> Unknown

<220>

<223> pyc_16 sequence from an unknown bacterial species from an environmental sample

<400> 273

Met Ile Glu Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Thr Arg

1 5 10 15

Ala Phe Arg Ala Ala Asn Glu Leu Arg Ile Arg Ser Val Ala Leu Tyr

20 25 30

Ala Pro Glu Asp Arg Asp Ser Val His Arg Val Lys Ala Asp Glu Ala

35 40 45

Tyr Glu Ile Gly Ala Pro Gly His Pro Val Ser Thr Tyr Leu Asp Pro

50 55 60

Asp Ile Ala Val Ala Leu Ala Leu Arg Val Gly Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Met Ser Glu Asn Pro Glu Leu Ala Arg Ala Cys

85 90 95

Val Ala Ala Gly Leu Val Phe Val Gly Pro Pro Pro Glu Val Leu Gly

100 105 110

Leu Ala Gly Asp Lys Thr Arg Ala Arg Thr Ala Ala Arg Glu Ala Gly

115 120 125

Val Pro Val Leu Asp Ala Ser Glu Pro Val Glu Asn Ala Glu Ala Ala

130 135 140

Leu Ala Ala Ala Glu Lys Ile Gly Phe Pro Val Phe Val Lys Ala Ser

145 150 155 160

His Gly Gly Gly Gly Arg Gly Met Arg Leu Val Thr Asp Pro Ala Arg

165 170 175

Leu Ala Ala Ser Leu Glu Glu Ala Arg Asn Glu Ala Glu Ala Ala Phe

180 185 190

Gly Asp Gly Thr Val Tyr Leu Glu Gln Ala Leu Val Arg Pro Arg His

195 200 205

Ile Glu Val Gln Leu Leu Ala Asp Ala Thr Gly Asp Val Val His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Leu Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Ile Thr Pro Ala Pro Asn Leu Glu Pro Glu Leu Arg Asp Arg Ile Cys

245 250 255

Ala Asp Ala Val Arg Phe Ala Arg His Val Gly Tyr Val Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Leu Asp Glu Ala Asn Gly Arg Tyr Ala Phe Ile

275 280 285

Glu Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Thr

290 295 300

Thr Asp Ile Asp Leu Val Arg Ala Gln Leu Gln Ile Ala Gly Gly Glu

305 310 315 320

Thr Leu Ala Gly Leu Gly Val Arg Gln Asp Asp Ile Arg Gln Arg Gly

325 330 335

Phe Ala Leu Gln Cys Arg Val Thr Thr Glu Asp Pro Ala Asn Gly Phe

340 345 350

Arg Pro Asp Ser Gly Arg Ile Thr Ala Tyr Arg Ser Pro Gly Gly Ala

355 360 365

Gly Val Arg Leu Asp Glu Gly Ser Ala Phe Val Gly Ala Glu Val Ser

370 375 380

Pro Phe Phe Asp Pro Leu Leu Val Lys Ile Ser Ala Arg Gly Arg Asp

385 390 395 400

Leu His Ser Ala Val Ser Arg Ala Arg Arg Ala Val Ala Glu Leu Arg

405 410 415

Val Arg Gly Val Lys Thr Asn Gln Gly Phe Leu Leu Ala Leu Leu Asn

420 425 430

Asp Pro Asp Val Leu Ala Gly Arg Thr His Thr Thr Phe Val Asp Glu

435 440 445

Arg Pro Asp Leu Ser Thr Ala Gly Pro Gly Gly Asp Arg Ala Ser Arg

450 455 460

Leu Leu Lys Arg Leu Ala Glu Val Thr Val Asn His Glu Pro Ala Ser

465 470 475 480

Ser Ala Leu Ala Gly Asp Pro Arg Ala Lys Leu Pro Ala Pro Pro Thr

485 490 495

Gly Ala Pro Pro Ala Gly Ser Arg Gln Lys Leu Leu Asp Leu Gly Pro

500 505 510

Ser Thr Phe Ala Ala Ala Leu Arg Gly Gln Gln Ala Ile Ala Leu Thr

515 520 525

Asp Thr Thr Leu Arg Asp Ala His Gln Ser Leu Phe Ala Thr Arg Met

530 535 540

Arg Thr Arg Asp Met Leu Pro Val Ala Pro His Leu Ala His Glu Leu

545 550 555 560

Pro Gln Leu Leu Ser Leu Glu Val Trp Gly Gly Ala Thr Phe Asp Val

565 570 575

Ala Leu Arg Phe Leu His Glu Asp Pro Trp Asp Arg Leu Val Gln Leu

580 585 590

Arg Glu Leu Val Pro Asn Val Cys Leu Gln Met Leu Leu Arg Gly Gln

595 600 605

Asn Leu Leu Ala Tyr Ser Arg Phe Pro Thr Arg Val Val Arg Ala Phe

610 615 620

Val Ala Glu Ala Val Glu Ala Gly Ile Asp Val Phe Arg Ile Phe Asp

625 630 635 640

Ala Leu Asn Asp Ile Glu Gly Met Arg Ser Ala Ile Glu Ala Thr Leu

645 650 655

Glu Thr Pro Ala Leu Ala Glu Gly Thr Leu Cys Tyr Thr Gly Asp Leu

660 665 670

Ser Asp Pro Arg Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg Leu

675 680 685

Ala Gln Gln Leu Val Asp Ala Gly Val His Met Leu Ala Ile Lys Asp

690 695 700

Met Ala Gly Leu Leu Arg Ala Pro Ala Ala His Thr Leu Val Thr Ala

705 710 715 720

Leu His Arg Glu Phe Glu Leu Pro Val His Leu His Thr His Asp Thr

725 730 735

Ala Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Ile Glu Ala Gly Val

740 745 750

Asp Ala Val Asp Gly Ala Ala Ala Pro Met Ala Gly Met Thr Ser Gln

755 760 765

Pro Ser Leu Ala Ala Ile Val Ala Ala Thr Ala Thr Thr Glu Arg Asp

770 775 780

Ser Gly Ile Ala Leu Asp Ala Leu Leu Asp Gln Glu Pro Tyr Trp Glu

785 790 795 800

Ser Val Arg Thr Leu Tyr Gly Ala Phe Glu Thr Gly Leu Lys Ala Pro

805 810 815

Thr Gly Arg Val Tyr Arg His Glu Ile Pro Gly Gly Gln Leu Ser Asn

820 825 830

Leu Arg Gln Gln Ala Asp Ala Val Gly Leu Thr Gly Arg Phe Asp Glu

835 840 845

Ile Glu Arg Ala Tyr Glu Arg Ala Asn Arg Leu Leu Gly Asn Val Val

850 855 860

Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu Phe Ala

865 870 875 880

Val Ser Ala Gly Ile Asp Phe Asp Glu Leu Glu Arg Arg Pro Gly Ser

885 890 895

Phe Asp Leu Pro Asp Ser Val Ile Asp Phe Leu Arg Gly Gly Ile Gly

900 905 910

Thr Pro Pro Gly Gly Phe Pro Gln Pro Phe Thr Asp Leu Ala Leu Ala

915 920 925

Gly Arg Pro Ala Pro Pro Ala Pro Thr Glu Leu Asp Pro Glu Leu Ala

930 935 940

Asp Arg Leu Gln Gln Pro Gly Ala Pro Arg Arg Gly Ala Leu Ala Glu

945 950 955 960

Ile Leu Phe Pro Gly Pro Ala Ser Asp Phe Ala Ala Ala Arg Ala Thr

965 970 975

Phe Gly Asp Val Ser Leu Ile Pro Thr Pro Ala Phe Phe Arg Gly Leu

980 985 990

His Glu Asp Glu Glu Leu Ala Ile Asp Leu Ala Pro Gly Val Arg Leu

995 1000 1005

Leu Phe Glu Leu Glu Ala Ile Gly Glu Pro Asp Lys Arg Gly Met

1010 1015 1020

Arg Thr Val Leu Val Arg Val Asn Gly Gln Leu Arg Pro Val Glu

1025 1030 1035

Val Arg Asp His Ser Val Lys Thr Thr Gly Val Gln Ile Glu Arg

1040 1045 1050

Ala Asp Pro Lys Arg Pro Gly His Val Pro Ala Pro Val Thr Gly

1055 1060 1065

Ile Val Ser Leu Leu Val Ala Ala Gly Asp Thr Val Ser Glu Gly

1070 1075 1080

Asp Pro Ile Ala Thr Leu Glu Ala Met Lys Met Glu Ser Thr Ile

1085 1090 1095

Ser Ala Pro Leu Ala Gly Arg Val Gln Arg Leu Ala Val Thr Thr

1100 1105 1110

Gly Ala Arg Leu Glu Gln Gly Asp Leu Leu Leu Val Ile Asp

1115 1120 1125

<210> 274

<211> 3372

<212> DNA

<213> Unknown

<220>

<223> pyc_17 sequence from an unknown bacterial species from an environmental sample

<400> 274

gtgcgcaagg tcctcgtcgc caaccgcagt gagatcgcgg tccgcgtcat gcgcgcggcc 60

cacgagatgg acctgctgac cgtcggcgtc tacacgcccg aggaccgcgg ggcgctgcac 120

cgcaccaagg cgggggaggc ctaccagctc ggcgagcccg gccacccggt ccgcggctac 180

ctcgacgtcg aggcactgct cgaggtcgcc cgccgctcgg gcgccgacgc gctgcacccc 240

ggctacggct tcctgtccga gagcgcggcg ctcgccgacg cctgcgccgc cgcgggcgtc 300

accttcgtcg ggccgcccgc cgacgtgctg cgcctgaccg gcgacaaggt caccgctcgc 360

caggcggcgg tcgccgccgg cctgccggtg ctgcgcgcct cggacccgct gccggacggc 420

gagggcgcga tcgaggcggc ggaggcggtc ggcttcccgc tgttcgtcaa ggcggcagcc 480

ggtggcggtg gccgcggcct gcgcctggtg cagacgccgg aggagctcgc ggacgctgcc 540

cggtcggcgt cgagggaggc ggccgcggca ttcggtgacg ggaccatctt cctcgagcag 600

gccgtcgagc ggccgcgcca catcgaggtg caggtgctcg gcgacacgca cgggtcggtg 660

gtgcacctgt tcgagcgcga ctgctcggtg cagcgccgcc accagaaggt ggtggagctc 720

gcgccggcac ccgacctgcc ggaggccacc cgcacgggtc tgcacgaggc ggcgctggcg 780

ttcgcccgct cggtgggcta cgtcaacgcc ggcacggtcg agttcctcgt cggggccgac 840

ggcgcgttca cgttcatcga gatgaacccg cgcatccagg tcgagcacac ggtcaccgag 900

gaggtcaccg gcgtcgacct cgtcggcgcc cagctgcggg tcgcgcgcgg cgagacgctc 960

gagcagatcg gcatcgtgca ggaccgtctg tcggtgcgcg gctgcgcggt gcagtgccgc 1020

atcaccaccg aggaccccgc caacggcttc cgccccgaca ccggcaccat cgcgacctac 1080

cagtcgccgg gcggcccggg cgtccgcctc gacggcgccg tctacgcagg cgccgaggtc 1140

acgccgtact tcgactcgct gctcgtcaag ctcacgacgc gcggccccga cctgcgcacc 1200

gccgccaacc gcacccgccg ggcgctgcgg gagttccgcg tccgcggcgt caagaccaac 1260

gtcgagttcc tctaccggct catggaggac gaggacttcc tgtccggcgc ggtgccgacg 1320

tcgttcctcg ccgagcaccc ccacctcacc gacgccccgg cggtcaccga ccgcacgacg 1380

cgcatgctcg gcgcgctggc cgacgcgacc gtgaacggcc tgcagcgacc gtcccgggcg 1440

ctgctcgacc ccgtcagcaa gctgccggag ctgccggccg ccccgccggt ccagggctcc 1500

cggcgcctgc tcgacgaggt cgggcccgag cgctgggcgc aggcgctgcg tgagcgcacc 1560

tccctcgcgg tgaccgacac gaccctgcgc gacgcccacc agtcgctgct cgccacccgg 1620

ctgcgcacca ccgacgtcct gggcgccgcg cccacgaccg cgaagctcct gccgggcctg 1680

ctgtccttgg aggcgtgggg cggtgcgacg tacgacgtgg cgctgcggtt cctgcacgag 1740

gacccctggc agcgcctcgc ccagctgcgc gaggccgccc ccgacgtctg cctgcagatg 1800

ctgctgcgcg ggcgcaacgc cgtcggctac acgccctacc ccgaccgcgt cgtgcaggtc 1860

ttcgtcgccg aggcggccgc gacgggcgtc gacgtgttcc gcgtcttcga cgccctgaac 1920

gacctcgagc agatgcgccc ggcgctcgac gccgtccgcg aggcgggcaa ggtcgccgag 1980

gggacgctct gctacacggg cgacctgagc gaccccgggg agcggctcta cacgctcgac 2040

tactacctgc gcctggccga gcagctcgtc ggagccggcg cccacgtgct cgccgtcaag 2100

gacatggccg ggctgctgcg cccgcgcgcc gcggccacgc tcgtgcaggc cctgcgcagc 2160

cgcttcgacc tgcccgtgca cctgcacacc cacgacacga ccggcgggca gctcgccacc 2220

ctgctcgccg cgagcgacgc cggcgtcgac gcggtcgacg gcgccatggc gccgatgtcg 2280

ggcggcacca gccaggtcaa cctgtcggcg ctggtggccg cgaccgacca caccgagcgc 2340

tcgaccgggc tgtcgctggc ggccctgtcg gcgctcgagc cctactggga ggcggtgcgc 2400

gacctctacg cgccgttcga ggcgggactg cgggcgccga ccggcaccgt ctaccgccac 2460

gagatcccgg gcggccagct caccaacctg cgccagcagg cgatcgcgat cggactcggc 2520

gaccgctggg aggacgtcca ggagctgtac gccgtcgcca acgagctgct cggcaagccg 2580

atcaaggtga ccccgacgag caaggtcgtc ggtgacctcg cgatcttcct ggccagcggc 2640

gacgtcgacg tcgagcgcct gcgcgccgac ccgggggcct acgacctgcc ggccagcgtc 2700

ctcggctacc tcgccggcga gctcggcacg ccaccggccg gcttccccga gccgttccgc 2760

acccaggcgg tcgccggccg ggcagaggag ctgccggagg tcgcactcga gccggccgac 2820

gacgaggccc tcgacggacc cgaccgtcgc gcggtcctgt cgcggctgct gttccccggc 2880

ccgtggaagg actacgagac ggcgctcgcg gcgcacggcg acgtctcgat gatcccgacc 2940

gaggccttct tctacggcct cgagcccggc ggcaccgtga ccgtctgcct cgaggccggc 3000

gtggaggtgc tcgtcgagct gcagaccgtc ggggagctgt cggcggacgg gatgcgcacg 3060

ctccacgtcc gggtcaacgg ccagccccgg ccggtgcagg tgcgcgaccg ctcggtgggg 3120

gtggccgaca cggccgcccg gcgcgccgac ccgggcaacc cgcgccacgt cggtgcggcg 3180

ctgcccggcc tcgtgctgcc gaaggtggcg gtcggcgaca cggtcaccaa gggccaggcg 3240

ctcgccgtcg tcgaggcgat gaagatggag tcgaccgtct cgagccccgc ggacgggacc 3300

gtcgccgagg tggccgtcac cgccggcacc aacgtcgagg tcggtgacct gctggtggtg 3360

ctgggcgact ga 3372

<210> 275

<211> 1123

<212> PRT

<213> Unknown

<220>

<223> pyc_17 sequence from an unknown bacterial species from an environmental sample

<400> 275

Val Arg Lys Val Leu Val Ala Asn Arg Ser Glu Ile Ala Val Arg Val

1 5 10 15

Met Arg Ala Ala His Glu Met Asp Leu Leu Thr Val Gly Val Tyr Thr

20 25 30

Pro Glu Asp Arg Gly Ala Leu His Arg Thr Lys Ala Gly Glu Ala Tyr

35 40 45

Gln Leu Gly Glu Pro Gly His Pro Val Arg Gly Tyr Leu Asp Val Glu

50 55 60

Ala Leu Leu Glu Val Ala Arg Arg Ser Gly Ala Asp Ala Leu His Pro

65 70 75 80

Gly Tyr Gly Phe Leu Ser Glu Ser Ala Ala Leu Ala Asp Ala Cys Ala

85 90 95

Ala Ala Gly Val Thr Phe Val Gly Pro Pro Ala Asp Val Leu Arg Leu

100 105 110

Thr Gly Asp Lys Val Thr Ala Arg Gln Ala Ala Val Ala Ala Gly Leu

115 120 125

Pro Val Leu Arg Ala Ser Asp Pro Leu Pro Asp Gly Glu Gly Ala Ile

130 135 140

Glu Ala Ala Glu Ala Val Gly Phe Pro Leu Phe Val Lys Ala Ala Ala

145 150 155 160

Gly Gly Gly Gly Arg Gly Leu Arg Leu Val Gln Thr Pro Glu Glu Leu

165 170 175

Ala Asp Ala Ala Arg Ser Ala Ser Arg Glu Ala Ala Ala Ala Phe Gly

180 185 190

Asp Gly Thr Ile Phe Leu Glu Gln Ala Val Glu Arg Pro Arg His Ile

195 200 205

Glu Val Gln Val Leu Gly Asp Thr His Gly Ser Val Val His Leu Phe

210 215 220

Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu Leu

225 230 235 240

Ala Pro Ala Pro Asp Leu Pro Glu Ala Thr Arg Thr Gly Leu His Glu

245 250 255

Ala Ala Leu Ala Phe Ala Arg Ser Val Gly Tyr Val Asn Ala Gly Thr

260 265 270

Val Glu Phe Leu Val Gly Ala Asp Gly Ala Phe Thr Phe Ile Glu Met

275 280 285

Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr Gly

290 295 300

Val Asp Leu Val Gly Ala Gln Leu Arg Val Ala Arg Gly Glu Thr Leu

305 310 315 320

Glu Gln Ile Gly Ile Val Gln Asp Arg Leu Ser Val Arg Gly Cys Ala

325 330 335

Val Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala Asn Gly Phe Arg Pro

340 345 350

Asp Thr Gly Thr Ile Ala Thr Tyr Gln Ser Pro Gly Gly Pro Gly Val

355 360 365

Arg Leu Asp Gly Ala Val Tyr Ala Gly Ala Glu Val Thr Pro Tyr Phe

370 375 380

Asp Ser Leu Leu Val Lys Leu Thr Thr Arg Gly Pro Asp Leu Arg Thr

385 390 395 400

Ala Ala Asn Arg Thr Arg Arg Ala Leu Arg Glu Phe Arg Val Arg Gly

405 410 415

Val Lys Thr Asn Val Glu Phe Leu Tyr Arg Leu Met Glu Asp Glu Asp

420 425 430

Phe Leu Ser Gly Ala Val Pro Thr Ser Phe Leu Ala Glu His Pro His

435 440 445

Leu Thr Asp Ala Pro Ala Val Thr Asp Arg Thr Thr Arg Met Leu Gly

450 455 460

Ala Leu Ala Asp Ala Thr Val Asn Gly Leu Gln Arg Pro Ser Arg Ala

465 470 475 480

Leu Leu Asp Pro Val Ser Lys Leu Pro Glu Leu Pro Ala Ala Pro Pro

485 490 495

Val Gln Gly Ser Arg Arg Leu Leu Asp Glu Val Gly Pro Glu Arg Trp

500 505 510

Ala Gln Ala Leu Arg Glu Arg Thr Ser Leu Ala Val Thr Asp Thr Thr

515 520 525

Leu Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Leu Arg Thr Thr

530 535 540

Asp Val Leu Gly Ala Ala Pro Thr Thr Ala Lys Leu Leu Pro Gly Leu

545 550 555 560

Leu Ser Leu Glu Ala Trp Gly Gly Ala Thr Tyr Asp Val Ala Leu Arg

565 570 575

Phe Leu His Glu Asp Pro Trp Gln Arg Leu Ala Gln Leu Arg Glu Ala

580 585 590

Ala Pro Asp Val Cys Leu Gln Met Leu Leu Arg Gly Arg Asn Ala Val

595 600 605

Gly Tyr Thr Pro Tyr Pro Asp Arg Val Val Gln Val Phe Val Ala Glu

610 615 620

Ala Ala Ala Thr Gly Val Asp Val Phe Arg Val Phe Asp Ala Leu Asn

625 630 635 640

Asp Leu Glu Gln Met Arg Pro Ala Leu Asp Ala Val Arg Glu Ala Gly

645 650 655

Lys Val Ala Glu Gly Thr Leu Cys Tyr Thr Gly Asp Leu Ser Asp Pro

660 665 670

Gly Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg Leu Ala Glu Gln

675 680 685

Leu Val Gly Ala Gly Ala His Val Leu Ala Val Lys Asp Met Ala Gly

690 695 700

Leu Leu Arg Pro Arg Ala Ala Ala Thr Leu Val Gln Ala Leu Arg Ser

705 710 715 720

Arg Phe Asp Leu Pro Val His Leu His Thr His Asp Thr Thr Gly Gly

725 730 735

Gln Leu Ala Thr Leu Leu Ala Ala Ser Asp Ala Gly Val Asp Ala Val

740 745 750

Asp Gly Ala Met Ala Pro Met Ser Gly Gly Thr Ser Gln Val Asn Leu

755 760 765

Ser Ala Leu Val Ala Ala Thr Asp His Thr Glu Arg Ser Thr Gly Leu

770 775 780

Ser Leu Ala Ala Leu Ser Ala Leu Glu Pro Tyr Trp Glu Ala Val Arg

785 790 795 800

Asp Leu Tyr Ala Pro Phe Glu Ala Gly Leu Arg Ala Pro Thr Gly Thr

805 810 815

Val Tyr Arg His Glu Ile Pro Gly Gly Gln Leu Thr Asn Leu Arg Gln

820 825 830

Gln Ala Ile Ala Ile Gly Leu Gly Asp Arg Trp Glu Asp Val Gln Glu

835 840 845

Leu Tyr Ala Val Ala Asn Glu Leu Leu Gly Lys Pro Ile Lys Val Thr

850 855 860

Pro Thr Ser Lys Val Val Gly Asp Leu Ala Ile Phe Leu Ala Ser Gly

865 870 875 880

Asp Val Asp Val Glu Arg Leu Arg Ala Asp Pro Gly Ala Tyr Asp Leu

885 890 895

Pro Ala Ser Val Leu Gly Tyr Leu Ala Gly Glu Leu Gly Thr Pro Pro

900 905 910

Ala Gly Phe Pro Glu Pro Phe Arg Thr Gln Ala Val Ala Gly Arg Ala

915 920 925

Glu Glu Leu Pro Glu Val Ala Leu Glu Pro Ala Asp Asp Glu Ala Leu

930 935 940

Asp Gly Pro Asp Arg Arg Ala Val Leu Ser Arg Leu Leu Phe Pro Gly

945 950 955 960

Pro Trp Lys Asp Tyr Glu Thr Ala Leu Ala Ala His Gly Asp Val Ser

965 970 975

Met Ile Pro Thr Glu Ala Phe Phe Tyr Gly Leu Glu Pro Gly Gly Thr

980 985 990

Val Thr Val Cys Leu Glu Ala Gly Val Glu Val Leu Val Glu Leu Gln

995 1000 1005

Thr Val Gly Glu Leu Ser Ala Asp Gly Met Arg Thr Leu His Val

1010 1015 1020

Arg Val Asn Gly Gln Pro Arg Pro Val Gln Val Arg Asp Arg Ser

1025 1030 1035

Val Gly Val Ala Asp Thr Ala Ala Arg Arg Ala Asp Pro Gly Asn

1040 1045 1050

Pro Arg His Val Gly Ala Ala Leu Pro Gly Leu Val Leu Pro Lys

1055 1060 1065

Val Ala Val Gly Asp Thr Val Thr Lys Gly Gln Ala Leu Ala Val

1070 1075 1080

Val Glu Ala Met Lys Met Glu Ser Thr Val Ser Ser Pro Ala Asp

1085 1090 1095

Gly Thr Val Ala Glu Val Ala Val Thr Ala Gly Thr Asn Val Glu

1100 1105 1110

Val Gly Asp Leu Leu Val Val Leu Gly Asp

1115 1120

<210> 276

<211> 3399

<212> DNA

<213> Unknown

<220>

<223> pyc_18 sequence from an unknown bacterial species from an environmental sample

<400> 276

atgcgaaagc tcctggtcgc aaaccgcggc gagatcgcca cccgtgcgtt ccgcgccgcc 60

tacgaactgg gcctccgcag cgtcgcgatc tacacgccgg aggatcgcga gtccgcccac 120

cgcgtgaagg ccgacgaggc ctacgagatc ggagagccgg gacaccccgt ccgcggctac 180

ctcgaccccg agctcatcgc cgcgaccgcg aagtcggtgg gcgccgacgc cgtctatccc 240

ggctacgggt tcctctccga gaacccggat ctcgcgaccg cgtgcaccga gcgcgacatc 300

acgttcgtcg gcccgcccgc cgaggtcctg gagcgggtgg gcgacaaggt ccgcgcccgc 360

acggcggcga tcgaagccgg cctgcccgtg ctgagcgcga ccgacctgct cgacgaggac 420

gccgacgtgg cggcgctcgc cgaggagctc ggcatgccgg tgttcgtgaa ggccgcgcac 480

ggcggcggcg ggcgcggcat gcggctcgtc accgacctgg ccgatctccc cgaggccgtc 540

gccgccgcgc gccgcgaggc cgagagcgcg ttcggcaacc ccgccgtcta cctcgagcag 600

gcgatggtgc gcccgcggca catcgaggtg caggtgctcg cggacggcca cggcgggctc 660

gttcacctgt acgagcgcga ctgctcggtg caacgccgcc atcagaaggt cgtcgagctt 720

gcgcccgcgc cgaacctcga ccccgagctg cgcgaccgga tctgcgcgga cgccgtgcgc 780

ttcgccggcc acgtgggcta cgtcaacgcg ggcaccgtcg agttcctcgt cgacacggag 840

cgcgaccggc acgtcttcat cgagatgaac ccgcgcatcc aggtcgagca cacggtgacg 900

gaggagacga ccgacgtcga cctggtccgc acacagctgc tcgtcgccga gggcgcgcgg 960

ctgcacgagc tcggcctccg ccaggaggac atccgccagc gcggcttcgc gctgcagtgc 1020

cgcatcacga ccgaggaccc gagcgccggc ttccggcccg acaccggcac gatcgccgcg 1080

taccgcgcgc caggcggcgc cggggtgcgc ctcgacgagg ggtccgcgta cgtcggcgcg 1140

gagatctcgc cgtatttcga cccgctgctg ctgaagctca cgacgcgcgg gccggacatg 1200

cagaccgcca tcgcacgcgc gcgccgggcc gtcgaggagg tccgcatccg cggcgtcacg 1260

acgaaccagg cgttcctcgg caagctgctc gacgacccgg acttccgctc cgggcggctg 1320

cacacgacgt tcatcgacga gcgcccgcag ctcaccgccg tcgccccggg cggcgaccgc 1380

gcgacgcgca tcctgcgcct gctcgccgag cggacggtga accggcccta cggccacgcg 1440

cccggcggcc ccgatccccg ctccaagctg ccccgcgtgc ccaagggcga ggcgccggcc 1500

ggctcgcgcc agcggctcca ggagctgggg cccgagggct tcgctcgctg gctgcgttcg 1560

cgcgacgcgc tgcagctcac cgacaccacg ctgcgcgacg cgcaccagtc cctgttcgcg 1620

acgcgcatgc gcacgcacga catggaggcg gtcgcgccgc acatggcacg gctgctcggc 1680

gggctgttct cgctggaggc gtggggcggc gcgacgttcg acgtcgcgct gcgcttcctc 1740

aacgaggacc cctgggagcg catcggccgg ctgcgcgacc tcatcccgaa cgtctgcctg 1800

cagatgctgc tgcggggccg caacctgctc ggctacgagc cctatccgga cgaggcggtc 1860

agggcgttcg tcttcgaggc cgtcgacgcg ggggtcgaca tcttccgcat cttcgacgcg 1920

ctgaacaacg tcgagccgat gcgtgcggcg atcagcgcga cggtcgaggc gggggcggtc 1980

gcggagggtg ccatctgcta caccggcgac ctcttggacc ccggcgagcg gctctacacg 2040

ctcgaccact acctgcacgt cgccgagcag ctcgcggagg cgggcgtgca catcctcgcc 2100

atcaaggaca tggcggggct gctccgcgcg ccggccgcgg cccagctcgt cacccgcctg 2160

cggcgcgagt tcgacctccc cgtgcacctg cacacgcacg acaccgccgg cgggcagctc 2220

gcgacgtacc tggcggcgat cgaggcgggc gtcgacgccg tcgacggcgc tgccgcgtcg 2280

atggccggca tgacgagcca gccctcgctg gccagcatcg tggccgccac cgaccacacg 2340

gcgcgcgcga cgggcatcgc cctggagtcg ctgctggagc tcgagccgta ctgggaggcg 2400

gtgcgcacga cgtacgcgcc gttcgagagc ggcctgcgcg cgccgaccgg ccgcgtgtac 2460

cgccatcaga tccccggcgg ccagctgtcg aacctgcacc agcaggccgg ggcgctgggc 2520

ctgggcgacc ggttcgagga ggtcgagctc gcgtacgagc gcgccaacgc gctgctcggc 2580

gacatcatca aggtcacgcc gacgagcaag gtcgtcggcg acctcgcgct gttcgtcgtc 2640

tcggccggca tcgactggga cgagctggcc gcccagccgg agcgcttcga cctcccggcg 2700

tccgtcatcc agctgctgcg gggcgacctc ggcgagccgg ccggtggctt cccgcagccc 2760

ttcaccgagc gcgcgctgcg cggcgctgcg cgacacagcc aggacggctc cgccctcgac 2820

ccggagatgc gcacgcgcct ggccgaggcc gggaaggacc gccgcacggc gctcgccgag 2880

ctgcagttcc ccggtcccac ccaggagcgc agggacgcgt acgagcgcta cggggacgtg 2940

acccatgtgc ccacccggcc gtttctctac ggcctgcccg atgacggcga ggtctcgatc 3000

gacctcgggc ccggtgtgcg gctcatctac gcgctggagg cgatcggcga gcccgacgac 3060

cgcggcatgc gcaccgtcat ggtgcgcgtg aacggccagc tgcggccgat cgacgtgcgc 3120

gacgagtcgg tcgaggcgcc gacgtcgaag gtcgagcgcg ccgacccggc caacgacagc 3180

cacgtggcgg cgccgctcac cggcgtcgtg acgctgcgcg tgaaggagcg cgaggaggtc 3240

gccgaggggc agccgatcgc gatcctcgag gcgatgaaga tggagtcgac ggtgacgtcg 3300

ccggcggccg ggacggtcca gcgcgtgccg gtgccgagcg ggacgcgtct cgaacagggc 3360

gatctcattg cggtgatcga gaggtccgag tcggggtaa 3399

<210> 277

<211> 1132

<212> PRT

<213> Unknown

<220>

<223> pyc_18 sequence from an unknown bacterial species from an environmental sample

<400> 277

Met Arg Lys Leu Leu Val Ala Asn Arg Gly Glu Ile Ala Thr Arg Ala

1 5 10 15

Phe Arg Ala Ala Tyr Glu Leu Gly Leu Arg Ser Val Ala Ile Tyr Thr

20 25 30

Pro Glu Asp Arg Glu Ser Ala His Arg Val Lys Ala Asp Glu Ala Tyr

35 40 45

Glu Ile Gly Glu Pro Gly His Pro Val Arg Gly Tyr Leu Asp Pro Glu

50 55 60

Leu Ile Ala Ala Thr Ala Lys Ser Val Gly Ala Asp Ala Val Tyr Pro

65 70 75 80

Gly Tyr Gly Phe Leu Ser Glu Asn Pro Asp Leu Ala Thr Ala Cys Thr

85 90 95

Glu Arg Asp Ile Thr Phe Val Gly Pro Pro Ala Glu Val Leu Glu Arg

100 105 110

Val Gly Asp Lys Val Arg Ala Arg Thr Ala Ala Ile Glu Ala Gly Leu

115 120 125

Pro Val Leu Ser Ala Thr Asp Leu Leu Asp Glu Asp Ala Asp Val Ala

130 135 140

Ala Leu Ala Glu Glu Leu Gly Met Pro Val Phe Val Lys Ala Ala His

145 150 155 160

Gly Gly Gly Gly Arg Gly Met Arg Leu Val Thr Asp Leu Ala Asp Leu

165 170 175

Pro Glu Ala Val Ala Ala Ala Arg Arg Glu Ala Glu Ser Ala Phe Gly

180 185 190

Asn Pro Ala Val Tyr Leu Glu Gln Ala Met Val Arg Pro Arg His Ile

195 200 205

Glu Val Gln Val Leu Ala Asp Gly His Gly Gly Leu Val His Leu Tyr

210 215 220

Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu Leu

225 230 235 240

Ala Pro Ala Pro Asn Leu Asp Pro Glu Leu Arg Asp Arg Ile Cys Ala

245 250 255

Asp Ala Val Arg Phe Ala Gly His Val Gly Tyr Val Asn Ala Gly Thr

260 265 270

Val Glu Phe Leu Val Asp Thr Glu Arg Asp Arg His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Thr Thr

290 295 300

Asp Val Asp Leu Val Arg Thr Gln Leu Leu Val Ala Glu Gly Ala Arg

305 310 315 320

Leu His Glu Leu Gly Leu Arg Gln Glu Asp Ile Arg Gln Arg Gly Phe

325 330 335

Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ser Ala Gly Phe Arg

340 345 350

Pro Asp Thr Gly Thr Ile Ala Ala Tyr Arg Ala Pro Gly Gly Ala Gly

355 360 365

Val Arg Leu Asp Glu Gly Ser Ala Tyr Val Gly Ala Glu Ile Ser Pro

370 375 380

Tyr Phe Asp Pro Leu Leu Leu Lys Leu Thr Thr Arg Gly Pro Asp Met

385 390 395 400

Gln Thr Ala Ile Ala Arg Ala Arg Arg Ala Val Glu Glu Val Arg Ile

405 410 415

Arg Gly Val Thr Thr Asn Gln Ala Phe Leu Gly Lys Leu Leu Asp Asp

420 425 430

Pro Asp Phe Arg Ser Gly Arg Leu His Thr Thr Phe Ile Asp Glu Arg

435 440 445

Pro Gln Leu Thr Ala Val Ala Pro Gly Gly Asp Arg Ala Thr Arg Ile

450 455 460

Leu Arg Leu Leu Ala Glu Arg Thr Val Asn Arg Pro Tyr Gly His Ala

465 470 475 480

Pro Gly Gly Pro Asp Pro Arg Ser Lys Leu Pro Arg Val Pro Lys Gly

485 490 495

Glu Ala Pro Ala Gly Ser Arg Gln Arg Leu Gln Glu Leu Gly Pro Glu

500 505 510

Gly Phe Ala Arg Trp Leu Arg Ser Arg Asp Ala Leu Gln Leu Thr Asp

515 520 525

Thr Thr Leu Arg Asp Ala His Gln Ser Leu Phe Ala Thr Arg Met Arg

530 535 540

Thr His Asp Met Glu Ala Val Ala Pro His Met Ala Arg Leu Leu Gly

545 550 555 560

Gly Leu Phe Ser Leu Glu Ala Trp Gly Gly Ala Thr Phe Asp Val Ala

565 570 575

Leu Arg Phe Leu Asn Glu Asp Pro Trp Glu Arg Ile Gly Arg Leu Arg

580 585 590

Asp Leu Ile Pro Asn Val Cys Leu Gln Met Leu Leu Arg Gly Arg Asn

595 600 605

Leu Leu Gly Tyr Glu Pro Tyr Pro Asp Glu Ala Val Arg Ala Phe Val

610 615 620

Phe Glu Ala Val Asp Ala Gly Val Asp Ile Phe Arg Ile Phe Asp Ala

625 630 635 640

Leu Asn Asn Val Glu Pro Met Arg Ala Ala Ile Ser Ala Thr Val Glu

645 650 655

Ala Gly Ala Val Ala Glu Gly Ala Ile Cys Tyr Thr Gly Asp Leu Leu

660 665 670

Asp Pro Gly Glu Arg Leu Tyr Thr Leu Asp His Tyr Leu His Val Ala

675 680 685

Glu Gln Leu Ala Glu Ala Gly Val His Ile Leu Ala Ile Lys Asp Met

690 695 700

Ala Gly Leu Leu Arg Ala Pro Ala Ala Ala Gln Leu Val Thr Arg Leu

705 710 715 720

Arg Arg Glu Phe Asp Leu Pro Val His Leu His Thr His Asp Thr Ala

725 730 735

Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Ile Glu Ala Gly Val Asp

740 745 750

Ala Val Asp Gly Ala Ala Ala Ser Met Ala Gly Met Thr Ser Gln Pro

755 760 765

Ser Leu Ala Ser Ile Val Ala Ala Thr Asp His Thr Ala Arg Ala Thr

770 775 780

Gly Ile Ala Leu Glu Ser Leu Leu Glu Leu Glu Pro Tyr Trp Glu Ala

785 790 795 800

Val Arg Thr Thr Tyr Ala Pro Phe Glu Ser Gly Leu Arg Ala Pro Thr

805 810 815

Gly Arg Val Tyr Arg His Gln Ile Pro Gly Gly Gln Leu Ser Asn Leu

820 825 830

His Gln Gln Ala Gly Ala Leu Gly Leu Gly Asp Arg Phe Glu Glu Val

835 840 845

Glu Leu Ala Tyr Glu Arg Ala Asn Ala Leu Leu Gly Asp Ile Ile Lys

850 855 860

Val Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Leu Phe Val Val

865 870 875 880

Ser Ala Gly Ile Asp Trp Asp Glu Leu Ala Ala Gln Pro Glu Arg Phe

885 890 895

Asp Leu Pro Ala Ser Val Ile Gln Leu Leu Arg Gly Asp Leu Gly Glu

900 905 910

Pro Ala Gly Gly Phe Pro Gln Pro Phe Thr Glu Arg Ala Leu Arg Gly

915 920 925

Ala Ala Arg His Ser Gln Asp Gly Ser Ala Leu Asp Pro Glu Met Arg

930 935 940

Thr Arg Leu Ala Glu Ala Gly Lys Asp Arg Arg Thr Ala Leu Ala Glu

945 950 955 960

Leu Gln Phe Pro Gly Pro Thr Gln Glu Arg Arg Asp Ala Tyr Glu Arg

965 970 975

Tyr Gly Asp Val Thr His Val Pro Thr Arg Pro Phe Leu Tyr Gly Leu

980 985 990

Pro Asp Asp Gly Glu Val Ser Ile Asp Leu Gly Pro Gly Val Arg Leu

995 1000 1005

Ile Tyr Ala Leu Glu Ala Ile Gly Glu Pro Asp Asp Arg Gly Met

1010 1015 1020

Arg Thr Val Met Val Arg Val Asn Gly Gln Leu Arg Pro Ile Asp

1025 1030 1035

Val Arg Asp Glu Ser Val Glu Ala Pro Thr Ser Lys Val Glu Arg

1040 1045 1050

Ala Asp Pro Ala Asn Asp Ser His Val Ala Ala Pro Leu Thr Gly

1055 1060 1065

Val Val Thr Leu Arg Val Lys Glu Arg Glu Glu Val Ala Glu Gly

1070 1075 1080

Gln Pro Ile Ala Ile Leu Glu Ala Met Lys Met Glu Ser Thr Val

1085 1090 1095

Thr Ser Pro Ala Ala Gly Thr Val Gln Arg Val Pro Val Pro Ser

1100 1105 1110

Gly Thr Arg Leu Glu Gln Gly Asp Leu Ile Ala Val Ile Glu Arg

1115 1120 1125

Ser Glu Ser Gly

1130

<210> 278

<211> 3390

<212> DNA

<213> Unknown

<220>

<223> pyc_19 sequence from an unknown bacterial species from an environmental sample

<400> 278

atgcgcaagc tcctggtcgc aaatcgcggc gagatcgcca ccagggcgtt ccgcgccgct 60

tacgaactgg gcctccgcag cgtcgcgatc tacacaccgg aggatcgcga gtccgcccac 120

cgagtgaagg ccgacgaggc ctacgagatc ggggagccgg gacatcccgt ccgcggctac 180

ctggatcccg agctcatcgc cgagacggcc aagtcggtgg gcgccgacgc catctatccg 240

ggctacgggt tcctctcgga gaacccggac ctggcgaccg cgtgcgcgga gcgcgacatc 300

acgttcgtcg gcccgcccgc cgaggtgctg gagcgggtgg gcgacaaggt ccgcgcgcgc 360

accgcggcca tcgaggccgg gctgcccgtc ctgagcgcga cggacctgct cgacgaggac 420

gccgacgtcg aggcgctcgc cgaggagctc ggcatgcccg tgttcgtgaa ggccgcgcac 480

ggcggcggcg ggcgcggcat gcggctcgtc accgacgtcg ccgacctccc cgaggcggtc 540

gcagcggcgc gccgcgaggc cgagagcgcg ttcgggaatc ccgccgtcta cctcgagcag 600

gcgatggtgc gcccgcggca catcgaggtg caggtgctcg cggacggcca cggcgggctc 660

gttcacctgt acgagcgcga ctgctcggtg cagcgccgcc accagaaggt cgtcgagctg 720

gcacctgcgc cgaacctcga ccccgagctc cgcgatcgga tctgcgcgga cgccgtgcgg 780

ttcgccggcc acgtgggcta cgtcaacgcg ggcaccgtcg agttcctcgt cgacacggag 840

cgcgaccggc acgtcttcat cgagatgaac ccgcgcatcc aggtcgagca cacggtgacg 900

gaggagacga ccgacgtcga cctcgtccgc acgcagctgc tcgtcgccga aggtgcgcgc 960

ctgcacgagc tcggcctccg ccaggaggac atccgccagc gcggcttcgc gctgcagtgc 1020

cgcatcacga ccgaggatcc cagcgccggc tttcggcccg acaccggcac gatcgccgcg 1080

taccgggcgc cgggcggcgc cggagtgcgc ctcgacgagg ggtccgcgta cgtcggcgcg 1140

gagatctcgc cctacttcga cccgctgctg ctgaagctca cgacgcgcgg gccggacatg 1200

cagaccgcca tcgcacgcgc gcggcgcgcc gtcgaggagg ttcgcatccg tggcgtcacg 1260

accaaccagg cgttcctcgg caagctgctc gacgacccgg acttccgctc cgggcggctg 1320

cacacgacgt tcatcgacga gcgcccgcag ctgacggcag tcgcccccgg cggcgaccgc 1380

gcgacgcgga tcctgcgcct gctcgccgag cggacggtga accggcccta cggccacgcg 1440

ccggacggcc ccgatccccg ctccaagctg ccgcgcgcac cgaagggtga gccgcccgcc 1500

ggctcgcgcc agcgcctgca ggagctgggg ccggagggct tcgcgcgctg gctgcgctcg 1560

cgcgacgcgc tgcagctcac cgacaccacg ttgcgcgacg cgcaccagtc gctgttcgcg 1620

acgcgcatgc gcacacacga catggaggcg gtcgcgccgc acctggcacg gctgctcggc 1680

gggctgttct cgctggaggc gtggggcggc gcgacgttcg acgtcgcgtt gcgcttcctc 1740

aacgaggacc cgtgggagcg catcggccgg ctgcgcgacc tcattccgaa cgtctgcctg 1800

cagatgctgc tgcggggccg caacctgctc ggctacgagc cctatccgga cgaggcggtc 1860

agggcgttcg tcttcgaggc cgtcgacgca ggggtcgaca tcttccgcat cttcgacgcg 1920

ctgaacaacg tcgagccgat gcgcgcggcg atcagcgcga cggtcgaggc gggggcggtc 1980

gcggagggcg ccatctgcta caccggcgac ctcctggacc ccggcgagcg gctctacacg 2040

ctcgaccact accttcacgt cgccgagcag ctcgtcgagg cgggcgtgca catcctcgcc 2100

atcaaggaca tggcggggct gctccgagcg ccggccgcgg cccagctcgt cacccgcctg 2160

cggcgcgagt tcgacctccc cgtgcacctg catacgcacg acaccgccgg cggccagctc 2220

gcgacatacc tggcggcgat cgaggcgggc gtggacgccg tcgacggcgc cgccgcgtcg 2280

atggccggca tgaccagcca gccctcgctg gccagtatcg tcgccgccac cgaccacacg 2340

gcccgcgcga ccggcatcgc cctggagtcg ctgctggagc tcgagccgta ctgggaggcg 2400

gtgcgcacca cgtacgcgcc gttcgagagc ggcctgcgcg cgccgaccgg ccgcgtgtac 2460

cgccatcaga tccccggcgg ccagctgtcg aacctgcacc agcaggccgg ggcgttgggc 2520

ctcggcaacc ggttcgagga ggtcgagctc gcgtacgagc gcgccaacgc gctgctcggc 2580

gacatcatca aggtcacgcc gacgagcaag gtcgtcggcg atctcgcgct gttcgtcgtc 2640

tcggccggca tcgactggga cgagctggcc gcccagccgg agcgcttcga cctcccggcg 2700

tcggtcatcc aattgctgcg aggtgacctc ggcgagccgg ccggcggctt tccccagccc 2760

ttcaccgagc gcgcgctgcg cggcgccgcg cgtgacagcc gggacggctc cgcgctcgag 2820

ccggagatgc gcacgcgctt ggccgaggcc ggaaacgacc gccgtaccgc gctcgccgag 2880

ctgcagttcc ccggccctac ccaggagcgc agggacgcgt acgagcgcta cggggacgtc 2940

acgcccgtgc cgacccggcc gttcctctac ggcctgcccg atgacggcga ggtctcgatc 3000

gacctcgggc ctggcgtgcg gctcatctac gcgctggagg cgatcgggga gcccgacgac 3060

cgcggcatgc gcagcgtcct cgtgcgcgtg aacgggcagc tgcggccgat cgacgtgcgc 3120

gacgagtccg tcgaggcgcc gacgtcgaag gtcgagcgcg ccgacccggc caacgacagc 3180

cacgtggcgg ccccgctcac cggtgtcgtg accgtgcgcg tgaaggaggg cgacgaggtc 3240

tccgagggtc agccgctcgc gatgctcgag gcgatgaaga tggagtcgac ggtgacctcg 3300

cccgccgccg gaacggtgca gcgggtgccg gtgccgagcg gcacgcgcct ggagcagggc 3360

gatctcattg cggtgatcga gcagtcctag 3390

<210> 279

<211> 1129

<212> PRT

<213> Unknown

<220>

<223> pyc_19 sequence from an unknown bacterial species from an environmental sample

<400> 279

Met Arg Lys Leu Leu Val Ala Asn Arg Gly Glu Ile Ala Thr Arg Ala

1 5 10 15

Phe Arg Ala Ala Tyr Glu Leu Gly Leu Arg Ser Val Ala Ile Tyr Thr

20 25 30

Pro Glu Asp Arg Glu Ser Ala His Arg Val Lys Ala Asp Glu Ala Tyr

35 40 45

Glu Ile Gly Glu Pro Gly His Pro Val Arg Gly Tyr Leu Asp Pro Glu

50 55 60

Leu Ile Ala Glu Thr Ala Lys Ser Val Gly Ala Asp Ala Ile Tyr Pro

65 70 75 80

Gly Tyr Gly Phe Leu Ser Glu Asn Pro Asp Leu Ala Thr Ala Cys Ala

85 90 95

Glu Arg Asp Ile Thr Phe Val Gly Pro Pro Ala Glu Val Leu Glu Arg

100 105 110

Val Gly Asp Lys Val Arg Ala Arg Thr Ala Ala Ile Glu Ala Gly Leu

115 120 125

Pro Val Leu Ser Ala Thr Asp Leu Leu Asp Glu Asp Ala Asp Val Glu

130 135 140

Ala Leu Ala Glu Glu Leu Gly Met Pro Val Phe Val Lys Ala Ala His

145 150 155 160

Gly Gly Gly Gly Arg Gly Met Arg Leu Val Thr Asp Val Ala Asp Leu

165 170 175

Pro Glu Ala Val Ala Ala Ala Arg Arg Glu Ala Glu Ser Ala Phe Gly

180 185 190

Asn Pro Ala Val Tyr Leu Glu Gln Ala Met Val Arg Pro Arg His Ile

195 200 205

Glu Val Gln Val Leu Ala Asp Gly His Gly Gly Leu Val His Leu Tyr

210 215 220

Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu Leu

225 230 235 240

Ala Pro Ala Pro Asn Leu Asp Pro Glu Leu Arg Asp Arg Ile Cys Ala

245 250 255

Asp Ala Val Arg Phe Ala Gly His Val Gly Tyr Val Asn Ala Gly Thr

260 265 270

Val Glu Phe Leu Val Asp Thr Glu Arg Asp Arg His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Thr Thr

290 295 300

Asp Val Asp Leu Val Arg Thr Gln Leu Leu Val Ala Glu Gly Ala Arg

305 310 315 320

Leu His Glu Leu Gly Leu Arg Gln Glu Asp Ile Arg Gln Arg Gly Phe

325 330 335

Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ser Ala Gly Phe Arg

340 345 350

Pro Asp Thr Gly Thr Ile Ala Ala Tyr Arg Ala Pro Gly Gly Ala Gly

355 360 365

Val Arg Leu Asp Glu Gly Ser Ala Tyr Val Gly Ala Glu Ile Ser Pro

370 375 380

Tyr Phe Asp Pro Leu Leu Leu Lys Leu Thr Thr Arg Gly Pro Asp Met

385 390 395 400

Gln Thr Ala Ile Ala Arg Ala Arg Arg Ala Val Glu Glu Val Arg Ile

405 410 415

Arg Gly Val Thr Thr Asn Gln Ala Phe Leu Gly Lys Leu Leu Asp Asp

420 425 430

Pro Asp Phe Arg Ser Gly Arg Leu His Thr Thr Phe Ile Asp Glu Arg

435 440 445

Pro Gln Leu Thr Ala Val Ala Pro Gly Gly Asp Arg Ala Thr Arg Ile

450 455 460

Leu Arg Leu Leu Ala Glu Arg Thr Val Asn Arg Pro Tyr Gly His Ala

465 470 475 480

Pro Asp Gly Pro Asp Pro Arg Ser Lys Leu Pro Arg Ala Pro Lys Gly

485 490 495

Glu Pro Pro Ala Gly Ser Arg Gln Arg Leu Gln Glu Leu Gly Pro Glu

500 505 510

Gly Phe Ala Arg Trp Leu Arg Ser Arg Asp Ala Leu Gln Leu Thr Asp

515 520 525

Thr Thr Leu Arg Asp Ala His Gln Ser Leu Phe Ala Thr Arg Met Arg

530 535 540

Thr His Asp Met Glu Ala Val Ala Pro His Leu Ala Arg Leu Leu Gly

545 550 555 560

Gly Leu Phe Ser Leu Glu Ala Trp Gly Gly Ala Thr Phe Asp Val Ala

565 570 575

Leu Arg Phe Leu Asn Glu Asp Pro Trp Glu Arg Ile Gly Arg Leu Arg

580 585 590

Asp Leu Ile Pro Asn Val Cys Leu Gln Met Leu Leu Arg Gly Arg Asn

595 600 605

Leu Leu Gly Tyr Glu Pro Tyr Pro Asp Glu Ala Val Arg Ala Phe Val

610 615 620

Phe Glu Ala Val Asp Ala Gly Val Asp Ile Phe Arg Ile Phe Asp Ala

625 630 635 640

Leu Asn Asn Val Glu Pro Met Arg Ala Ala Ile Ser Ala Thr Val Glu

645 650 655

Ala Gly Ala Val Ala Glu Gly Ala Ile Cys Tyr Thr Gly Asp Leu Leu

660 665 670

Asp Pro Gly Glu Arg Leu Tyr Thr Leu Asp His Tyr Leu His Val Ala

675 680 685

Glu Gln Leu Val Glu Ala Gly Val His Ile Leu Ala Ile Lys Asp Met

690 695 700

Ala Gly Leu Leu Arg Ala Pro Ala Ala Ala Gln Leu Val Thr Arg Leu

705 710 715 720

Arg Arg Glu Phe Asp Leu Pro Val His Leu His Thr His Asp Thr Ala

725 730 735

Gly Gly Gln Leu Ala Thr Tyr Leu Ala Ala Ile Glu Ala Gly Val Asp

740 745 750

Ala Val Asp Gly Ala Ala Ala Ser Met Ala Gly Met Thr Ser Gln Pro

755 760 765

Ser Leu Ala Ser Ile Val Ala Ala Thr Asp His Thr Ala Arg Ala Thr

770 775 780

Gly Ile Ala Leu Glu Ser Leu Leu Glu Leu Glu Pro Tyr Trp Glu Ala

785 790 795 800

Val Arg Thr Thr Tyr Ala Pro Phe Glu Ser Gly Leu Arg Ala Pro Thr

805 810 815

Gly Arg Val Tyr Arg His Gln Ile Pro Gly Gly Gln Leu Ser Asn Leu

820 825 830

His Gln Gln Ala Gly Ala Leu Gly Leu Gly Asn Arg Phe Glu Glu Val

835 840 845

Glu Leu Ala Tyr Glu Arg Ala Asn Ala Leu Leu Gly Asp Ile Ile Lys

850 855 860

Val Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Leu Phe Val Val

865 870 875 880

Ser Ala Gly Ile Asp Trp Asp Glu Leu Ala Ala Gln Pro Glu Arg Phe

885 890 895

Asp Leu Pro Ala Ser Val Ile Gln Leu Leu Arg Gly Asp Leu Gly Glu

900 905 910

Pro Ala Gly Gly Phe Pro Gln Pro Phe Thr Glu Arg Ala Leu Arg Gly

915 920 925

Ala Ala Arg Asp Ser Arg Asp Gly Ser Ala Leu Glu Pro Glu Met Arg

930 935 940

Thr Arg Leu Ala Glu Ala Gly Asn Asp Arg Arg Thr Ala Leu Ala Glu

945 950 955 960

Leu Gln Phe Pro Gly Pro Thr Gln Glu Arg Arg Asp Ala Tyr Glu Arg

965 970 975

Tyr Gly Asp Val Thr Pro Val Pro Thr Arg Pro Phe Leu Tyr Gly Leu

980 985 990

Pro Asp Asp Gly Glu Val Ser Ile Asp Leu Gly Pro Gly Val Arg Leu

995 1000 1005

Ile Tyr Ala Leu Glu Ala Ile Gly Glu Pro Asp Asp Arg Gly Met

1010 1015 1020

Arg Ser Val Leu Val Arg Val Asn Gly Gln Leu Arg Pro Ile Asp

1025 1030 1035

Val Arg Asp Glu Ser Val Glu Ala Pro Thr Ser Lys Val Glu Arg

1040 1045 1050

Ala Asp Pro Ala Asn Asp Ser His Val Ala Ala Pro Leu Thr Gly

1055 1060 1065

Val Val Thr Val Arg Val Lys Glu Gly Asp Glu Val Ser Glu Gly

1070 1075 1080

Gln Pro Leu Ala Met Leu Glu Ala Met Lys Met Glu Ser Thr Val

1085 1090 1095

Thr Ser Pro Ala Ala Gly Thr Val Gln Arg Val Pro Val Pro Ser

1100 1105 1110

Gly Thr Arg Leu Glu Gln Gly Asp Leu Ile Ala Val Ile Glu Gln

1115 1120 1125

Ser

<210> 280

<211> 3378

<212> DNA

<213> Unknown

<220>

<223> pyc_20 sequence from an unknown bacterial species from an environmental sample

<400> 280

atgttctcga aggtgctcgt ggccaaccgg ggcgagatcg ccatccgggc gttccgggct 60

gcctacgagc tcggtgctcg cacggtggcg gtcttcccca acgaggacag gtggtccgag 120

caccgcctca aggccgacga ggcctacgag atcggccagc gaggccaccc ggtccgcgcc 180

tacctcgacc cggacgcgat cgtcgcggtc gccgtacgtt cgggtgccga cgcggtctac 240

cccggctacg gcttcctgtc ggagaacccc aggctggccg aggcctgcgc caacgccggc 300

atcaccttcg tcggcccgac ggccgaggtg ctcaccctca ccggcaacaa ggcccgggcg 360

atcgcggcgg cccacgaggc cggcgtaccc acactggcct cggtgccgcc gagccaggac 420

gccgacgagc tggtcgccac ggccggcgag ctgccctacc cgctcttcgt caaggcggtc 480

gccggcggcg gcgggcgcgg catgaggcgg gtcgacgagc ccgcgcagct gcgcgcggcc 540

atcgagacgt gcatgcgcga ggccgagggc gccttcggcg atgcgaccgt gttcgtcgag 600

caggcggtgg tcgacccacg gcacatcgag gtgcagatcc tcgcggacag gcaaggcaac 660

gtcatccacc tcttcgagcg cgactgctca gtgcagcgcc gccaccagaa agtggtcgag 720

atcgcaccgg cccccaactt cgaccccgag ctgcgggagc ggatctgcgc cgacgcggtg 780

aggttcgcgc gccacatcgg ctaccagaac gccggcacgg tcgagttcct cgtcgacccc 840

ggtggcagct acgtcttcat cgagatgaac ccccgcatcc aggtcgagca caccgtgacc 900

gaggaggtca ccgacgtcga cctcgtgcag tcgcagctgc ggatcgcctc gggagagacg 960

ctccaggacc tcggcctgcg ccaagactcg atcgtgctgc gcggcgccgc gctccagtgc 1020

cggatcacga ccgaggaccc tgccaacaac ttccggcccg acaccggccg gatcacgacg 1080

taccgctccc ccggcggcgc cggcatccgc ctcgacggcg gcacgaccta caccggcgcc 1140

gaggtcagcc cgcacttcga ctcgatgctg gccaagctca cctgccgggg ccgcacgttc 1200

gagaaggccg tcgaacgggc ccgccgggcg gtcgcggagt tccggatccg cggggtgtcg 1260

accaacatcg cgttcctgca ggcgctgctc gacgaccctg acttccgtgc cggccgggtg 1320

accacgtcgt tcatcgagac gcaccccgag ctgctcaccg cgcgcgcgag cggcgaccgc 1380

ggcaccaagc tgctcaccta cctcgccgac gtcaccgtca accagccaca cggcccagcg 1440

ccggtgagcc tcgacccggt cagcaagctc cccgaggtcg acctcaaggt gcccgcgccc 1500

gacggcaccc gccagtcgtt gctggccctc ggacccgctg cgttcgcgca ggcgcttcgc 1560

gaccagggcc gggttgccgt caccgacacg accttccgcg acgcccacca gtccctgctg 1620

gccacccggg tgcggacccg cgacctgctc gccgtcgcgg gtcatgtcgc gcggacgacc 1680

ccgcagctgt ggtcgctcga ggcgtggggc ggagcgacgt acgacgtggc gctgcggttc 1740

ctgtccgagg acccgtggga gcggctggcc aagctccgcc aggccacacc gaacatctgc 1800

ctccagatgt tgctgcgcgg gcgcaacacg gtcggctaca cgccgtaccc gaccgacgtc 1860

accaccgcct tcgtcgagga ggccgcggcc accgggatcg acgtcttccg catcttcgac 1920

gccctcaacg acgtcgagca gatgcggccg gccatcgagg ccgtgctggc gaccggcacc 1980

agcgtcgcgg aggtcgctct ctgctacacc ggcgacctgt ccgaccctcg cgagaggctc 2040

tacacgctcg actactacct cggcctggca tcgcggatcg tggagtccgg cgcacatgtg 2100

ctggcgatca aggacatggc cggtgtgctg cgggctccgg ccgcccgaag gctggtgacc 2160

gcgctgcgct cggagttcga cctgccggtg cacctgcaca cccacgacac ccccggcggc 2220

cagctggcta cgctgctggc cgcgatcgaa gcaggggtcg acgccgtcga cgcggccacc 2280

gcgtccatgg ccggcaccac ctcgcagccg ccgctctcgg cgctggtgtc cgcgaccgac 2340

cactcgccgc gcgagaccgg tctctcgctc gacgcggtgg gtgcgctgga gccctactgg 2400

gaggccgtgc gccgcgtcta cgcgccgttc gagtcggggc tgcccgcgcc caccggccgc 2460

gtctacaccc acgagatccc cggtgggcag ctctccaacc tgcgccagca ggcgatcgcg 2520

ctcggcctcg gcgagaagtt cgagcagatc gaggacatgt acgccgcggc cgaccgcatc 2580

ctcggccaca tcgtcaaggt gaccccgtcg tccaaggtcg tcggcgacct ggcgctgcac 2640

ctcgtcgcgg tcggtgccga cccggcggag ttcgcggcca accctcagaa gttcgacatc 2700

cccgcctcgg tcatcggctt cctccacggc gagctgggcg acccgcccgg cggctggccg 2760

gagccgttcc ggtcgcgcgc gatcgagggg cgggcgtggg agccgccctc gggctcgctc 2820

accgacgacc agcgcgccgg cctgcgcgac aaccgccgcg agaccctcaa cgagctgctg 2880

ttcccagggc cgaccaagca gttccgcgag atccgggcga cgtacggcga cgtgtcggcg 2940

ctctcctcga tcgactacct ctacggcctg cgtcaggggg tcgagcacca ggtcgagctc 3000

gacgagggcg tgacgatctt cctcgggctc caggcgatct ccgaccccga cgaacgcggc 3060

ttccgtaccg tgatggcgct gatcaacggc cagctacggc cgatcagcgt gcgcgaccgc 3120

agcgtctcca cggccgtcgc cgccgccgag aaggcggacc actctgaccc gagccatgtc 3180

gcggcgccgt tccagggtgc ggtgacggtg gtcgtcgaga agggcgagga ggtcgaggca 3240

ggccagaccg tcgccacgat cgaggcgatg aagatggagg ccgcgatcac cgctccccgc 3300

gccggcaccg tcgagcgcct ggccttcgcc ggcacccaga ccgtcgacgg aggcgacctg 3360

gtgctggtcc tcggctga 3378

<210> 281

<211> 1125

<212> PRT

<213> Unknown

<220>

<223> pyc_20 sequence from an unknown bacterial species from an environmental sample

<400> 281

Met Phe Ser Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Ala Arg Thr Val Ala Val Phe

20 25 30

Pro Asn Glu Asp Arg Trp Ser Glu His Arg Leu Lys Ala Asp Glu Ala

35 40 45

Tyr Glu Ile Gly Gln Arg Gly His Pro Val Arg Ala Tyr Leu Asp Pro

50 55 60

Asp Ala Ile Val Ala Val Ala Val Arg Ser Gly Ala Asp Ala Val Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Arg Leu Ala Glu Ala Cys

85 90 95

Ala Asn Ala Gly Ile Thr Phe Val Gly Pro Thr Ala Glu Val Leu Thr

100 105 110

Leu Thr Gly Asn Lys Ala Arg Ala Ile Ala Ala Ala His Glu Ala Gly

115 120 125

Val Pro Thr Leu Ala Ser Val Pro Pro Ser Gln Asp Ala Asp Glu Leu

130 135 140

Val Ala Thr Ala Gly Glu Leu Pro Tyr Pro Leu Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Asp Glu Pro Ala Gln

165 170 175

Leu Arg Ala Ala Ile Glu Thr Cys Met Arg Glu Ala Glu Gly Ala Phe

180 185 190

Gly Asp Ala Thr Val Phe Val Glu Gln Ala Val Val Asp Pro Arg His

195 200 205

Ile Glu Val Gln Ile Leu Ala Asp Arg Gln Gly Asn Val Ile His Leu

210 215 220

Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Val Glu

225 230 235 240

Ile Ala Pro Ala Pro Asn Phe Asp Pro Glu Leu Arg Glu Arg Ile Cys

245 250 255

Ala Asp Ala Val Arg Phe Ala Arg His Ile Gly Tyr Gln Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Asp Pro Gly Gly Ser Tyr Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr

290 295 300

Asp Val Asp Leu Val Gln Ser Gln Leu Arg Ile Ala Ser Gly Glu Thr

305 310 315 320

Leu Gln Asp Leu Gly Leu Arg Gln Asp Ser Ile Val Leu Arg Gly Ala

325 330 335

Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala Asn Asn Phe Arg

340 345 350

Pro Asp Thr Gly Arg Ile Thr Thr Tyr Arg Ser Pro Gly Gly Ala Gly

355 360 365

Ile Arg Leu Asp Gly Gly Thr Thr Tyr Thr Gly Ala Glu Val Ser Pro

370 375 380

His Phe Asp Ser Met Leu Ala Lys Leu Thr Cys Arg Gly Arg Thr Phe

385 390 395 400

Glu Lys Ala Val Glu Arg Ala Arg Arg Ala Val Ala Glu Phe Arg Ile

405 410 415

Arg Gly Val Ser Thr Asn Ile Ala Phe Leu Gln Ala Leu Leu Asp Asp

420 425 430

Pro Asp Phe Arg Ala Gly Arg Val Thr Thr Ser Phe Ile Glu Thr His

435 440 445

Pro Glu Leu Leu Thr Ala Arg Ala Ser Gly Asp Arg Gly Thr Lys Leu

450 455 460

Leu Thr Tyr Leu Ala Asp Val Thr Val Asn Gln Pro His Gly Pro Ala

465 470 475 480

Pro Val Ser Leu Asp Pro Val Ser Lys Leu Pro Glu Val Asp Leu Lys

485 490 495

Val Pro Ala Pro Asp Gly Thr Arg Gln Ser Leu Leu Ala Leu Gly Pro

500 505 510

Ala Ala Phe Ala Gln Ala Leu Arg Asp Gln Gly Arg Val Ala Val Thr

515 520 525

Asp Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Val

530 535 540

Arg Thr Arg Asp Leu Leu Ala Val Ala Gly His Val Ala Arg Thr Thr

545 550 555 560

Pro Gln Leu Trp Ser Leu Glu Ala Trp Gly Gly Ala Thr Tyr Asp Val

565 570 575

Ala Leu Arg Phe Leu Ser Glu Asp Pro Trp Glu Arg Leu Ala Lys Leu

580 585 590

Arg Gln Ala Thr Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Arg

595 600 605

Asn Thr Val Gly Tyr Thr Pro Tyr Pro Thr Asp Val Thr Thr Ala Phe

610 615 620

Val Glu Glu Ala Ala Ala Thr Gly Ile Asp Val Phe Arg Ile Phe Asp

625 630 635 640

Ala Leu Asn Asp Val Glu Gln Met Arg Pro Ala Ile Glu Ala Val Leu

645 650 655

Ala Thr Gly Thr Ser Val Ala Glu Val Ala Leu Cys Tyr Thr Gly Asp

660 665 670

Leu Ser Asp Pro Arg Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Gly

675 680 685

Leu Ala Ser Arg Ile Val Glu Ser Gly Ala His Val Leu Ala Ile Lys

690 695 700

Asp Met Ala Gly Val Leu Arg Ala Pro Ala Ala Arg Arg Leu Val Thr

705 710 715 720

Ala Leu Arg Ser Glu Phe Asp Leu Pro Val His Leu His Thr His Asp

725 730 735

Thr Pro Gly Gly Gln Leu Ala Thr Leu Leu Ala Ala Ile Glu Ala Gly

740 745 750

Val Asp Ala Val Asp Ala Ala Thr Ala Ser Met Ala Gly Thr Thr Ser

755 760 765

Gln Pro Pro Leu Ser Ala Leu Val Ser Ala Thr Asp His Ser Pro Arg

770 775 780

Glu Thr Gly Leu Ser Leu Asp Ala Val Gly Ala Leu Glu Pro Tyr Trp

785 790 795 800

Glu Ala Val Arg Arg Val Tyr Ala Pro Phe Glu Ser Gly Leu Pro Ala

805 810 815

Pro Thr Gly Arg Val Tyr Thr His Glu Ile Pro Gly Gly Gln Leu Ser

820 825 830

Asn Leu Arg Gln Gln Ala Ile Ala Leu Gly Leu Gly Glu Lys Phe Glu

835 840 845

Gln Ile Glu Asp Met Tyr Ala Ala Ala Asp Arg Ile Leu Gly His Ile

850 855 860

Val Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu His

865 870 875 880

Leu Val Ala Val Gly Ala Asp Pro Ala Glu Phe Ala Ala Asn Pro Gln

885 890 895

Lys Phe Asp Ile Pro Ala Ser Val Ile Gly Phe Leu His Gly Glu Leu

900 905 910

Gly Asp Pro Pro Gly Gly Trp Pro Glu Pro Phe Arg Ser Arg Ala Ile

915 920 925

Glu Gly Arg Ala Trp Glu Pro Pro Ser Gly Ser Leu Thr Asp Asp Gln

930 935 940

Arg Ala Gly Leu Arg Asp Asn Arg Arg Glu Thr Leu Asn Glu Leu Leu

945 950 955 960

Phe Pro Gly Pro Thr Lys Gln Phe Arg Glu Ile Arg Ala Thr Tyr Gly

965 970 975

Asp Val Ser Ala Leu Ser Ser Ile Asp Tyr Leu Tyr Gly Leu Arg Gln

980 985 990

Gly Val Glu His Gln Val Glu Leu Asp Glu Gly Val Thr Ile Phe Leu

995 1000 1005

Gly Leu Gln Ala Ile Ser Asp Pro Asp Glu Arg Gly Phe Arg Thr

1010 1015 1020

Val Met Ala Leu Ile Asn Gly Gln Leu Arg Pro Ile Ser Val Arg

1025 1030 1035

Asp Arg Ser Val Ser Thr Ala Val Ala Ala Ala Glu Lys Ala Asp

1040 1045 1050

His Ser Asp Pro Ser His Val Ala Ala Pro Phe Gln Gly Ala Val

1055 1060 1065

Thr Val Val Val Glu Lys Gly Glu Glu Val Glu Ala Gly Gln Thr

1070 1075 1080

Val Ala Thr Ile Glu Ala Met Lys Met Glu Ala Ala Ile Thr Ala

1085 1090 1095

Pro Arg Ala Gly Thr Val Glu Arg Leu Ala Phe Ala Gly Thr Gln

1100 1105 1110

Thr Val Asp Gly Gly Asp Leu Val Leu Val Leu Gly

1115 1120 1125

<210> 282

<211> 3405

<212> DNA

<213> Unknown

<220>

<223> pyc_21 sequence from an unknown bacterial species from an environmental sample

<400> 282

atgttcgcca aggtgctggt cgccaaccgc ggtgagatcg ctgtccgggc cttccgtgcc 60

gcgtacgagc tgggcgtgaa gacggtagcg gtctttccct atgaggaccg taacgctgtg 120

caccggatca aggcggatga ggcctacatg atcggcgagc gtggccatcc ggtacgcgct 180

tacctggata tcgcagagat catccgggcc gctaaggagt ccgaggccga tgcgatctac 240

cccggctatg gattcttgag cgagaatcct ggcctggccc aggcctgcga cgaggcgggc 300

atcgtcttca tcggcccgcc cgccggggtt ctcgagcttg ccggcaacaa ggtccgtgcc 360

attgaagcag ccagggcggc tggcgtcccc accctcaagt ccacacctcc ttcggcagac 420

cttgatgagc tggtgcccgc cgccgaggag atcggctttc cggtgttcgt caaggcggtc 480

gccggcggcg gcggtcgcgg tatgcgccgg gtcgatgacc ccaagatgct tcgggaatcc 540

ataactgcag cgatgcgcga ggctgaaggc gcgttcggcg atcccaccgt gtacatcgag 600

caggcggttg ggcgcccgcg ccacatcgag gtacagatcc ttgccgatac ccagggccac 660

accatccatc tgttcgagcg tgactgctcg gttcagcggc ggcaccagaa gattgttgag 720

attgcgcccg cgcagaacat ctcgaccgag ttgcgggagg cattgtgccg tgacgcggtg 780

cgctttgccg agtcgatcaa cttctcatgt gcgggaactg ttgagttctt ggtcgaaact 840

gaaggacagc gtgccggtca gcacgtcttc atcgagatga atcctcggat tcaggttgag 900

cacccggtca ccgaagagat caccgacgtt gatcttgtgc aggcccagat gcgcattgcc 960

gccggggaga gcctgagtga tcttggtctg gcccaggatg tgatcaggat caacggtgcg 1020

gcactgcagt gtcggatcac gaccgaggac ccggcgaacg gctttcggcc cgacaccggc 1080

acgatcactg cctaccgctc cgccggtggc gcgggcgtac gcctcgacgg tggcaccatc 1140

gacatcgggg tggagatcag cgcgtacttc gattcgttgc tggtcaagct catttgccgc 1200

ggccggacat tcgagcaggc tgtggctcgg gctcagcgga ccttggctga gttccggatt 1260

cgtggagtca gcaccaacat tcctttcctg caatcggttt tggaggatcc ggacttcatt 1320

gccggcgata tctcgacctc cttcattgac gagcggcccg acctgctgac cgcccatgct 1380

ccggcggacc gcggtaccaa gctgttgcgc tggctggctg aggtaacggt caaccagccg 1440

catggcccgg caccgacgca gctcgaccca ggcgttaaac gacctaccgg cgtcgatctc 1500

aacgtcccgt cgcccccggg ctcgcggcag cgtcttcttg atcttggtcc agaagccttc 1560

gctgccgacc tgcggcaacg ggtcccgatc gaggtcaccg acacgacctt ccgggacgcc 1620

catcagtcgt tgctggctac ccgggtccgt accaaagacc tcatcaggat cgcgccatac 1680

gtcggccgga tgacgccgga actgctgtcg gtcgaatgct ggggcggggc gacctatgac 1740

gtagcgcttc gcttcatttc cgaggatcct tgggaacgcc tggccgcgct gcgctacaac 1800

atgccgggcc tgtgcctgca gatgttgctg cgcggtcgca acacggttgg ctatacgcca 1860

tacccgacca aggtcacgac ctccttcgtg gccgaggctg ccgaggttgg catcgacatc 1920

ttccggatct tcgatgcgct caacgacgtc gagcagatgc gtccagcgat cgaggcggtg 1980

cgcgagacag gcagcaccat tgccgaggtg gctctgtgct acaccggcga tctgaactca 2040

ccggctgagg atctctacac gctcgactac tacttgcgtt tggccgagaa gatcgtgaac 2100

gcgggggcgc acgtgctcgg gatcaaggac atggccggcc tgctccgccc accagcggcc 2160

cggaagctcg tcgccgcact gcgagacaac tttgatctgc cggtgcactt gcatacccac 2220

gacaccgcag gtggtcagct tgcgaccttg ttagccgcca tcgatgtggg tgttgatgcg 2280

gttgacgtgg ccagcgcccc gatggccggg acgaccagcc aggtgccggc gtcggccctg 2340

gtggcagcct gcgcgaacac cgagcggccg accaaccttg atctgcgcgc cgtgatggaa 2400

ctggagccgt actgggaagc ggtgcgcagg gtgtacgcac ccttcgagtc agggttgccc 2460

agtccgacgg gccgggttta cgaccacgag attccgggag ggcagctctc caacctccgc 2520

cagcaggcga tcgctctcgg gctgggggag aagtttgagc agatcgaggc gatgtacacc 2580

gcggcgaatg caattttggg caggccgccc aaggtcaccc cgtcgtcgaa ggtggtcggc 2640

gatctggcac ttcacctggt cgcggtcggc gcggacccgg acgacttcgc tgagaacccg 2700

cagagctacg acatcccgga ttcagtgatc ggctttctca atggggaact gggcgatccg 2760

cctggcggct ggccggaacc attccggacc aaagcgctgc aggggcggac cgtgccggtc 2820

cgcgatgtgg agctctcacc ggaagattca gctgatcttg atgacaaggg ccaggtccgc 2880

caggccacgt tgaaccgcct gctgtttcct gggccgacca aggagttcct ggccaaccga 2940

gcaacctacg gcgacgtcgc ccggctcaat actctcgact tcctctacgg gttgcagccc 3000

ggccaggagc atgtcgccaa gatcggtaaa ggtgtcagcc tgattctcgg gctggcggcg 3060

atcggtaacg ccgacgagcg aggcatgcgc accgtgatgt gtacgctcaa cgggcagttg 3120

cggccgctcc gggtgcgcga caagtcgatc aaggtcgatg tcaagactgc cgaacgcgcg 3180

gatcccacca agccgggtca tgtcgccgct ccgttcgccg gggtggtcac cgtcaccgtc 3240

aacgaaggcg acacagtcga gaccggtgca acggtagcaa ccatcgaagc catgaagatg 3300

gaggccgcca tcaccgcgcc ggtctcaggt gtggtgcagc gattggcgat cgcagcagtg 3360

cagcaggtgg agggcggcga cctcctgctc gtcatcgcgg tctag 3405

<210> 283

<211> 1134

<212> PRT

<213> Unknown

<220>

<223> pyc_21 sequence from an unknown bacterial species from an environmental sample

<400> 283

Met Phe Ala Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Val Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Val Lys Thr Val Ala Val Phe

20 25 30

Pro Tyr Glu Asp Arg Asn Ala Val His Arg Ile Lys Ala Asp Glu Ala

35 40 45

Tyr Met Ile Gly Glu Arg Gly His Pro Val Arg Ala Tyr Leu Asp Ile

50 55 60

Ala Glu Ile Ile Arg Ala Ala Lys Glu Ser Glu Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Gly Leu Ala Gln Ala Cys

85 90 95

Asp Glu Ala Gly Ile Val Phe Ile Gly Pro Pro Ala Gly Val Leu Glu

100 105 110

Leu Ala Gly Asn Lys Val Arg Ala Ile Glu Ala Ala Arg Ala Ala Gly

115 120 125

Val Pro Thr Leu Lys Ser Thr Pro Pro Ser Ala Asp Leu Asp Glu Leu

130 135 140

Val Pro Ala Ala Glu Glu Ile Gly Phe Pro Val Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Asp Asp Pro Lys Met

165 170 175

Leu Arg Glu Ser Ile Thr Ala Ala Met Arg Glu Ala Glu Gly Ala Phe

180 185 190

Gly Asp Pro Thr Val Tyr Ile Glu Gln Ala Val Gly Arg Pro Arg His

195 200 205

Ile Glu Val Gln Ile Leu Ala Asp Thr Gln Gly His Thr Ile His Leu

210 215 220

Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Val Glu

225 230 235 240

Ile Ala Pro Ala Gln Asn Ile Ser Thr Glu Leu Arg Glu Ala Leu Cys

245 250 255

Arg Asp Ala Val Arg Phe Ala Glu Ser Ile Asn Phe Ser Cys Ala Gly

260 265 270

Thr Val Glu Phe Leu Val Glu Thr Glu Gly Gln Arg Ala Gly Gln His

275 280 285

Val Phe Ile Glu Met Asn Pro Arg Ile Gln Val Glu His Pro Val Thr

290 295 300

Glu Glu Ile Thr Asp Val Asp Leu Val Gln Ala Gln Met Arg Ile Ala

305 310 315 320

Ala Gly Glu Ser Leu Ser Asp Leu Gly Leu Ala Gln Asp Val Ile Arg

325 330 335

Ile Asn Gly Ala Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala

340 345 350

Asn Gly Phe Arg Pro Asp Thr Gly Thr Ile Thr Ala Tyr Arg Ser Ala

355 360 365

Gly Gly Ala Gly Val Arg Leu Asp Gly Gly Thr Ile Asp Ile Gly Val

370 375 380

Glu Ile Ser Ala Tyr Phe Asp Ser Leu Leu Val Lys Leu Ile Cys Arg

385 390 395 400

Gly Arg Thr Phe Glu Gln Ala Val Ala Arg Ala Gln Arg Thr Leu Ala

405 410 415

Glu Phe Arg Ile Arg Gly Val Ser Thr Asn Ile Pro Phe Leu Gln Ser

420 425 430

Val Leu Glu Asp Pro Asp Phe Ile Ala Gly Asp Ile Ser Thr Ser Phe

435 440 445

Ile Asp Glu Arg Pro Asp Leu Leu Thr Ala His Ala Pro Ala Asp Arg

450 455 460

Gly Thr Lys Leu Leu Arg Trp Leu Ala Glu Val Thr Val Asn Gln Pro

465 470 475 480

His Gly Pro Ala Pro Thr Gln Leu Asp Pro Gly Val Lys Arg Pro Thr

485 490 495

Gly Val Asp Leu Asn Val Pro Ser Pro Pro Gly Ser Arg Gln Arg Leu

500 505 510

Leu Asp Leu Gly Pro Glu Ala Phe Ala Ala Asp Leu Arg Gln Arg Val

515 520 525

Pro Ile Glu Val Thr Asp Thr Thr Phe Arg Asp Ala His Gln Ser Leu

530 535 540

Leu Ala Thr Arg Val Arg Thr Lys Asp Leu Ile Arg Ile Ala Pro Tyr

545 550 555 560

Val Gly Arg Met Thr Pro Glu Leu Leu Ser Val Glu Cys Trp Gly Gly

565 570 575

Ala Thr Tyr Asp Val Ala Leu Arg Phe Ile Ser Glu Asp Pro Trp Glu

580 585 590

Arg Leu Ala Ala Leu Arg Tyr Asn Met Pro Gly Leu Cys Leu Gln Met

595 600 605

Leu Leu Arg Gly Arg Asn Thr Val Gly Tyr Thr Pro Tyr Pro Thr Lys

610 615 620

Val Thr Thr Ser Phe Val Ala Glu Ala Ala Glu Val Gly Ile Asp Ile

625 630 635 640

Phe Arg Ile Phe Asp Ala Leu Asn Asp Val Glu Gln Met Arg Pro Ala

645 650 655

Ile Glu Ala Val Arg Glu Thr Gly Ser Thr Ile Ala Glu Val Ala Leu

660 665 670

Cys Tyr Thr Gly Asp Leu Asn Ser Pro Ala Glu Asp Leu Tyr Thr Leu

675 680 685

Asp Tyr Tyr Leu Arg Leu Ala Glu Lys Ile Val Asn Ala Gly Ala His

690 695 700

Val Leu Gly Ile Lys Asp Met Ala Gly Leu Leu Arg Pro Pro Ala Ala

705 710 715 720

Arg Lys Leu Val Ala Ala Leu Arg Asp Asn Phe Asp Leu Pro Val His

725 730 735

Leu His Thr His Asp Thr Ala Gly Gly Gln Leu Ala Thr Leu Leu Ala

740 745 750

Ala Ile Asp Val Gly Val Asp Ala Val Asp Val Ala Ser Ala Pro Met

755 760 765

Ala Gly Thr Thr Ser Gln Val Pro Ala Ser Ala Leu Val Ala Ala Cys

770 775 780

Ala Asn Thr Glu Arg Pro Thr Asn Leu Asp Leu Arg Ala Val Met Glu

785 790 795 800

Leu Glu Pro Tyr Trp Glu Ala Val Arg Arg Val Tyr Ala Pro Phe Glu

805 810 815

Ser Gly Leu Pro Ser Pro Thr Gly Arg Val Tyr Asp His Glu Ile Pro

820 825 830

Gly Gly Gln Leu Ser Asn Leu Arg Gln Gln Ala Ile Ala Leu Gly Leu

835 840 845

Gly Glu Lys Phe Glu Gln Ile Glu Ala Met Tyr Thr Ala Ala Asn Ala

850 855 860

Ile Leu Gly Arg Pro Pro Lys Val Thr Pro Ser Ser Lys Val Val Gly

865 870 875 880

Asp Leu Ala Leu His Leu Val Ala Val Gly Ala Asp Pro Asp Asp Phe

885 890 895

Ala Glu Asn Pro Gln Ser Tyr Asp Ile Pro Asp Ser Val Ile Gly Phe

900 905 910

Leu Asn Gly Glu Leu Gly Asp Pro Pro Gly Gly Trp Pro Glu Pro Phe

915 920 925

Arg Thr Lys Ala Leu Gln Gly Arg Thr Val Pro Val Arg Asp Val Glu

930 935 940

Leu Ser Pro Glu Asp Ser Ala Asp Leu Asp Asp Lys Gly Gln Val Arg

945 950 955 960

Gln Ala Thr Leu Asn Arg Leu Leu Phe Pro Gly Pro Thr Lys Glu Phe

965 970 975

Leu Ala Asn Arg Ala Thr Tyr Gly Asp Val Ala Arg Leu Asn Thr Leu

980 985 990

Asp Phe Leu Tyr Gly Leu Gln Pro Gly Gln Glu His Val Ala Lys Ile

995 1000 1005

Gly Lys Gly Val Ser Leu Ile Leu Gly Leu Ala Ala Ile Gly Asn

1010 1015 1020

Ala Asp Glu Arg Gly Met Arg Thr Val Met Cys Thr Leu Asn Gly

1025 1030 1035

Gln Leu Arg Pro Leu Arg Val Arg Asp Lys Ser Ile Lys Val Asp

1040 1045 1050

Val Lys Thr Ala Glu Arg Ala Asp Pro Thr Lys Pro Gly His Val

1055 1060 1065

Ala Ala Pro Phe Ala Gly Val Val Thr Val Thr Val Asn Glu Gly

1070 1075 1080

Asp Thr Val Glu Thr Gly Ala Thr Val Ala Thr Ile Glu Ala Met

1085 1090 1095

Lys Met Glu Ala Ala Ile Thr Ala Pro Val Ser Gly Val Val Gln

1100 1105 1110

Arg Leu Ala Ile Ala Ala Val Gln Gln Val Glu Gly Gly Asp Leu

1115 1120 1125

Leu Leu Val Ile Ala Val

1130

<210> 284

<211> 3096

<212> DNA

<213> Unknown

<220>

<223> pyc_22 sequence from an unknown bacterial species from an environmental sample

<400> 284

gtgtggcacg acagagacgt cgccgacacc atggtgttcg tcaccggcag gctgaaggga 60

tccggaatgt tccgcaaggt gctggtcgcc aaccgtgggg agatcgcgat tcgcgcgttc 120

cgcgccggtt acgaactggg tgcgcgcacg gtcgccgtct tcccgcacga ggaccgcaac 180

tcgctgcacc ggctcaaggc cgacgaggca tacgagatcg gcgagccggg ccacccggtg 240

cgggcctacc tgtccgtgga ggagatcatc cgggcggcac gtctggccgg tgcggacgcg 300

gtctacccgg ggtacgggtt cctgtccgag aacccggcac tggcccgtgc ctgcgaggag 360

gcgggcatca cgttcgtggg gccggacatg cggaccctgg agctgaccgg gaacaaggcg 420

cgtgccgtgg ccgccgcccg cgaggccggc gtacccgtgc tgggctcgtc ggagccctcc 480

accgacgtgg acgaactggt cgcggccgcc gagggcatcg gcttcccggt gttcgtcaag 540

gccgtcgccg gcggcggcgg acgcggcatg cggcgcgtcg aggacccctc gacgctgcgt 600

gagtccatcg aggcggcggc ccgtgaggcc gcatccgcgt tcggcgaccc caccgtcttc 660

ctggagaagg ccgtcgtcga cccccggcac atcgaggtgc agatcctcgc cgacgggcag 720

ggcgacgtca tccacctctt cgagcgcgac tgctcggtgc agcgccgcca ccagaaggtg 780

atcgaactcg cgcccgcccc gaacctcgac ccggcactgc gcgagcgcat ctgcgacgac 840

gccgtcaagt tcgcccgccg gatcggctac cgcaacgcgg gcaccgtgga attccttctc 900

gaccgcgacg gcaaccacgt cttcatcgag atgaacccgc gcatccaggt cgagcacacg 960

gtgaccgagg aggtgaccga cgtcgacctg gtgcaggcgc agctgcgcat cgccgccggc 1020

gagacgctgg ccgacctcgg cctgacgcag gacgccgtcg tcctgcgcgg cgccgcgctg 1080

cagtgccgga tcaccaccga ggacccggcc aacggcttcc gcccggacac cggcatgatc 1140

agcgcgtacc gctcgccggg cggttcgggc atccgcctcg acggcggcac cacccacgcc 1200

ggtacggagg tcagcgccca cttcgactcg atgctggtca agctgacctg ccggggaagg 1260

gacttcagga ccgcggtcag ccgtgcccgg cgcgcggtgg ccgagttccg catcaggggc 1320

gtgtccacga acatcccgtt cctgcaggct gtgctcgacg acccggactt ccgggccggc 1380

cacgtcacga cctccttcat cgagcagcgg ccgcacctgc tcaccgcgcg ccactccgcc 1440

gaccgcggca cgaagctgct cacctacctc gccgacgtca cggtgaacaa gccccacggc 1500

ccgcggcccg acctgatcgc gccgaccacc aagctgccac cgctgcccgc caccgagccg 1560

ccggccggct cccggcagca gctcaccgcg ctcggcccgg agggcttcgc acgccggctg 1620

cgcgagtcgc cgaccatcgg cgtcaccgac accaccttcc gggacgccca ccagtcgctg 1680

ctcgccaccc gggtccggac caaggacctg ctcgccgtcg cccctgtggt ggcgcgcacc 1740

ctgccgcagc tgctgtccct ggagtgctgg ggcggcgcca cctacgacgt cgccctgcgc 1800

ttcctcgcgg aggacccctg ggagcgcctg gccgcgctgc gcgaagccgt accgaacatc 1860

tgcctccaga tgttgctgcg cggccgcaac accgtgggct acaccccgta cccgaccgag 1920

gtgacggacg ccttcgtgca ggaggcggcc gccaccggaa tcgacatctt ccgtatcttc 1980

gacgcgctca acgacgtcgg acagatgcgg cccgccatcg acgccgtacg cgagaccggg 2040

tcggcggtcg ccgaggtggc gctgtgctac accggcgacc tgtccgatcc gtcggaacgg 2100

ctctacaccc tggactacta cctccggctg gccgaggaga tcgtggccgc gggtgcccac 2160

gtcctggccg tcaaggacat ggccgggctg ctccgcgccc cggccgccgc cacgctggtg 2220

tccgcgctgc gcagggagtt cgacctgccg gtgcacctgc acacgcacga caccgcgggc 2280

ggccagctcg ccacctacct cgcggcggtc caggccggtg cggacgccgt ggacggggcg 2340

gtggcctcca tggcgggcac cacctcgcag ccgtcgctgt cggcgatcgt cgccgcgacc 2400

gaccacaccg agcggccgac gggactcgac ctccaggcgg tcggcgacct ggagccgtac 2460

tgggagagcg tccgcaggat ctacgcaccg ttcgaggccg gtctcgcctc gccgaccggg 2520

cgcgtgtacc accacgagat ccccggcggc cagctctcca acctccgcac ccaggcgatc 2580

gcactcggac tcggcgaccg cttcgaggag gtcgaggcga tgtacgccgc cgcggacagg 2640

atgctcggcc ggctggtgaa ggtcaccccg tcctcgaagg tggtcggcga tctcgcgctg 2700

cacctcgtgg gcgccgccgt gtccccggag gacttcgagg cggagcccgg caggttcgac 2760

atcccggact cggtcatcgg cttcctgcgc ggtgaattgg gcaatccgcc gggcggctgg 2820

cccgagccgt tccgcagcaa ggcgctggcg ggccgcgccg agcccaagcc ggtgcgggag 2880

ctgaccgcgg aagaccgcac cggcctcgag aaggaccggc ggacgacgct caaccggctg 2940

ctgttccccg gaccggcgaa ggagttcgag acacaccgtc agacctacgg cgacaccagc 3000

gtgctcgaca gcaaggactt cttctacggg ctgcgccccg gaaaggagta cgccgtcgac 3060

ctcggaccgg gcgtgcggct gctcatcgag ctggag 3096

<210> 285

<211> 1032

<212> PRT

<213> Unknown

<220>

<223> pyc_22 sequence from an unknown bacterial species from an environmental sample

<400> 285

Val Trp His Asp Arg Asp Val Ala Asp Thr Met Val Phe Val Thr Gly

1 5 10 15

Arg Leu Lys Gly Ser Gly Met Phe Arg Lys Val Leu Val Ala Asn Arg

20 25 30

Gly Glu Ile Ala Ile Arg Ala Phe Arg Ala Gly Tyr Glu Leu Gly Ala

35 40 45

Arg Thr Val Ala Val Phe Pro His Glu Asp Arg Asn Ser Leu His Arg

50 55 60

Leu Lys Ala Asp Glu Ala Tyr Glu Ile Gly Glu Pro Gly His Pro Val

65 70 75 80

Arg Ala Tyr Leu Ser Val Glu Glu Ile Ile Arg Ala Ala Arg Leu Ala

85 90 95

Gly Ala Asp Ala Val Tyr Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro

100 105 110

Ala Leu Ala Arg Ala Cys Glu Glu Ala Gly Ile Thr Phe Val Gly Pro

115 120 125

Asp Met Arg Thr Leu Glu Leu Thr Gly Asn Lys Ala Arg Ala Val Ala

130 135 140

Ala Ala Arg Glu Ala Gly Val Pro Val Leu Gly Ser Ser Glu Pro Ser

145 150 155 160

Thr Asp Val Asp Glu Leu Val Ala Ala Ala Glu Gly Ile Gly Phe Pro

165 170 175

Val Phe Val Lys Ala Val Ala Gly Gly Gly Gly Arg Gly Met Arg Arg

180 185 190

Val Glu Asp Pro Ser Thr Leu Arg Glu Ser Ile Glu Ala Ala Ala Arg

195 200 205

Glu Ala Ala Ser Ala Phe Gly Asp Pro Thr Val Phe Leu Glu Lys Ala

210 215 220

Val Val Asp Pro Arg His Ile Glu Val Gln Ile Leu Ala Asp Gly Gln

225 230 235 240

Gly Asp Val Ile His Leu Phe Glu Arg Asp Cys Ser Val Gln Arg Arg

245 250 255

His Gln Lys Val Ile Glu Leu Ala Pro Ala Pro Asn Leu Asp Pro Ala

260 265 270

Leu Arg Glu Arg Ile Cys Asp Asp Ala Val Lys Phe Ala Arg Arg Ile

275 280 285

Gly Tyr Arg Asn Ala Gly Thr Val Glu Phe Leu Leu Asp Arg Asp Gly

290 295 300

Asn His Val Phe Ile Glu Met Asn Pro Arg Ile Gln Val Glu His Thr

305 310 315 320

Val Thr Glu Glu Val Thr Asp Val Asp Leu Val Gln Ala Gln Leu Arg

325 330 335

Ile Ala Ala Gly Glu Thr Leu Ala Asp Leu Gly Leu Thr Gln Asp Ala

340 345 350

Val Val Leu Arg Gly Ala Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp

355 360 365

Pro Ala Asn Gly Phe Arg Pro Asp Thr Gly Met Ile Ser Ala Tyr Arg

370 375 380

Ser Pro Gly Gly Ser Gly Ile Arg Leu Asp Gly Gly Thr Thr His Ala

385 390 395 400

Gly Thr Glu Val Ser Ala His Phe Asp Ser Met Leu Val Lys Leu Thr

405 410 415

Cys Arg Gly Arg Asp Phe Arg Thr Ala Val Ser Arg Ala Arg Arg Ala

420 425 430

Val Ala Glu Phe Arg Ile Arg Gly Val Ser Thr Asn Ile Pro Phe Leu

435 440 445

Gln Ala Val Leu Asp Asp Pro Asp Phe Arg Ala Gly His Val Thr Thr

450 455 460

Ser Phe Ile Glu Gln Arg Pro His Leu Leu Thr Ala Arg His Ser Ala

465 470 475 480

Asp Arg Gly Thr Lys Leu Leu Thr Tyr Leu Ala Asp Val Thr Val Asn

485 490 495

Lys Pro His Gly Pro Arg Pro Asp Leu Ile Ala Pro Thr Thr Lys Leu

500 505 510

Pro Pro Leu Pro Ala Thr Glu Pro Pro Ala Gly Ser Arg Gln Gln Leu

515 520 525

Thr Ala Leu Gly Pro Glu Gly Phe Ala Arg Arg Leu Arg Glu Ser Pro

530 535 540

Thr Ile Gly Val Thr Asp Thr Thr Phe Arg Asp Ala His Gln Ser Leu

545 550 555 560

Leu Ala Thr Arg Val Arg Thr Lys Asp Leu Leu Ala Val Ala Pro Val

565 570 575

Val Ala Arg Thr Leu Pro Gln Leu Leu Ser Leu Glu Cys Trp Gly Gly

580 585 590

Ala Thr Tyr Asp Val Ala Leu Arg Phe Leu Ala Glu Asp Pro Trp Glu

595 600 605

Arg Leu Ala Ala Leu Arg Glu Ala Val Pro Asn Ile Cys Leu Gln Met

610 615 620

Leu Leu Arg Gly Arg Asn Thr Val Gly Tyr Thr Pro Tyr Pro Thr Glu

625 630 635 640

Val Thr Asp Ala Phe Val Gln Glu Ala Ala Ala Thr Gly Ile Asp Ile

645 650 655

Phe Arg Ile Phe Asp Ala Leu Asn Asp Val Gly Gln Met Arg Pro Ala

660 665 670

Ile Asp Ala Val Arg Glu Thr Gly Ser Ala Val Ala Glu Val Ala Leu

675 680 685

Cys Tyr Thr Gly Asp Leu Ser Asp Pro Ser Glu Arg Leu Tyr Thr Leu

690 695 700

Asp Tyr Tyr Leu Arg Leu Ala Glu Glu Ile Val Ala Ala Gly Ala His

705 710 715 720

Val Leu Ala Val Lys Asp Met Ala Gly Leu Leu Arg Ala Pro Ala Ala

725 730 735

Ala Thr Leu Val Ser Ala Leu Arg Arg Glu Phe Asp Leu Pro Val His

740 745 750

Leu His Thr His Asp Thr Ala Gly Gly Gln Leu Ala Thr Tyr Leu Ala

755 760 765

Ala Val Gln Ala Gly Ala Asp Ala Val Asp Gly Ala Val Ala Ser Met

770 775 780

Ala Gly Thr Thr Ser Gln Pro Ser Leu Ser Ala Ile Val Ala Ala Thr

785 790 795 800

Asp His Thr Glu Arg Pro Thr Gly Leu Asp Leu Gln Ala Val Gly Asp

805 810 815

Leu Glu Pro Tyr Trp Glu Ser Val Arg Arg Ile Tyr Ala Pro Phe Glu

820 825 830

Ala Gly Leu Ala Ser Pro Thr Gly Arg Val Tyr His His Glu Ile Pro

835 840 845

Gly Gly Gln Leu Ser Asn Leu Arg Thr Gln Ala Ile Ala Leu Gly Leu

850 855 860

Gly Asp Arg Phe Glu Glu Val Glu Ala Met Tyr Ala Ala Ala Asp Arg

865 870 875 880

Met Leu Gly Arg Leu Val Lys Val Thr Pro Ser Ser Lys Val Val Gly

885 890 895

Asp Leu Ala Leu His Leu Val Gly Ala Ala Val Ser Pro Glu Asp Phe

900 905 910

Glu Ala Glu Pro Gly Arg Phe Asp Ile Pro Asp Ser Val Ile Gly Phe

915 920 925

Leu Arg Gly Glu Leu Gly Asn Pro Pro Gly Gly Trp Pro Glu Pro Phe

930 935 940

Arg Ser Lys Ala Leu Ala Gly Arg Ala Glu Pro Lys Pro Val Arg Glu

945 950 955 960

Leu Thr Ala Glu Asp Arg Thr Gly Leu Glu Lys Asp Arg Arg Thr Thr

965 970 975

Leu Asn Arg Leu Leu Phe Pro Gly Pro Ala Lys Glu Phe Glu Thr His

980 985 990

Arg Gln Thr Tyr Gly Asp Thr Ser Val Leu Asp Ser Lys Asp Phe Phe

995 1000 1005

Tyr Gly Leu Arg Pro Gly Lys Glu Tyr Ala Val Asp Leu Gly Pro

1010 1015 1020

Gly Val Arg Leu Leu Ile Glu Leu Glu

1025 1030

<210> 286

<211> 3378

<212> DNA

<213> Unknown

<220>

<223> pyc_23 sequence from an unknown bacterial species from an environmental sample

<400> 286

atgttccgca aggtgctggt cgcgaaccgc ggggagatcg ccatccgcgc gttccgcgca 60

gcgtacgagc tgggcgtgtc gacggtggcg gtgttcccgc acgaggaccg cagctcgctg 120

catcgagcca aggccgacga gtcgtaccag atcggcgagc cgggccaccc ggtgcgggca 180

tacctgtcgg tcgaggaagt catcaaggcc gcgcggaagg ccggagcgga cgcgatctac 240

cccgggtacg gcttcctgtc ggagaaccct gatctcgcgg aggcctgcga gcgcgagggc 300

atcacgttcg tgggtccgtc cgccgaggta ctgcacctca ccggcaacaa ggcgcgcgcg 360

gtggcggccg cccgggaggc gggcatcccg gtgctgcgct cgtcggcgcc gtccgacgac 420

gtcgacacac tgctcgccgc ggcggacggg atcgacttcc cgatcttcgt caaggccgtc 480

gccggcggcg gcgggcgcgg catgcggcgg gtgaccgcgc ccggcgagct gcgcgaggcc 540

gtcgaggcgg cgatgcggga ggccgaatcg gcgttcggcg accgaaccgt cttcctcgaa 600

caggcggtgg tgaacccccg ccacatcgag gtgcagatcc tcgccgacgc cgcgggcaac 660

gtcgtgcacc tctacgagcg cgactgctcg gtgcagcgcc gccatcagaa ggtcatcgag 720

atcgcgcccg cgcccaacct cgaccccgag ctgcgcgagc ggatctgctc cgacgccgtg 780

gccttcgccc gccacatcgg ctacgtcaac gcgggcaccg tcgagttcct gctcgacgag 840

cgcggcaacc acgtgttcat cgagatgaac ccgcgcatcc aggtggagca cacggtcacc 900

gagcaggtca ccgaccgcga cctcgtgatc gcccagctgc gcatcgcgtc cgggatgacg 960

ctgccgcagt tgcggctgaa ccaggaggac gtgacgctga acggcgccgc gctgcagtgc 1020

cgcgtcacca cggaggatcc gaccaacggc ttccgccccg acaccggcac gatcagcgcc 1080

taccgctcgc cgggtggccc cggcgtccgg ctggacggtg gcaccacgca caccggcgcc 1140

gaggtgagcg cccacttcga ctcgatgctg gtgaagctca cctgctacgg ccacgacttc 1200

tcgaacgccg tgcgcagggc gcggcgggcg atcgcggagt tccggatccg cggcgtgtcg 1260

acgaacctgc cgtacctcgc cgctgtactc gacgacccgg acttcgcggc cggccggatc 1320

accacgagct tcatcgacga gcgcccccac ctgctcaccg cgcgcaagcc tgccgaccgg 1380

ggcacccggg tactcagcta cctcgccgac atcacggtca acaagccgaa cgggccgagg 1440

ccgcaggtcg tcgaggcggt ggacaagctg ccccgctgcg acctggacgc ccccgccccg 1500

gacggctccc ggcagctact gcgcgagctg ggtcccgaag gtttcgcgcg gtggttgcgt 1560

gagcagacga ccgtgccggt cactgacacc acgttccgcg acgcgcacca gtcgctgctc 1620

gcgacgcggg tgcggacccg ggacctgctc gcgatcgccc cgcatatggc ccgcatggca 1680

ccacagctgc tctccctcga gtgctggggc ggcgcgacct acgacgtggc gctgcggttc 1740

ctcgccgagg acccgtggga gcggctggcc gcgctcagcg ccgcggtgcc gaacatctgc 1800

acgcagatgc tcctgcgcgg gcgcaacacc gtgggctaca cgccgtaccc caccgaggtg 1860

accgacgcct tcgtcgagga ggcggcgcgt accgggatgg acatcttccg gatcttcgac 1920

gccctcaacg acgtcgagca gatgcgcccg gccatcgacg ccgtgcgcgc cacgggcacc 1980

gccgtcgccg aggtggcgct ctgctacacc gccgacctgt ccgaccccgc cgagcagctc 2040

tacacgctgg actactacct gcggctggcc gagcagatcg tcgaggcagg tgcccacgtc 2100

ctcgcgatca aggacatggc cgggctgctt cgcccgcccg cggcccgcgc gctggtcacg 2160

gcgctgcgcg agcgcttcga cctgccggtg cacctgcaca cccacgacac ggcgggcggg 2220

cagctcgcca cgctggtcac ggcgatcgac gcgggcgtgg acgccgtcga cgcggcagtc 2280

gcgtccatgg caggaacgac gagccagccg tcgctctccg cgctggtcgc ggccaccgac 2340

cacaccgacc gcaccaccgg cctctcgctg gaggcggtcg gcgacctgga gccgtactgg 2400

gaggccgtgc ggaaggtgta cgcgccgttc gaggcggcat tgccgtcgcc gaccgggcgc 2460

gtctaccacc acgagatccc cggcgggcag ctgtccaacc tgcgccagca ggcgatcgcg 2520

ctcgggctcg gcgaccggtt cgagctgatc gaggactgct acgcggccgc ggaccggatg 2580

ctcgggcggc tggtgaaggt gaccccgtcg tcgaaggtgg tgggcgacct cgcgctgcac 2640

ctcgtcggcg ccggggtgga acccaaggac ttcgaggccg acccgggcca gttcgacgtg 2700

cccgactcgg tgatcgggtt cctgcgcggt gagctgggcg acccgccagg cggctggccc 2760

gagccgttcc gcagccgcgc tctcgagggg cgcccggcgg cgaaggaggg cgcgggcctc 2820

tccgacgagg atcgggcggg cctgcgggac gatcgccggg cgacgctcaa ccgactgctg 2880

ttcccggggc cggcgaagga gttcctcgcc caccgcgagg cttacagtga cacctccgtg 2940

ctctcgacga aggacttcct ctacggcctg gagcccgaca tcgagcacat cgcacagctg 3000

gagccaggcg tcgcgctgct catcgagctg gaggcgatct ccgagcccga caagcggggc 3060

taccgcaacg tgctcgccac cttgaacggc cagatgcggc cggtgtcggt gcgcgaccgg 3120

tcgatcgtga gcgacgtcaa ggccgccgag agggccgacc ggtcgaaccc gaagcacgtg 3180

gcggcgccgt tcgccggggt cgtgacgctg caggtcggcg agggcgaccg ggtcgaggac 3240

ggccagaccg tcgccaccat cgaggcgatg aagatggagg cctcgatcac cgcgcaccag 3300

gcgggcacgg tcgggcggct cgcgatcggc aaggtacagc aggtcgaggg cggtgacctg 3360

ctgctggtgc tcgagtga 3378

<210> 287

<211> 1125

<212> PRT

<213> Unknown

<220>

<223> pyc_23 sequence from an unknown bacterial species from an environmental sample

<400> 287

Met Phe Arg Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Val Ser Thr Val Ala Val Phe

20 25 30

Pro His Glu Asp Arg Ser Ser Leu His Arg Ala Lys Ala Asp Glu Ser

35 40 45

Tyr Gln Ile Gly Glu Pro Gly His Pro Val Arg Ala Tyr Leu Ser Val

50 55 60

Glu Glu Val Ile Lys Ala Ala Arg Lys Ala Gly Ala Asp Ala Ile Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Asp Leu Ala Glu Ala Cys

85 90 95

Glu Arg Glu Gly Ile Thr Phe Val Gly Pro Ser Ala Glu Val Leu His

100 105 110

Leu Thr Gly Asn Lys Ala Arg Ala Val Ala Ala Ala Arg Glu Ala Gly

115 120 125

Ile Pro Val Leu Arg Ser Ser Ala Pro Ser Asp Asp Val Asp Thr Leu

130 135 140

Leu Ala Ala Ala Asp Gly Ile Asp Phe Pro Ile Phe Val Lys Ala Val

145 150 155 160

Ala Gly Gly Gly Gly Arg Gly Met Arg Arg Val Thr Ala Pro Gly Glu

165 170 175

Leu Arg Glu Ala Val Glu Ala Ala Met Arg Glu Ala Glu Ser Ala Phe

180 185 190

Gly Asp Arg Thr Val Phe Leu Glu Gln Ala Val Val Asn Pro Arg His

195 200 205

Ile Glu Val Gln Ile Leu Ala Asp Ala Ala Gly Asn Val Val His Leu

210 215 220

Tyr Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Ile Ala Pro Ala Pro Asn Leu Asp Pro Glu Leu Arg Glu Arg Ile Cys

245 250 255

Ser Asp Ala Val Ala Phe Ala Arg His Ile Gly Tyr Val Asn Ala Gly

260 265 270

Thr Val Glu Phe Leu Leu Asp Glu Arg Gly Asn His Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Gln Val Thr

290 295 300

Asp Arg Asp Leu Val Ile Ala Gln Leu Arg Ile Ala Ser Gly Met Thr

305 310 315 320

Leu Pro Gln Leu Arg Leu Asn Gln Glu Asp Val Thr Leu Asn Gly Ala

325 330 335

Ala Leu Gln Cys Arg Val Thr Thr Glu Asp Pro Thr Asn Gly Phe Arg

340 345 350

Pro Asp Thr Gly Thr Ile Ser Ala Tyr Arg Ser Pro Gly Gly Pro Gly

355 360 365

Val Arg Leu Asp Gly Gly Thr Thr His Thr Gly Ala Glu Val Ser Ala

370 375 380

His Phe Asp Ser Met Leu Val Lys Leu Thr Cys Tyr Gly His Asp Phe

385 390 395 400

Ser Asn Ala Val Arg Arg Ala Arg Arg Ala Ile Ala Glu Phe Arg Ile

405 410 415

Arg Gly Val Ser Thr Asn Leu Pro Tyr Leu Ala Ala Val Leu Asp Asp

420 425 430

Pro Asp Phe Ala Ala Gly Arg Ile Thr Thr Ser Phe Ile Asp Glu Arg

435 440 445

Pro His Leu Leu Thr Ala Arg Lys Pro Ala Asp Arg Gly Thr Arg Val

450 455 460

Leu Ser Tyr Leu Ala Asp Ile Thr Val Asn Lys Pro Asn Gly Pro Arg

465 470 475 480

Pro Gln Val Val Glu Ala Val Asp Lys Leu Pro Arg Cys Asp Leu Asp

485 490 495

Ala Pro Ala Pro Asp Gly Ser Arg Gln Leu Leu Arg Glu Leu Gly Pro

500 505 510

Glu Gly Phe Ala Arg Trp Leu Arg Glu Gln Thr Thr Val Pro Val Thr

515 520 525

Asp Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Val

530 535 540

Arg Thr Arg Asp Leu Leu Ala Ile Ala Pro His Met Ala Arg Met Ala

545 550 555 560

Pro Gln Leu Leu Ser Leu Glu Cys Trp Gly Gly Ala Thr Tyr Asp Val

565 570 575

Ala Leu Arg Phe Leu Ala Glu Asp Pro Trp Glu Arg Leu Ala Ala Leu

580 585 590

Ser Ala Ala Val Pro Asn Ile Cys Thr Gln Met Leu Leu Arg Gly Arg

595 600 605

Asn Thr Val Gly Tyr Thr Pro Tyr Pro Thr Glu Val Thr Asp Ala Phe

610 615 620

Val Glu Glu Ala Ala Arg Thr Gly Met Asp Ile Phe Arg Ile Phe Asp

625 630 635 640

Ala Leu Asn Asp Val Glu Gln Met Arg Pro Ala Ile Asp Ala Val Arg

645 650 655

Ala Thr Gly Thr Ala Val Ala Glu Val Ala Leu Cys Tyr Thr Ala Asp

660 665 670

Leu Ser Asp Pro Ala Glu Gln Leu Tyr Thr Leu Asp Tyr Tyr Leu Arg

675 680 685

Leu Ala Glu Gln Ile Val Glu Ala Gly Ala His Val Leu Ala Ile Lys

690 695 700

Asp Met Ala Gly Leu Leu Arg Pro Pro Ala Ala Arg Ala Leu Val Thr

705 710 715 720

Ala Leu Arg Glu Arg Phe Asp Leu Pro Val His Leu His Thr His Asp

725 730 735

Thr Ala Gly Gly Gln Leu Ala Thr Leu Val Thr Ala Ile Asp Ala Gly

740 745 750

Val Asp Ala Val Asp Ala Ala Val Ala Ser Met Ala Gly Thr Thr Ser

755 760 765

Gln Pro Ser Leu Ser Ala Leu Val Ala Ala Thr Asp His Thr Asp Arg

770 775 780

Thr Thr Gly Leu Ser Leu Glu Ala Val Gly Asp Leu Glu Pro Tyr Trp

785 790 795 800

Glu Ala Val Arg Lys Val Tyr Ala Pro Phe Glu Ala Ala Leu Pro Ser

805 810 815

Pro Thr Gly Arg Val Tyr His His Glu Ile Pro Gly Gly Gln Leu Ser

820 825 830

Asn Leu Arg Gln Gln Ala Ile Ala Leu Gly Leu Gly Asp Arg Phe Glu

835 840 845

Leu Ile Glu Asp Cys Tyr Ala Ala Ala Asp Arg Met Leu Gly Arg Leu

850 855 860

Val Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu His

865 870 875 880

Leu Val Gly Ala Gly Val Glu Pro Lys Asp Phe Glu Ala Asp Pro Gly

885 890 895

Gln Phe Asp Val Pro Asp Ser Val Ile Gly Phe Leu Arg Gly Glu Leu

900 905 910

Gly Asp Pro Pro Gly Gly Trp Pro Glu Pro Phe Arg Ser Arg Ala Leu

915 920 925

Glu Gly Arg Pro Ala Ala Lys Glu Gly Ala Gly Leu Ser Asp Glu Asp

930 935 940

Arg Ala Gly Leu Arg Asp Asp Arg Arg Ala Thr Leu Asn Arg Leu Leu

945 950 955 960

Phe Pro Gly Pro Ala Lys Glu Phe Leu Ala His Arg Glu Ala Tyr Ser

965 970 975

Asp Thr Ser Val Leu Ser Thr Lys Asp Phe Leu Tyr Gly Leu Glu Pro

980 985 990

Asp Ile Glu His Ile Ala Gln Leu Glu Pro Gly Val Ala Leu Leu Ile

995 1000 1005

Glu Leu Glu Ala Ile Ser Glu Pro Asp Lys Arg Gly Tyr Arg Asn

1010 1015 1020

Val Leu Ala Thr Leu Asn Gly Gln Met Arg Pro Val Ser Val Arg

1025 1030 1035

Asp Arg Ser Ile Val Ser Asp Val Lys Ala Ala Glu Arg Ala Asp

1040 1045 1050

Arg Ser Asn Pro Lys His Val Ala Ala Pro Phe Ala Gly Val Val

1055 1060 1065

Thr Leu Gln Val Gly Glu Gly Asp Arg Val Glu Asp Gly Gln Thr

1070 1075 1080

Val Ala Thr Ile Glu Ala Met Lys Met Glu Ala Ser Ile Thr Ala

1085 1090 1095

His Gln Ala Gly Thr Val Gly Arg Leu Ala Ile Gly Lys Val Gln

1100 1105 1110

Gln Val Glu Gly Gly Asp Leu Leu Leu Val Leu Glu

1115 1120 1125

<210> 288

<211> 3384

<212> DNA

<213> Unknown

<220>

<223> pyc_24 sequence from an unknown bacterial species from an environmental sample

<400> 288

gtgatctcga aagtcctggt tgccaaccgc ggcgagatcg ccatccgcgc ctttcgcgcc 60

gcctacgagt tgggcatcac gaccgtggcc gtctacccct tcgaggaccg caattcccaa 120

caccggctca aggccgacga gtcctaccag atcggcgaga agggccaccc ggtgcgtgcc 180

tacctgtcgg tcgacgagat cgtctcgacc gcgcgccgcg ccggcgccga cgccgtctac 240

cccggctatg gcttcctgtc ggagaatccc gagctggccg aggcttgcgc ggcagcgggc 300

atcaagttca tcggcccgag cgcggcgatc ctggagctga ccggcaataa gtcccgggcc 360

atcggggagg cgcgcgccgc cgggttgccg gtgctgaact cgtcggcgcc gtcgtcgtcg 420

gtgtacgaac tgcttgccgc cgcccaaaac atgccattcc cgctgttcgt caaggcggtg 480

tctggtggcg gcgggcgcgg catgcgccgg gtgaatgacc ctgacgcctt gcgtgaggcg 540

atcgaggctg ccagccgcga ggccgagtcg tcgttcggcg acccgagcgt gtacctcgag 600

caggccgtgc gcaacccacg ccacatcgag gtacagatcc tggccgacgc tcacggcaac 660

gtgatgcatc tcttcgagcg cgactgcagc gtgcagcgac ggcatcagaa ggtgatcgag 720

ctggcgcctg cgccgaacct gccgacggag ctgcgcgaga agatctgcgc cgacgccgtc 780

gcgttcgcac gccggatcaa ctacacgtat gcgggaaccg tcgagttcct gcttgacgag 840

cgtggacact acgtgttcat cgagatgaac ccgcgcatcc aagtcgagca cacggtcacc 900

gaagaggtca ccgacgttga cctggtggca agccagatgc gcatcgccga cggtgagacc 960

ctcgaagatc ttggcttgaa ccaggattcg ctgcgcacgc gcggtgcggc gttgcagtgc 1020

cggataacca ccgaggaccc ggccaacggg ttccgacccg acacgggccg catcaccggc 1080

taccgctctg cgggcggtgc cggcatccgg ctggacggcg cggcgaacct gggcgccgag 1140

atcggtgcgc atttcgattc gatgctggtg aagctcacct gccggggccg cgacttcgcc 1200

acggcggtcg cccgtgctcg gcgcgcgctc gctgagttcc gggttcgcgg ggtatcgacg 1260

aacatcccgt tcctgttggc cgtggtcacc gactcagatt ttcgggccgg tcggatcaac 1320

acgtcgttca tcgacgagcg cccctacctg ttgaccgcac gcacaccggc ggaccgaggc 1380

accaagatcc tgaactactt ggccgacgtc acggtcaacc agccgcacgg cacccgtcag 1440

tcgacggcgt atccccagga caagcttccg cagatcgatc tgtcggcgcc gccgccggcc 1500

ggctccaagc aactgctcac cgagctcggc ccggagggat tcgctcgccg gctgcgcgag 1560

tcacccgccg tcggcgtcac cgacacgacg ttccgcgatg cccatcagtc gctgctggcc 1620

acccgaatcc gcacgacggg gctgctgatg gttgcgccgt acatcgccag gatgatcccg 1680

cagctgttgt cgatcgaatg ctggggtggc gcgacttatg atgtggcact gcggtttttg 1740

aaggagaacc cgtgggagcg gctggccgcg ctgcgtgagg cggtgcccaa catctgcctg 1800

cagatgctgc tgcgcgggcg caacacggtg ggctacaccc cgtatccgga gtcggtcacg 1860

acggccttca tcgaggaagc cacggccacc ggtgtcgaca tctaccggat cttcgacgcc 1920

ctcaacaacg tggagtcgat gcggccggct atcgacgcgg tgcgcgaaac cggcacggcg 1980

atagccgaag tcgcgatgag ctacaccggc gacctgtccg acccgggcga gaggctttac 2040

acgctggatt actacctcaa gcttgccgag cagatcgtgg acgccggcgc acatgtgctg 2100

gccatcaagg acatggctgg gctgctaaaa gcgccggcgg caacggcttt ggtcggcgcg 2160

ctgcgcagcc gtttcgacct gccggtgcac gtgcacaccc acgacacccc cggcgggcaa 2220

ctggccacgt actgggcggc gtggcatgcc ggtgccaacg cggtcgacgg cgcctccgcg 2280

ccgctggccg gcacgacgag ccagcccgcg ctgtcgtcga tcgtggcggc cgcggcgaac 2340

accgaatacg acacaggcct ggcgctctcc gcggtgtgcg agctggagcc gtactgggat 2400

gccctgcgaa aggtctacgc gcccttcgag tccggactac ccgcgccgac cgggcgcgtg 2460

tacaaccacg agatccccgg gggccagttg tcaaatctgc gtcagcaggc gatcgccctg 2520

gggttcggtg accggttcga ggagatcgag gcgaattacg ctgcggccga ccgcatcctg 2580

ggtcggctgg tcaaggtcac gccgtcgtcg aaggtggtcg gcgaccttgc cctagctctg 2640

gtgggcgccg gcgtgagtgc cgacgagttc gccgctgacc cagcgcgatt cgacattccc 2700

gactccgtga tcggcttctt gcgcggagag ctgggcgatc cgcccggcgg ctggccggag 2760

ccattgcgca ccaaggccct agcgggacgg ccaccggcca agccgcaggt cgcgcttgca 2820

ccagatgatg aggcggcgtt gacgattccc ggctcggagc gtcaatccac cctgaatcgt 2880

ctgctgttcc cgggcccgac aaaggaattc gaagctcacc gcgagctgta cggcgacacg 2940

tcgcgcctgt cggccaacca gttcttctac ggattgcgcc agggcgaaga gcaccgggtg 3000

aggctggagc gcggcgtaga gctgctgatc gggctggagg cgatttccga ccccgacgag 3060

cgtgggatgc gcacggtgat gtgcctactc aacggccagc tgcggccagt gctggtgcgc 3120

gaccgcagca tcgccagcgc ggtgcccgcc gccgagaagg ccgagcgcgc gaaccccgcc 3180

cacatcgcgg caccattcgc cggtgtcgtc accgtcagcg tggcggaggg cggcgaggtg 3240

gccgccggtc agaccgtcgc gacgatcgag gcgatgaaga tggaagccgc aatcaccgcg 3300

ccgaaggccg gaaccgtcga gcgcatcgcc gtgtcagaga ccgcccaggt cgagggcggc 3360

gatctgttga tggtgatcag ctga 3384

<210> 289

<211> 1127

<212> PRT

<213> Unknown

<220>

<223> pyc_24 sequence from an unknown bacterial species from an environmental sample

<400> 289

Val Ile Ser Lys Val Leu Val Ala Asn Arg Gly Glu Ile Ala Ile Arg

1 5 10 15

Ala Phe Arg Ala Ala Tyr Glu Leu Gly Ile Thr Thr Val Ala Val Tyr

20 25 30

Pro Phe Glu Asp Arg Asn Ser Gln His Arg Leu Lys Ala Asp Glu Ser

35 40 45

Tyr Gln Ile Gly Glu Lys Gly His Pro Val Arg Ala Tyr Leu Ser Val

50 55 60

Asp Glu Ile Val Ser Thr Ala Arg Arg Ala Gly Ala Asp Ala Val Tyr

65 70 75 80

Pro Gly Tyr Gly Phe Leu Ser Glu Asn Pro Glu Leu Ala Glu Ala Cys

85 90 95

Ala Ala Ala Gly Ile Lys Phe Ile Gly Pro Ser Ala Ala Ile Leu Glu

100 105 110

Leu Thr Gly Asn Lys Ser Arg Ala Ile Gly Glu Ala Arg Ala Ala Gly

115 120 125

Leu Pro Val Leu Asn Ser Ser Ala Pro Ser Ser Ser Val Tyr Glu Leu

130 135 140

Leu Ala Ala Ala Gln Asn Met Pro Phe Pro Leu Phe Val Lys Ala Val

145 150 155 160

Ser Gly Gly Gly Gly Arg Gly Met Arg Arg Val Asn Asp Pro Asp Ala

165 170 175

Leu Arg Glu Ala Ile Glu Ala Ala Ser Arg Glu Ala Glu Ser Ser Phe

180 185 190

Gly Asp Pro Ser Val Tyr Leu Glu Gln Ala Val Arg Asn Pro Arg His

195 200 205

Ile Glu Val Gln Ile Leu Ala Asp Ala His Gly Asn Val Met His Leu

210 215 220

Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Val Ile Glu

225 230 235 240

Leu Ala Pro Ala Pro Asn Leu Pro Thr Glu Leu Arg Glu Lys Ile Cys

245 250 255

Ala Asp Ala Val Ala Phe Ala Arg Arg Ile Asn Tyr Thr Tyr Ala Gly

260 265 270

Thr Val Glu Phe Leu Leu Asp Glu Arg Gly His Tyr Val Phe Ile Glu

275 280 285

Met Asn Pro Arg Ile Gln Val Glu His Thr Val Thr Glu Glu Val Thr

290 295 300

Asp Val Asp Leu Val Ala Ser Gln Met Arg Ile Ala Asp Gly Glu Thr

305 310 315 320

Leu Glu Asp Leu Gly Leu Asn Gln Asp Ser Leu Arg Thr Arg Gly Ala

325 330 335

Ala Leu Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala Asn Gly Phe Arg

340 345 350

Pro Asp Thr Gly Arg Ile Thr Gly Tyr Arg Ser Ala Gly Gly Ala Gly

355 360 365

Ile Arg Leu Asp Gly Ala Ala Asn Leu Gly Ala Glu Ile Gly Ala His

370 375 380

Phe Asp Ser Met Leu Val Lys Leu Thr Cys Arg Gly Arg Asp Phe Ala

385 390 395 400

Thr Ala Val Ala Arg Ala Arg Arg Ala Leu Ala Glu Phe Arg Val Arg

405 410 415

Gly Val Ser Thr Asn Ile Pro Phe Leu Leu Ala Val Val Thr Asp Ser

420 425 430

Asp Phe Arg Ala Gly Arg Ile Asn Thr Ser Phe Ile Asp Glu Arg Pro

435 440 445

Tyr Leu Leu Thr Ala Arg Thr Pro Ala Asp Arg Gly Thr Lys Ile Leu

450 455 460

Asn Tyr Leu Ala Asp Val Thr Val Asn Gln Pro His Gly Thr Arg Gln

465 470 475 480

Ser Thr Ala Tyr Pro Gln Asp Lys Leu Pro Gln Ile Asp Leu Ser Ala

485 490 495

Pro Pro Pro Ala Gly Ser Lys Gln Leu Leu Thr Glu Leu Gly Pro Glu

500 505 510

Gly Phe Ala Arg Arg Leu Arg Glu Ser Pro Ala Val Gly Val Thr Asp

515 520 525

Thr Thr Phe Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Ile Arg

530 535 540

Thr Thr Gly Leu Leu Met Val Ala Pro Tyr Ile Ala Arg Met Ile Pro

545 550 555 560

Gln Leu Leu Ser Ile Glu Cys Trp Gly Gly Ala Thr Tyr Asp Val Ala

565 570 575

Leu Arg Phe Leu Lys Glu Asn Pro Trp Glu Arg Leu Ala Ala Leu Arg

580 585 590

Glu Ala Val Pro Asn Ile Cys Leu Gln Met Leu Leu Arg Gly Arg Asn

595 600 605

Thr Val Gly Tyr Thr Pro Tyr Pro Glu Ser Val Thr Thr Ala Phe Ile

610 615 620

Glu Glu Ala Thr Ala Thr Gly Val Asp Ile Tyr Arg Ile Phe Asp Ala

625 630 635 640

Leu Asn Asn Val Glu Ser Met Arg Pro Ala Ile Asp Ala Val Arg Glu

645 650 655

Thr Gly Thr Ala Ile Ala Glu Val Ala Met Ser Tyr Thr Gly Asp Leu

660 665 670

Ser Asp Pro Gly Glu Arg Leu Tyr Thr Leu Asp Tyr Tyr Leu Lys Leu

675 680 685

Ala Glu Gln Ile Val Asp Ala Gly Ala His Val Leu Ala Ile Lys Asp

690 695 700

Met Ala Gly Leu Leu Lys Ala Pro Ala Ala Thr Ala Leu Val Gly Ala

705 710 715 720

Leu Arg Ser Arg Phe Asp Leu Pro Val His Val His Thr His Asp Thr

725 730 735

Pro Gly Gly Gln Leu Ala Thr Tyr Trp Ala Ala Trp His Ala Gly Ala

740 745 750

Asn Ala Val Asp Gly Ala Ser Ala Pro Leu Ala Gly Thr Thr Ser Gln

755 760 765

Pro Ala Leu Ser Ser Ile Val Ala Ala Ala Ala Asn Thr Glu Tyr Asp

770 775 780

Thr Gly Leu Ala Leu Ser Ala Val Cys Glu Leu Glu Pro Tyr Trp Asp

785 790 795 800

Ala Leu Arg Lys Val Tyr Ala Pro Phe Glu Ser Gly Leu Pro Ala Pro

805 810 815

Thr Gly Arg Val Tyr Asn His Glu Ile Pro Gly Gly Gln Leu Ser Asn

820 825 830

Leu Arg Gln Gln Ala Ile Ala Leu Gly Phe Gly Asp Arg Phe Glu Glu

835 840 845

Ile Glu Ala Asn Tyr Ala Ala Ala Asp Arg Ile Leu Gly Arg Leu Val

850 855 860

Lys Val Thr Pro Ser Ser Lys Val Val Gly Asp Leu Ala Leu Ala Leu

865 870 875 880

Val Gly Ala Gly Val Ser Ala Asp Glu Phe Ala Ala Asp Pro Ala Arg

885 890 895

Phe Asp Ile Pro Asp Ser Val Ile Gly Phe Leu Arg Gly Glu Leu Gly

900 905 910

Asp Pro Pro Gly Gly Trp Pro Glu Pro Leu Arg Thr Lys Ala Leu Ala

915 920 925

Gly Arg Pro Pro Ala Lys Pro Gln Val Ala Leu Ala Pro Asp Asp Glu

930 935 940

Ala Ala Leu Thr Ile Pro Gly Ser Glu Arg Gln Ser Thr Leu Asn Arg

945 950 955 960

Leu Leu Phe Pro Gly Pro Thr Lys Glu Phe Glu Ala His Arg Glu Leu

965 970 975

Tyr Gly Asp Thr Ser Arg Leu Ser Ala Asn Gln Phe Phe Tyr Gly Leu

980 985 990

Arg Gln Gly Glu Glu His Arg Val Arg Leu Glu Arg Gly Val Glu Leu

995 1000 1005

Leu Ile Gly Leu Glu Ala Ile Ser Asp Pro Asp Glu Arg Gly Met

1010 1015 1020

Arg Thr Val Met Cys Leu Leu Asn Gly Gln Leu Arg Pro Val Leu

1025 1030 1035

Val Arg Asp Arg Ser Ile Ala Ser Ala Val Pro Ala Ala Glu Lys

1040 1045 1050

Ala Glu Arg Ala Asn Pro Ala His Ile Ala Ala Pro Phe Ala Gly

1055 1060 1065

Val Val Thr Val Ser Val Ala Glu Gly Gly Glu Val Ala Ala Gly

1070 1075 1080

Gln Thr Val Ala Thr Ile Glu Ala Met Lys Met Glu Ala Ala Ile

1085 1090 1095

Thr Ala Pro Lys Ala Gly Thr Val Glu Arg Ile Ala Val Ser Glu

1100 1105 1110

Thr Ala Gln Val Glu Gly Gly Asp Leu Leu Met Val Ile Ser

1115 1120 1125

<210> 290

<211> 309

<212> DNA

<213> Artificial Sequence

<220>

<223> Strain 331829 gapA truncation

<400> 290

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcaa tgaaggtctc cggcaacacc gtcaaggttg tttcctggta 240

cgacaacgag tggggctaca cctgccagct cctgcgtctg accgagctcg tagcttccaa 300

gctctttag 309

<210> 291

<211> 216

<212> DNA

<213> Artificial Sequence

<220>

<223> Strain 331831 gapA truncation

<400> 291

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggca ccaaggacaa caagaccctt 180

tccacccttc tcaagttcga ctcgatctcg aggtag 216

<210> 292

<211> 282

<212> DNA

<213> Artificial Sequence

<220>

<223> Strain 331897 gapA truncation

<400> 292

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc cgccacaacg ttgacatcgt ga 282

<210> 293

<211> 102

<212> DNA

<213> Artificial Sequence

<220>

<223> Strain 331904 gapA truncation

<400> 293

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttcgtcgca 60

ggtgccaaga aggtcatcat ctcccgatgc aaacgcggct aa 102

<210> 294

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> GapAv9-L224S (331772)

<400> 294

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Ser

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Ser

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 295

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> GapAv9-L224S (331772)

<400> 295

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcaagccacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagaga gcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 296

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> GapAv9-H110D (331828)

<400> 296

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala Asp Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Ser

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 297

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> GapAv9-H110D (331828)

<400> 297

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctgac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcaagccacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 298

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> GapAv9-K37P (331009)

<400> 298

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Pro Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Ser

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 299

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> GapAv9-K37P (331009)

<400> 299

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcacccc ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcaagccacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 300

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> GapAv9-Y140G (331005)

<400> 300

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Gly Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Ser

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 301

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> GapAv9-Y140G (331005)

<400> 301

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtccggc 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcaagccacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 302

<211> 334

<212> PRT

<213> Artificial Sequence

<220>

<223> gapAv9 from Corynebacterium glutamicum

<400> 302

Met Thr Ile Arg Val Gly Ile Asn Gly Phe Gly Arg Ile Gly Arg Asn

1 5 10 15

Phe Phe Arg Ala Ile Leu Glu Arg Ser Asp Asp Leu Glu Val Val Ala

20 25 30

Val Asn Gly Thr Lys Asp Asn Lys Thr Leu Ser Thr Leu Leu Lys Phe

35 40 45

Asp Ser Ile Met Gly Arg Leu Gly Gln Glu Val Glu Tyr Asp Asp Asp

50 55 60

Ser Ile Thr Val Gly Gly Lys Arg Ile Ala Val Tyr Ala Glu Arg Asp

65 70 75 80

Pro Lys Asn Leu Asp Trp Ala Ala His Asn Val Asp Ile Val Ile Glu

85 90 95

Ser Thr Gly Phe Phe Thr Asp Ala Asn Ala Ala Lys Ala His Ile Glu

100 105 110

Ala Gly Ala Lys Lys Val Ile Ile Ser Ala Pro Ala Ser Asn Glu Asp

115 120 125

Ala Thr Phe Val Tyr Gly Val Asn His Glu Ser Tyr Asp Pro Glu Asn

130 135 140

His Asn Val Ile Ser Gly Ala Ser Cys Thr Thr Asn Cys Leu Ala Pro

145 150 155 160

Met Ala Lys Val Leu Asn Asp Lys Phe Gly Ile Glu Asn Gly Leu Met

165 170 175

Thr Thr Val His Ala Tyr Thr Gly Asp Gln Arg Leu His Asp Ala Ser

180 185 190

His Arg Asp Leu Arg Arg Ala Arg Ala Ala Ala Val Asn Ile Val Pro

195 200 205

Thr Ser Thr Gly Ala Ala Lys Ala Val Ala Leu Val Leu Pro Glu Leu

210 215 220

Lys Gly Lys Leu Asp Gly Tyr Ala Leu Arg Val Pro Val Ile Thr Gly

225 230 235 240

Ser Ala Thr Asp Leu Thr Phe Asn Thr Lys Ser Glu Val Thr Val Glu

245 250 255

Ser Ile Asn Ala Ala Ile Lys Glu Ala Ala Val Gly Glu Phe Gly Glu

260 265 270

Thr Leu Ala Tyr Ser Glu Glu Pro Leu Val Ser Thr Asp Ile Val His

275 280 285

Asp Ser His Gly Ser Ile Phe Asp Ala Gly Leu Thr Lys Val Ser Gly

290 295 300

Asn Thr Val Lys Val Val Ser Trp Tyr Asp Asn Glu Trp Gly Tyr Thr

305 310 315 320

Cys Gln Leu Leu Arg Leu Thr Glu Leu Val Ala Ser Lys Leu

325 330

<210> 303

<211> 1005

<212> DNA

<213> Artificial Sequence

<220>

<223> gapAv9 from Corynebacterium glutamicum

<400> 303

atgaccattc gtgttggtat taacggattt ggccgtatcg gacgtaactt cttccgcgca 60

attctggagc gcagcgacga tctcgaggta gttgcagtca acggcaccaa ggacaacaag 120

accctttcca cccttctcaa gttcgactcc atcatgggcc gccttggcca ggaagttgaa 180

tacgacgatg actccatcac cgttggtggc aagcgcatcg ctgtttacgc agagcgcgat 240

ccaaagaacc tggactgggc tgcacacaac gttgacatcg tgatcgagtc caccggcttc 300

ttcaccgatg caaacgcggc taaggctcac atcgaagcag gtgccaagaa ggtcatcatc 360

tccgcaccag caagcaacga agacgcaacc ttcgtttacg gtgtgaacca cgagtcctac 420

gatcctgaga accacaacgt gatctccggc gcatcttgca ccaccaactg cctcgcacca 480

atggcaaagg tcctgaacga caagttcggc atcgagaacg gtctcatgac caccgttcac 540

gcatacaccg gcgaccagcg cctgcacgat gcaagccacc gcgacctgcg tcgtgcacgt 600

gcagcagcag tcaacatcgt tcctacctcc accggtgcag ctaaggctgt tgctctggtt 660

ctcccagagc tcaagggcaa gcttgacggc tacgcacttc gcgttccagt tatcaccggt 720

tccgcaaccg acctgacctt caacaccaag tctgaggtca ccgttgagtc catcaacgct 780

gcaatcaagg aagctgcagt cggcgagttc ggcgagaccc tggcttactc cgaagagcca 840

ctggtttcca ccgacatcgt ccacgattcc cacggctcca tcttcgacgc tggcctgacc 900

aaggtctccg gcaacaccgt caaggttgtt tcctggtacg acaacgagtg gggctacacc 960

tgccagctcc tgcgtctgac cgagctcgta gcttccaagc tctaa 1005

<210> 304

<211> 1266

<212> DNA

<213> Corynebacterium glutamicum

<400> 304

atggccctgg tcgtacagaa atatggcggt tcctcgcttg agagtgcgga acgcattaga 60

aacgtcgctg aacggatcgt tgccaccaag aaggctggaa atgatgtcgt ggttgtctgc 120

tccgcaatgg gagacaccac ggatgaactt ctagaacttg cagcggcagt gaatcccgtt 180

ccgccagctc gtgaaatgga tatgctcctg actgctggtg agcgtatttc taacgctctc 240

gtcgccatgg ctattgagtc ccttggcgca gaagctcaat ctttcactgg ctctcaggct 300

ggtgtgctca ccaccgagcg ccacggaaac gcacgcattg ttgacgtcac accgggtcgt 360

gtgcgtgaag cactcgatga gggcaagatc tgcattgttg ctggttttca gggtgttaat 420

aaagaaaccc gcgatgtcac cacgttgggt cgtggtggtt ctgacaccac tgcagttgcg 480

ttggcagctg ctttgaacgc tgatgtgtgt gagatttact cggacgttga cggtgtgtat 540

accgctgacc cgcgcatcgt tcctaatgca cagaagctgg aaaagctcag cttcgaagaa 600

atgctggaac ttgctgctgt tggctccaag attttggtgc tgcgcagtgt tgaatacgct 660

cgtgcattca atgtgccact tcgcgtacgc tcgtcttata gtaatgatcc cggcactttg 720

attgccggct ctatggagga tattcctgtg gaagaagcag tccttaccgg tgtcgcaacc 780

gacaagtccg aagccaaagt aaccgttctg ggtatttccg ataagccagg cgagactgcc 840

aaggttttcc gtgcgttggc tgatgcagaa atcaacattg acatggttct gcagaacgtc 900

ttctctgtgg aagacggcac caccgacatc acgttcacct gccctcgcgc tgacggacgc 960

cgtgcgatgg agatcttgaa gaagcttcag gttcagggca actggaccaa tgtgctttac 1020

gacgaccagg tcggcaaagt ctccctcgtg ggtgctggca tgaagtctca cccaggtgtt 1080

accgcagagt tcatggaagc tctgcgcgat gtcaacgtga acatcgaatt gatttccacc 1140

tctgagatcc gcatttccgt gctgatccgt gaagatgatc tggatgctgc tgcacgtgca 1200

ttgcatgagc agttccagct gggcggcgaa gacgaagccg tcgtttatgc aggcaccgga 1260

cgctaa 1266

<210> 305

<211> 421

<212> PRT

<213> Corynebacterium glutamicum

<400> 305

Met Ala Leu Val Val Gln Lys Tyr Gly Gly Ser Ser Leu Glu Ser Ala

1 5 10 15

Glu Arg Ile Arg Asn Val Ala Glu Arg Ile Val Ala Thr Lys Lys Ala

20 25 30

Gly Asn Asp Val Val Val Val Cys Ser Ala Met Gly Asp Thr Thr Asp

35 40 45

Glu Leu Leu Glu Leu Ala Ala Ala Val Asn Pro Val Pro Pro Ala Arg

50 55 60

Glu Met Asp Met Leu Leu Thr Ala Gly Glu Arg Ile Ser Asn Ala Leu

65 70 75 80

Val Ala Met Ala Ile Glu Ser Leu Gly Ala Glu Ala Gln Ser Phe Thr

85 90 95

Gly Ser Gln Ala Gly Val Leu Thr Thr Glu Arg His Gly Asn Ala Arg

100 105 110

Ile Val Asp Val Thr Pro Gly Arg Val Arg Glu Ala Leu Asp Glu Gly

115 120 125

Lys Ile Cys Ile Val Ala Gly Phe Gln Gly Val Asn Lys Glu Thr Arg

130 135 140

Asp Val Thr Thr Leu Gly Arg Gly Gly Ser Asp Thr Thr Ala Val Ala

145 150 155 160

Leu Ala Ala Ala Leu Asn Ala Asp Val Cys Glu Ile Tyr Ser Asp Val

165 170 175

Asp Gly Val Tyr Thr Ala Asp Pro Arg Ile Val Pro Asn Ala Gln Lys

180 185 190

Leu Glu Lys Leu Ser Phe Glu Glu Met Leu Glu Leu Ala Ala Val Gly

195 200 205

Ser Lys Ile Leu Val Leu Arg Ser Val Glu Tyr Ala Arg Ala Phe Asn

210 215 220

Val Pro Leu Arg Val Arg Ser Ser Tyr Ser Asn Asp Pro Gly Thr Leu

225 230 235 240

Ile Ala Gly Ser Met Glu Asp Ile Pro Val Glu Glu Ala Val Leu Thr

245 250 255

Gly Val Ala Thr Asp Lys Ser Glu Ala Lys Val Thr Val Leu Gly Ile

260 265 270

Ser Asp Lys Pro Gly Glu Thr Ala Lys Val Phe Arg Ala Leu Ala Asp

275 280 285

Ala Glu Ile Asn Ile Asp Met Val Leu Gln Asn Val Phe Ser Val Glu

290 295 300

Asp Gly Thr Thr Asp Ile Thr Phe Thr Cys Pro Arg Ala Asp Gly Arg

305 310 315 320

Arg Ala Met Glu Ile Leu Lys Lys Leu Gln Val Gln Gly Asn Trp Thr

325 330 335

Asn Val Leu Tyr Asp Asp Gln Val Gly Lys Val Ser Leu Val Gly Ala

340 345 350

Gly Met Lys Ser His Pro Gly Val Thr Ala Glu Phe Met Glu Ala Leu

355 360 365

Arg Asp Val Asn Val Asn Ile Glu Leu Ile Ser Thr Ser Glu Ile Arg

370 375 380

Ile Ser Val Leu Ile Arg Glu Asp Asp Leu Asp Ala Ala Ala Arg Ala

385 390 395 400

Leu His Glu Gln Phe Gln Leu Gly Gly Glu Asp Glu Ala Val Val Tyr

405 410 415

Ala Gly Thr Gly Arg

420

<210> 306

<211> 14

<212> DNA

<213> Artificial Sequence

<220>

<223> Artificial ribosome binding site

<400> 306

agctggtgga atat 14

<210> 307

<211> 10

<212> DNA

<213> Artificial Sequence

<220>

<223> Artificial ribosome binding site

<400> 307

aggaggttgt 10

<210> 308

<211> 12

<212> DNA

<213> Artificial Sequence

<220>

<223> Artificial ribosome binding site

<400> 308

tgacacctat tg 12
