Biosynthesis method for protein heterocatenanes

Document No.: 1250171. Publication date: 2020-08-21.

Reading note: this technology, "Biosynthesis method for protein heterocatenanes", was designed by Zhang Wenbin and Liu Yajie on 2020-05-21. Its main content is as follows: The invention discloses a biosynthesis method for protein heterocatenanes. By mimicking the multi-step post-translational modification process in the synthesis of natural topological proteins, and on the basis of rational gene sequence design combined with in situ assembly, chain cleavage and site-specific cyclization, a biosynthesis system based on two orthogonal coupling modes is developed that enables modular synthesis of protein heterocatenanes. The method uses the intramolecular dimerization of entanglement motifs such as the p53dim domain to increase the yield of protein heterocatenanes and requires no additional extracellular reaction. When a polypeptide-protein reaction pair is combined with a split intein, branched protein heterocatenanes can be biosynthesized; when two orthogonal split inteins are combined, fully backbone-cyclized protein heterocatenanes can be obtained. The invention expands the synthetic methods available for topological proteins and provides a concise route to the biosynthesis of protein heterocatenanes.

1. A biosynthesis method for protein heterocatenanes, comprising the following steps:

1) designing a protein precursor sequence for the protein heterocatenane, the basic structure of which comprises, from N-terminus to C-terminus: L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein X represents an entanglement motif that forms a dimer, which may be homodimeric or heterodimeric, i.e., the two X's may be the same or different; L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions intracellularly, and the two pairs may be two orthogonal polypeptide-protein reaction pairs, or a combination of a polypeptide-protein reaction pair and a split intein, or two orthogonal split inteins; when L1-1/L1-2 is a polypeptide-protein reaction pair, the in situ cleavage site inserted between L1-2 and L2-1 is an essential element and is cleaved in situ by a protease co-expressed in the cell; otherwise, the in situ cleavage site is an optional element; a target protein sequence is inserted into the basic structure at an insertion site selected from: before and/or after the X domain, and the N-terminus and/or C-terminus of the polypeptide-protein reaction pair;

2) constructing the coding gene sequence corresponding to the protein precursor sequence of step 1) and introducing it into an expression vector;

3) transforming the expression vector constructed in step 2) into cells for expression and, if necessary, co-expressing in the cells a protease that cleaves the in situ cleavage site;

4) purifying the fusion protein obtained in step 3) to obtain the corresponding protein heterocatenane.

2. The method of claim 1, wherein the entanglement motif in step 1) is a p53dim domain or a p53dim mutant capable of forming a dimeric structure, and the amino acid sequence of the p53dim domain is shown as SEQ ID NO: 3 in the sequence listing.

3. The method of claim 1, wherein the polypeptide-protein reaction pair of step 1) is selected from a SpyTag-SpyCatcher reaction pair and a probe tag-probe catcher reaction pair.

4. The method of claim 3, wherein the amino acid sequences of SpyTag and SpyCatcher in the SpyTag-SpyCatcher reaction pair are shown as SEQ ID NO: 1 and SEQ ID NO: 2 in the sequence listing, respectively.

5. The method of claim 1, wherein the split intein of step 1) is an NpuDnaE split intein, the cyclization motif consisting of IntC1 and IntN1 or of IntC2 and IntN2, and the amino acid sequences of IntC1, IntN1, IntC2 and IntN2 are shown as SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 7 in the sequence listing, respectively.

6. The method of claim 1, wherein the in situ cleavage site designed in step 1) is the recognition sequence ETVRFQG of TVMV protease or the recognition sequence ENLYFQG of TEV protease, and the TVMV protease or TEV protease is correspondingly co-expressed in step 3).

7. The method of claim 1, wherein a histidine tag sequence is introduced into the protein precursor sequence in step 1), and the protein is purified by nickel column affinity chromatography in step 4).

8. The method of claim 1, wherein the protein precursor sequence designed in step 1) has the basic structure SpyCatcher-p53dim-SpyTag-IntC1-p53dim-IntN1, which comprises, in order from N-terminus to C-terminus, the cyclization motif SpyCatcher, the entanglement motif p53dim domain, the cyclization motif SpyTag, the split intein C-terminal part IntC1, the entanglement motif p53dim domain and the split intein N-terminal part IntN1; the recognition sequence of TVMV protease is inserted between SpyTag and IntC1, and a histidine tag sequence is introduced before the second p53dim domain; one or more identical or different target proteins are fused at sites selected from: the N-terminus of SpyCatcher, the C-terminus of SpyTag, and before and/or after the p53dim domain.

9. The method of claim 1, wherein the protein precursor sequence designed in step 1) has the basic structure IntC1-p53dim-IntN1-IntC2-p53dim-IntN2, which comprises, in order from N-terminus to C-terminus, the split intein C-terminal part IntC1, the entanglement motif p53dim domain, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entanglement motif p53dim domain and the split intein N-terminal part IntN2; a histidine tag sequence is introduced before the second p53dim domain; one or more identical or different target proteins are inserted before and/or after the two p53dim domains.

10. The method of claim 1, wherein in step 4), for a protein heterocatenane into which a histidine tag sequence has been introduced, the expressed protein is purified by nickel column affinity chromatography, and the purity of the protein heterocatenane is further improved by combining gradient elution or size exclusion chromatography.

Technical Field

The invention relates to a biosynthesis method for protein heterocatenanes, and in particular to a biosynthesis system based on polypeptide-protein reaction pairs and/or split inteins, and to a method that uses this system to construct multidomain protein heterocatenanes through two orthogonal coupling cyclization modes.

Background

In nature, many biological macromolecules exist in specific topologies that are closely related to their biological functions. The natural topological proteins found to date include cyclic proteins, knotted proteins, lasso proteins and protein catenanes, among others. Because constructing a cyclic protein only requires coupling within a single polypeptide chain, cyclic proteins are currently the focus of research on artificially synthesized topological proteins, and they generally show markedly improved thermal stability. Owing to the complexity of protein folding mechanisms, it is comparatively difficult to control protein topology by controlling the entanglement between polypeptide chains. The simplest catenane, the [2]catenane, consists of two mechanically interlocked rings. A corresponding protein heterocatenane could therefore combine the advantages of cyclic proteins and achieve synergistic effects by regulating the relative positions of the two rings, and such a structure has not been found in nature. The development of preparation methods for protein heterocatenanes is therefore a very attractive research direction.

Reports on the artificial synthesis of protein catenanes are still relatively few. The synthetic strategies can be roughly divided into three types, all of which rely on the folded structure of the protein to achieve mechanical interlocking. The first type uses the tetramerization domain p53tet of the tumor suppressor protein p53, or its mutant dimerization domain p53dim, to guide the entanglement of the molecular chains, and then closes the rings with highly efficient and specific native chemical ligation or with the SpyTag-SpyCatcher reaction pair, thereby achieving the synthesis of protein homocatenanes. The second type is based on topological conversion of lasso peptides, which are converted stepwise into higher-order catenanes through enzymatic cleavage and assembly. In the third type, SpyCatcher is split into BDTag and SpyStapler, the three elements are rationally recombined on the basis of the folded structure of the SpyTag-SpyCatcher reaction pair, and this is combined with split-intein-mediated cyclization and the autocatalytic formation of isopeptide bonds; this achieved the synthesis of a protein heterocatenane for the first time, but the reaction does not go to completion and the overall purification process is cumbersome. Further developing methods for the biosynthesis of protein heterocatenanes on the basis of the assembly-reaction cooperative strategy will help in studying the influence of topology on the properties and functions of proteins, and will lay a foundation for their application in the biomedical field.

Disclosure of Invention

The invention aims to provide a biosynthesis strategy for protein heterocatenanes that enables the efficient construction of multidomain protein heterocatenanes without an additional extracellular reaction step.

By mimicking the multi-step post-translational modification process in the synthesis of natural topological proteins, and on the basis of rational gene sequence design combined with in situ assembly, chain cleavage and site-specific cyclization, the invention develops a synthesis system based on two orthogonal coupling modes. The system enables modular synthesis of protein heterocatenanes whose structures are either branched or fully backbone-cyclized.

The basic structure of the protein precursor sequence for preparing the protein heterocatenane comprises: L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein:

(1) X represents an entanglement motif that can form a dimer and is one of the key elements for heterocatenane formation. The two X's may be the same, e.g., entanglement motifs that form homodimers such as the tumor-suppressor-derived p53dim domain or the Helicobacter pylori HP0242 protein; the two X's may also be different, e.g., heterodimeric motifs derived from the above dimeric motifs by substitution of amino acid residues or the like, or entangled heterodimeric motifs that occur in nature.

(2) L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that can undergo orthogonal coupling reactions intracellularly, and are the other key element for heterocatenane formation. The cyclization motifs can be selected from polypeptide-protein reaction pairs, split inteins and the like; to avoid excessive side reactions, the two cyclization modes should have a certain degree of orthogonality. In certain cases an in situ cleavage site, such as the recognition site of TVMV protease, is inserted between L1-2 and L2-1; co-expressing the protease in the cell cleaves this site in situ and thereby enables the synthesis of the heterocatenane.

The selection of the two pairs of cyclization motifs mainly covers the following three cases:

① Two orthogonal polypeptide-protein reaction pairs, such as a SpyTag-SpyCatcher reaction pair and a probe tag-probe catcher reaction pair. In this case, an in situ cleavage site must be inserted between the two reaction pairs so that a co-expressed protease cleaves the single polypeptide chain into two polypeptide chains.

② A polypeptide-protein reaction pair combined with a split intein, e.g., a SpyTag-SpyCatcher reaction pair with the NpuDnaE split intein (comprising a C-terminal part and an N-terminal part). When the polypeptide-protein reaction pair comes first and the split intein second, i.e., L1-1/L1-2 is a polypeptide-protein reaction pair and L2-1/L2-2 is a split intein, the cyclization reaction of L2-1/L2-2 must be initiated by in situ cleavage, because the intracellular cyclization of a polypeptide-protein reaction pair is a side-chain coupling reaction and the resulting complex remains in the final structure. When the split intein comes first and the polypeptide-protein reaction pair second, i.e., L1-1/L1-2 is a split intein and L2-1/L2-2 is a polypeptide-protein reaction pair, in situ cleavage is optional, because split-intein-mediated cyclization is a backbone coupling and the intein leaves the precursor protein by self-splicing after cyclization.

③ Two orthogonal split inteins, such as IntC1/IntN1 and IntC2/IntN2, which are formed by two different splitting patterns of the NpuDnaE split intein. Other split inteins, such as gp41-1, gp41-8, NrdJ-1 and IMPDH-1, may also be used, as long as the two split inteins have a certain degree of orthogonality. The advantages of split-intein-mediated cyclization are that it is a backbone cyclization and that the intein leaves by self-splicing, so that few redundant amino acids remain. When two orthogonal split inteins are used, the in situ cleavage site need not be inserted.
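The three cases above reduce to a small decision rule, which can be sketched as follows. This is only an illustrative sketch: the `Motif` enum and both function names are hypothetical and are not part of the invention; the rule itself is taken from the text (the cleavage site is essential exactly when L1-1/L1-2 is a polypeptide-protein reaction pair).

```python
from enum import Enum


class Motif(Enum):
    """Kind of cyclization motif pair (hypothetical names, for illustration only)."""
    REACTION_PAIR = "polypeptide-protein reaction pair"  # side-chain coupling, stays in product
    SPLIT_INTEIN = "split intein"                        # backbone coupling, splices out


def cleavage_site_required(first_pair: Motif) -> bool:
    """Return True when the in situ cleavage site between L1-2 and L2-1 is essential.

    Per the description, the site is essential when L1-1/L1-2 is a
    polypeptide-protein reaction pair, and optional otherwise.
    """
    return first_pair is Motif.REACTION_PAIR


def fully_backbone_cyclized(first_pair: Motif, second_pair: Motif) -> bool:
    """Return True only when both rings are closed by split inteins (case 3)."""
    return first_pair is Motif.SPLIT_INTEIN and second_pair is Motif.SPLIT_INTEIN


if __name__ == "__main__":
    for first in Motif:
        for second in Motif:
            print(f"L1 = {first.value}, L2 = {second.value}: "
                  f"cleavage site required = {cleavage_site_required(first)}, "
                  f"fully backbone-cyclized = {fully_backbone_cyclized(first, second)}")
```

Note that the rule depends only on the first pair: a split intein at L1-1/L1-2 removes itself from the chain, so no protease step is needed to free L2-1.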

By inserting one or more identical or different target proteins into the basic structure of the above protein precursor sequence, a protein heterocatenane containing the target proteins can be constructed. The insertion site of a target protein may be within a ring, i.e., before and/or after the X domain. Because cyclization mediated by a polypeptide-protein reaction pair is a side-chain coupling that leaves the N-terminus and C-terminus intact after cyclization, the insertion site of a target protein may also be outside the ring, i.e., at the N-terminus and/or C-terminus of the polypeptide-protein reaction pair, which yields a branched heterocatenane.

For the gene construction with target proteins, see FIG. 1. In the protein precursor sequence L1-1-X-POI1-L1-2-(TVMV)-L2-1-X-POI2-L2-2, L1-1/L1-2 and L2-1/L2-2 represent the cyclization motifs of two orthogonal cyclization modes, X represents the entanglement motif, and POI1 and POI2 represent protein of interest 1 and protein of interest 2. The TVMV site is the recognition site of TVMV protease, which is recognized and cleaved in situ by the co-expressed TVMV protease. Introducing a purification tag (e.g., a histidine tag sequence) before the second entanglement motif X facilitates purification of the synthesized heterocatenane. The sites at which target proteins can be fused are illustrated below:

① When L1-1/L1-2 and L2-1/L2-2 are orthogonal polypeptide-protein reaction pairs, cyclization proceeds by side-chain coupling, in situ cleavage is required, and the complexes formed, L1 and L2, remain in the final catenane structure. Therefore, in addition to inserting the target proteins POI1 and POI2 into the two rings to form the heterocatenane cat-L1(X-POI1)-L2(X-POI2), further target proteins (POI3, POI4, POI5, POI6) can be fused to the N-termini and C-termini of the polypeptide-protein reaction pairs to create branched heterocatenanes. The target proteins are inserted at the following positions: POI3-L1-1-X-POI1-L1-2-POI4-(TVMV)-POI5-L2-1-X-POI2-L2-2-POI6.

② When L1-1/L1-2 and L2-1/L2-2 are a combination of a polypeptide-protein reaction pair and a split intein, the complex formed by the split intein leaves by self-splicing. If L1-1/L1-2 is the polypeptide-protein reaction pair and L2-1/L2-2 is the split intein, inserting the target proteins POI1 and POI2 into the two rings forms the heterocatenane cat-L1(X-POI1)-(X-POI2); further fusing target proteins (POI3, POI4) to the N-terminus and C-terminus of the L1-1/L1-2 reaction pair creates branched heterocatenanes, with the target proteins inserted at the following positions: POI3-L1-1-X-POI1-L1-2-POI4-(TVMV)-L2-1-X-POI2-L2-2. Conversely, if L1-1/L1-2 is the split intein and L2-1/L2-2 is the polypeptide-protein reaction pair, the heterocatenane cat-(X-POI1)-L2(X-POI2) is formed; further fusing target proteins (POI3, POI4) to the N-terminus and C-terminus of the L2-1/L2-2 reaction pair creates branched heterocatenanes, with the target proteins inserted at the following positions: L1-1-X-POI1-L1-2-(TVMV)-POI3-L2-1-X-POI2-L2-2-POI4.

③ When L1-1/L1-2 and L2-1/L2-2 are orthogonal split inteins, the complexes formed by the split inteins leave by self-splicing and mediate backbone cyclization. The target proteins POI1 and POI2 are inserted into the two rings, and the heterocatenane formed is cat-(X-POI1)-(X-POI2). Both cyclic protein motifs in this heterocatenane are backbone-cyclized and contain no redundant components other than the entanglement motifs and the target proteins.
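The insertion patterns listed in ① to ③ follow one rule: each ring carries an inner target protein after X, and only a ring closed by a polypeptide-protein reaction pair can additionally carry flanking target proteins. The sketch below assembles the layout strings from that rule. The function names and the sequential numbering of flanking POIs are assumptions made for illustration, and the TVMV site is always written out even though it is optional when L1-1/L1-2 is a split intein.

```python
def precursor_layout(first_is_pair: bool, second_is_pair: bool) -> str:
    """Assemble the N-to-C module order of the precursor with every allowed POI site filled.

    first_is_pair / second_is_pair: True if that ring is closed by a
    polypeptide-protein reaction pair, False if it is closed by a split intein.
    Flanking POIs are numbered sequentially from POI3 (an illustrative convention).
    """
    flanking = iter(f"POI{i}" for i in range(3, 7))
    parts = []

    def add_ring(prefix: str, inner_poi: str, is_pair: bool) -> None:
        if is_pair:                       # N-terminal fusion outside the ring
            parts.append(next(flanking))
        parts.extend([f"{prefix}-1", "X", inner_poi, f"{prefix}-2"])
        if is_pair:                       # C-terminal fusion outside the ring
            parts.append(next(flanking))

    add_ring("L1", "POI1", first_is_pair)
    parts.append("(TVMV)")                # in situ cleavage site
    add_ring("L2", "POI2", second_is_pair)
    return "-".join(parts)


def product_name(first_is_pair: bool, second_is_pair: bool) -> str:
    """Name the product; a reaction-pair complex (L1 or L2) remains in its ring."""
    ring1 = ("L1" if first_is_pair else "") + "(X-POI1)"
    ring2 = ("L2" if second_is_pair else "") + "(X-POI2)"
    return f"cat-{ring1}-{ring2}"


if __name__ == "__main__":
    for first, second in [(True, True), (True, False), (False, True), (False, False)]:
        print(precursor_layout(first, second), "->", product_name(first, second))
```

Run as a script, this prints one layout and product name for each of the four combinations, which can be compared against the strings given in the text.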

The strategy of the invention for protein heterocatenane biosynthesis takes the following aspects into account: (1) mechanical interlocking is achieved with entanglement motifs (X) such as the p53dim domain, and the yield of heterocatenane is increased by converting intermolecular dimerization into intramolecular dimerization; (2) the cyclization modes must be able to proceed intracellularly, and the most commonly used at present are polypeptide-protein reaction pairs and split inteins; (3) the two cyclization modes must have a certain degree of orthogonality to avoid excessive side reactions, e.g., a SpyTag-SpyCatcher reaction pair is combined with a split intein, or two split inteins with a certain degree of orthogonality are selected; (4) a split intein typically comprises a larger N-terminal part (IntN) and a relatively smaller C-terminal part (IntC), and the reaction is hindered when IntC is located within the chain; in that case the nascent polypeptide chain can be cleaved in situ by a co-expressed protease, which initiates the split-intein-mediated trans-splicing reaction.

The split intein of the present invention is preferably the NpuDnaE split intein, which is naturally split into IntC1 with 36 amino acids and IntN1 with 102 amino acids. IntC2, containing 15 amino acids, and the corresponding IntN2, containing 123 amino acids, were obtained by systematically truncating the IntC part and also show good trans-splicing efficiency. Although IntC1 and IntN2 show some reactivity with each other, IntC2 cannot react with IntN1, so the two pairs exhibit a certain degree of orthogonality.
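The two splitting patterns and their cross-reactivity can be tabulated as a quick consistency check. The sketch below only restates the fragment lengths and the qualitative reactivity described above; the labels "native", "good", "partial" and "none" are shorthand chosen here for illustration, not measured values.

```python
# Fragment lengths (amino acids) as stated in the description.
FRAGMENT_LENGTH = {"IntC1": 36, "IntN1": 102, "IntC2": 15, "IntN2": 123}

# Qualitative reactivity of each IntC/IntN combination (illustrative labels).
REACTIVITY = {
    ("IntC1", "IntN1"): "native",   # natural split site
    ("IntC2", "IntN2"): "good",     # truncated IntC, still splices well
    ("IntC1", "IntN2"): "partial",  # some cross-reactivity
    ("IntC2", "IntN1"): "none",     # no reaction
}


def split_total(int_c: str, int_n: str) -> int:
    """Combined length of one IntC/IntN pair."""
    return FRAGMENT_LENGTH[int_c] + FRAGMENT_LENGTH[int_n]


def cross_reactions() -> dict:
    """Return only the mismatched combinations and their reactivity."""
    return {pair: level for pair, level in REACTIVITY.items()
            if pair[0][-1] != pair[1][-1]}


if __name__ == "__main__":
    # Both splitting patterns partition the same intein, so the totals should agree.
    print(split_total("IntC1", "IntN1"), split_total("IntC2", "IntN2"))
    print(cross_reactions())
```

Both totals come to 138 residues, consistent with the two pairs being different splits of the same intein; the single partially reactive mismatch is why the text speaks of "a certain degree" of orthogonality rather than full orthogonality.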

The biosynthesis systems for protein heterocatenanes in the invention all use the intramolecular dimerization of entanglement motifs such as the p53dim domain to guide the entanglement of the polypeptide chains, but they differ in how orthogonal coupling is achieved. Intracellular cyclization based on a polypeptide-protein reaction pair is a side-chain coupling reaction that leaves the N-terminus and C-terminus intact, and the complex formed remains in the final structure, so branched protein heterocatenanes can be prepared by further fusing other target proteins. In contrast, intracellular cyclization based on a split intein joins the two ends of the peptide chain with a native peptide bond to achieve backbone cyclization, and the intein leaves the precursor protein by self-splicing.

The biosynthesis method for protein heterocatenanes mainly comprises the following steps:

1) designing a protein precursor sequence for the protein heterocatenane, the basic structure of which comprises, from N-terminus to C-terminus: L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein X represents an entanglement motif that forms a dimer; L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions intracellularly, and the two pairs may be two orthogonal polypeptide-protein reaction pairs, or a combination of a polypeptide-protein reaction pair and a split intein, or two orthogonal split inteins; when L1-1/L1-2 is a polypeptide-protein reaction pair, the in situ cleavage site inserted between L1-2 and L2-1 is an essential element and is cleaved in situ by a protease co-expressed in the cell; otherwise, the in situ cleavage site is an optional element; a target protein sequence is inserted into the basic structure at an insertion site selected from: before and/or after the X domain, and the N-terminus and/or C-terminus of the polypeptide-protein reaction pair;

2) constructing the coding gene sequence corresponding to the protein precursor sequence of step 1) and introducing it into an expression vector;

3) transforming the expression vector constructed in step 2) into cells for expression and, if necessary, co-expressing in the cells a protease that cleaves the in situ cleavage site;

4) purifying the fusion protein obtained in step 3) to obtain the corresponding protein heterocatenane.

In step 1), the polypeptide-protein reaction pair is preferably a SpyTag-SpyCatcher reaction pair or a probe tag-probe catcher reaction pair. Typical amino acid sequences of SpyTag and SpyCatcher are shown as SEQ ID NO: 1 and SEQ ID NO: 2 in the sequence listing, respectively; reactive SpyTag/SpyCatcher mutants may also be used. A mutant is a peptide chain derived by substituting, deleting or adding amino acid residues on the basis of the SpyTag/SpyCatcher amino acid sequence, where the substituted, deleted or added residues do not affect the SpyTag/SpyCatcher coupling reaction that forms the isopeptide bond.

In step 1), the entanglement motif X is preferably the p53dim domain derived from the tumor suppressor. The amino acid sequence of a typical p53dim domain is shown as SEQ ID NO: 3 in the sequence listing, and p53dim mutants capable of forming a similar dimeric structure can also be used. A mutant is a peptide chain derived by substituting, deleting or adding amino acid residues on the basis of the p53dim amino acid sequence, where the substituted, deleted or added residues do not affect formation of the entangled dimer.

In step 1), the N-terminal part (IntN) and C-terminal part (IntC) of the split intein, preferably NpuDnaE, constitute the cyclization motif. The amino acid sequences of IntC1, IntN1, IntC2 and IntN2 produced by the two splitting patterns are shown as SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 7 in the sequence listing, respectively. Other suitable split inteins may also be used in the present invention to achieve the biosynthesis of protein heterocatenanes.

In step 1), the in situ cleavage site is preferably the recognition sequence ETVRFQG of tobacco vein mottling virus (TVMV) protease.

In step 1), to allow the topology of the synthesized protein heterocatenane to be verified, the recognition sequence ENLYFQG of tobacco etch virus (TEV) protease can be introduced before the first entanglement motif X; this sequence can also be used as the in situ cleavage site. Further, to facilitate purification, a histidine tag sequence is introduced before the second entanglement motif X, and the protein is purified by nickel column affinity chromatography in step 4).
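When designing a precursor, it can be useful to confirm that each protease recognition sequence occurs exactly where intended. The following is a minimal sketch that scans an amino acid sequence for the two recognition sequences named above; the helper name and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
# Recognition sequences as given in the description.
RECOGNITION_SITES = {"TVMV": "ETVRFQG", "TEV": "ENLYFQG"}


def find_sites(sequence: str) -> dict:
    """Return 1-based start positions of each protease recognition sequence."""
    hits = {}
    for protease, motif in RECOGNITION_SITES.items():
        hits[protease] = [i + 1 for i in range(len(sequence))
                          if sequence.startswith(motif, i)]
    return hits


if __name__ == "__main__":
    # Toy sequence for illustration only (not a real construct).
    toy = "MGS" + "ENLYFQG" + "AAAA" + "ETVRFQG" + "HHHHHH"
    print(find_sites(toy))
```

A design check of this kind would expect exactly one hit per intended site and no hits elsewhere in the precursor, so that the co-expressed protease cuts only at the designed position.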

In step 3), when L1-1/L1-2 is a polypeptide-protein reaction pair, biosynthesis of the protein heterocatenane is achieved only if the protease for the in situ cleavage site is co-expressed; when L1-1/L1-2 is a split intein, co-expression of the protease is not necessary.

In step 4), for a protein heterocatenane into which a histidine tag sequence has been introduced, the expressed protein is purified by nickel column affinity chromatography, and the purity of the protein heterocatenane can be further improved by combining gradient elution or size exclusion chromatography.

In the examples of the present invention, as shown in FIG. 2, the following protein precursor sequences were designed:

SpyCatcher (B)-p53dim (X)-SpyTag (A)-IntC1-p53dim (X)-IntN1, abbreviated as BXA-IntC1-X-IntN1;

IntC1-p53dim (X)-POI1-IntN1-IntC2-p53dim (X)-POI2-IntN2, abbreviated as IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2.

The coding gene is introduced into the expression vector pMCSG19, which is then transformed into BL21(DE3) competent cells for expression. For the BXA-IntC1-X-IntN1 system, which requires co-expression of the protease, the BL21(DE3) competent cells also contain the pRK1037 plasmid encoding TVMV protease. IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2, in contrast, achieves biosynthesis of the protein heterocatenane whether it is expressed alone or co-expressed with TVMV protease, with no obvious difference between the two, so its expression vector can be transformed into conventional BL21(DE3) competent cells for expression. Finally, the fusion protein obtained is purified to give the corresponding protein heterocatenane.

The protein precursor expressed from the BXA-IntC1-X-IntN1 or IntC1-XPOI1-IntN1-IntC2-XPOI2-IntN2 recombinant plasmid forms an intramolecularly entangled structure through dimerization of the p53dim domains and undergoes site-specific cyclization through two orthogonal coupling modes. In the BXA-IntC1-X-IntN1 system, co-expression with TVMV protease brings about in situ cleavage, which initiates the trans-splicing reaction mediated by IntC1/IntN1; combined with the side-chain cyclization mediated by the SpyTag-SpyCatcher reaction, this finally yields the protein heterocatenane cat-BXA-X. In the IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2 system, the two pairs of split inteins undergo trans-splicing in turn and successively mediate the cyclization of the two target proteins, finally yielding the protein heterocatenane cat-XPOI1-XPOI2.

On the basis of BXA-IntC1-X-IntN1, branched protein heterocatenanes can be biosynthesized with the same co-expression approach by introducing other folded proteins, such as AffiHER2, which has high affinity for HER2, at the N-terminus of SpyCatcher and the C-terminus of SpyTag. In the IntC1-XPOI1-IntN1-IntC2-XPOI2-IntN2 system, the small ubiquitin-like modifier SUMO and the superfolder protein GFP were selected as model proteins, and the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP were biosynthesized, respectively.

The protein precursor sequences used in the biosynthesis of protein heterocatenanes are illustrated below with some specific examples:

(a) SpyCatcher (B)-p53dim (X)-SpyTag (A)-IntC1-p53dim (X)-IntN1 (BXA-IntC1-X-IntN1): from N-terminus to C-terminus, the reaction motif SpyCatcher, the entanglement motif p53dim domain, the reaction motif SpyTag, the split intein C-terminal part IntC1, the entanglement motif p53dim domain and the split intein N-terminal part IntN1, wherein the recognition sequence of TEV protease is inserted between SpyCatcher and the first p53dim domain, the recognition sequence of TVMV protease is inserted between SpyTag and IntC1, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to BXA-IntC1-X-IntN1 is shown as SEQ ID NO: 8 in the sequence listing, in which amino acid residues 8-122 are SpyCatcher, residues 132-138 are the recognition sequence of TEV protease, residues 186-198 are SpyTag, residues 143-180 and 274-311 are the p53dim domains, residues 205-211 are the recognition sequence of TVMV protease, residues 221-255 are IntC1, residues 261-266 are the 6×His tag, and residues 319-420 are IntN1.
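The residue ranges quoted for SEQ ID NO: 8 can be checked for internal consistency: sorted by position they should reproduce the stated module order without overlaps. The sketch below does this using only the ranges given above; the feature labels and helper names are chosen here for illustration.

```python
# (label, first residue, last residue) as annotated for SEQ ID NO: 8 (1-based, inclusive).
FEATURES = [
    ("SpyCatcher", 8, 122),
    ("TEV site", 132, 138),
    ("SpyTag", 186, 198),
    ("p53dim-1", 143, 180),
    ("p53dim-2", 274, 311),
    ("TVMV site", 205, 211),
    ("IntC1", 221, 255),
    ("6xHis", 261, 266),
    ("IntN1", 319, 420),
]


def ordered_labels(features):
    """Return feature labels sorted from N-terminus to C-terminus."""
    return [label for label, start, _ in sorted(features, key=lambda f: f[1])]


def has_overlap(features) -> bool:
    """Return True if any two annotated ranges overlap."""
    ranges = sorted((start, end) for _, start, end in features)
    return any(prev_end >= next_start
               for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]))


def lengths(features) -> dict:
    """Length in residues of each annotated feature."""
    return {label: end - start + 1 for label, start, end in features}


if __name__ == "__main__":
    print(" - ".join(ordered_labels(FEATURES)))
    print("overlap:", has_overlap(FEATURES))
    print(lengths(FEATURES))
```

The sorted order matches the description (TEV site between SpyCatcher and the first p53dim domain, TVMV site between SpyTag and IntC1, histidine tag before the second p53dim domain), and the 7-residue protease sites agree with the lengths of ENLYFQG and ETVRFQG.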

(b) AffiHER2-SpyCatcher (B)-p53dim (X)-SpyTag (A)-AffiHER2-IntC1-p53dim (X)-IntN1 (AffiHER2-BXA-AffiHER2-IntC1-X-IntN1): from N-terminus to C-terminus, the target protein AffiHER2, the reaction motif SpyCatcher, the entanglement motif p53dim domain, the reaction motif SpyTag, the target protein AffiHER2, the split intein C-terminal part IntC1, the entanglement motif p53dim domain and the split intein N-terminal part IntN1, wherein the recognition sequence of TEV protease is inserted between SpyCatcher and the first p53dim domain, the recognition sequence of TVMV protease is inserted between the second AffiHER2 and IntC1, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to AffiHER2-BXA-AffiHER2-IntC1-X-IntN1 is shown as SEQ ID NO: 9 in the sequence listing, in which amino acid residues 6-75 and 279-348 are AffiHER2, residues 82-196 are SpyCatcher, residues 206-212 are the recognition sequence of TEV protease, residues 260-272 are SpyTag, residues 217-254 and 424-461 are the p53dim domains, residues 355-361 are the recognition sequence of TVMV protease, residues 371-405 are IntC1, residues 411-416 are the 6×His tag, and residues 469-570 are IntN1.

(c) IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2): from N-terminus to C-terminus, the split intein C-terminal part IntC1, the entanglement motif p53dim domain, the target protein SUMO, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entanglement motif p53dim domain and the split intein N-terminal part IntN2, wherein the recognition sequence of TEV protease is inserted between IntC1 and the first p53dim domain, the recognition sequence of TVMV protease is inserted between IntN1 and IntC2, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to IntC1-XSUMO-IntN1-IntC2-X-IntN2 is shown as SEQ ID NO: 10 in the sequence listing, in which amino acid residues 8-42 are IntC1, residues 48-54 are the recognition sequence of TEV protease, residues 62-99 and 358-395 are the p53dim domains, residues 100-195 are the target protein SUMO, residues 203-304 are IntN1, residues 311-317 are the recognition sequence of TVMV protease, residues 345-350 are the 6×His tag, residues 326-339 are IntC2, and residues 403-504 are IntN2.

(d) IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2): from N-terminus to C-terminus, the split intein C-terminal part IntC1, the entanglement motif p53dim domain, the target protein SUMO, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entanglement motif p53dim domain, the target protein GFP and the split intein N-terminal part IntN2, wherein the recognition sequence of TEV protease is inserted between IntC1 and the first p53dim domain, the recognition sequence of TVMV protease is inserted between IntN1 and IntC2, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to IntC1-XSUMO-IntN1-IntC2-XGFP-IntN2 is shown as SEQ ID NO: 11 in the sequence listing, in which amino acid residues 8-42 are IntC1, residues 48-54 are the recognition sequence of TEV protease, residues 62-99 and 358-395 are the p53dim domains, residues 100-195 are the target protein SUMO, residues 203-304 are IntN1, residues 311-317 are the recognition sequence of TVMV protease, residues 345-350 are the 6×His tag, residues 326-339 are IntC2, residues 403-640 are the target protein GFP, and residues 643-765 are IntN2.

The invention uses conventional characterization methods, such as sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), ultra-high performance liquid chromatography-mass spectrometry (LC-MS) and TEV protease digestion, to perform basic characterization and topology verification of the prepared protein heterocatenanes.

The invention is based on rational gene sequence design and combines in situ assembly, enzymatic cleavage and site-specific cyclization to develop a biosynthesis system based on orthogonal coupling, which is suitable for the intracellular synthesis of heterocatenanes of various functional proteins. Its main advantages are: 1) modular synthesis of heterocatenanes is achieved in a genetically encoded manner, the yield of protein heterocatenanes is improved by the intramolecular dimerization of dimeric entanglement motifs such as the p53dim domain, and a variety of entanglement motifs and coupling methods are available; 2) the multi-step post-translational modification process of natural topological protein synthesis is mimicked: polypeptide chain entanglement and two orthogonal covalent cyclization reactions are completed directly in cells, and the corresponding protein heterocatenane is obtained after expression and purification without any additional extracellular reaction; 3) in constructs containing a polypeptide-protein reaction pair, such as BXA-IntC1-X-IntN1, branched protein heterocatenanes can be biosynthesized by introducing other folded proteins at the N-terminus of SpyCatcher and the C-terminus of SpyTag; in constructs containing two orthogonal split inteins, such as IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2, protein heterocatenanes with fully cyclized backbones can be biosynthesized. Both systems expand the range of existing protein heterocatenane structures.

Drawings

FIG. 1 shows schematic structures of some of the protein heterocatenanes synthesized by the different orthogonal coupling reactions of the present invention, wherein L1-1/L1-2 and L2-1/L2-2 represent the cyclization motifs of the two orthogonal cyclization modes. When the cyclization motif is a polypeptide-protein reaction pair, side-chain coupling occurs, and the resulting complexes, L1 and L2 respectively, remain in the synthesized heterocatenane; when the cyclization motif is a split intein, backbone coupling occurs, and the intein is excised by splicing after cyclization and is not present in the synthesized heterocatenane.

FIG. 2 shows two representative schemes of the present invention for synthesizing protein heterocatenanes by orthogonal coupling reactions, wherein: (a) the biosynthesis of the protein heterocatenane is mediated by in situ enzymatic cleavage, the SpyTag-SpyCatcher reaction pair and the split intein IntC1/IntN1; (b) the biosynthesis of the protein heterocatenane is mediated by two orthogonal split inteins, IntC1/IntN1 and IntC2/IntN2.

FIG. 3 shows the size exclusion chromatogram (a), SDS-PAGE characterization before and after TEV cleavage (b), and mass spectrum (c) of the protein heterocatenane cat-BXA-X synthesized in the examples.

FIG. 4 shows the size exclusion chromatogram (a), SDS-PAGE characterization before and after TEV cleavage (b), and mass spectrum (c) of the protein heterocatenane cat-(AffiHER2-BXA-AffiHER2)-X synthesized in the examples.

FIG. 5 shows the size exclusion chromatogram (a), SDS-PAGE characterization before and after TEV cleavage (b), and mass spectrum (c) of the protein heterocatenane cat-XSUMO-X synthesized in the examples.

FIG. 6 shows the size exclusion chromatogram (a), SDS-PAGE characterization before and after TEV cleavage (b), and mass spectrum (c) of the protein heterocatenane cat-XSUMO-XGFP synthesized in the examples.

FIG. 7 shows the mass spectra of the TEV cleavage products of the protein heterocatenanes synthesized in the examples: l-BXA (a) and c-X (b) from cat-BXA-X, l-AffiHER2-BXA-AffiHER2 (c) and c-X (d) from cat-(AffiHER2-BXA-AffiHER2)-X, and l-XSUMO (e) and c-X (f) from cat-XSUMO-X.

Detailed Description

The present invention is further illustrated by the following examples, which are not intended to limit the scope of the invention in any way.

The specific steps for constructing the protein precursors and their corresponding expression systems in the biosynthesis of protein heterocatenanes are as follows:

1) For the system in which the SpyTag-SpyCatcher reaction pair and the split intein IntC1/IntN1 together mediate the synthesis of protein heterocatenanes, recombinant genetic engineering techniques were used to construct a gene sequence containing the 6×His tag (for protein purification), the SpyTag-SpyCatcher reaction pair, the p53dim domain and the split intein IntC1/IntN1, i.e., SpyCatcher (B)-p53dim (X)-SpyTag (A)-IntC1-p53dim (X)-IntN1 (BXA-IntC1-X-IntN1). On the basis of this gene sequence, the folded protein AffiHER2 was further introduced at the N-terminus of SpyCatcher and at the C-terminus of SpyTag to construct the gene sequence AffiHER2-SpyCatcher (B)-p53dim (X)-SpyTag (A)-AffiHER2-IntC1-p53dim (X)-IntN1 (AffiHER2-BXA-AffiHER2-IntC1-X-IntN1). The two gene sequences were each inserted into the expression vector pMSCG19 and transformed into BL21(DE3) competent cells containing the pRK1037 plasmid, which encodes TVMV protease, for expression. During expression, the protein heterocatenanes cat-BXA-X and cat-(AffiHER2-BXA-AffiHER2)-X are biosynthesized through in situ assembly, enzymatic cleavage and site-specific cyclization.

2) For the system in which orthogonal split inteins mediate the synthesis of protein heterocatenanes, recombinant genetic engineering techniques were used to construct gene sequences containing the 6×His tag (for protein purification), the p53dim domain, the split intein IntC1/IntN1, the split intein IntC2/IntN2 and the proteins of interest SUMO/GFP, i.e., IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2) or IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2). The two gene sequences were each inserted into the expression vector pMSCG19 and transformed into BL21(DE3) competent cells for expression. During expression, the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP are biosynthesized through in situ assembly and cyclization mediated by the orthogonal split inteins.
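The relationship between precursor layout and product in steps 1) and 2) can be illustrated with a small model. The Python sketch below is a simplification under stated assumptions: each precursor is split into the two segments that become the two rings, split intein halves are treated as excised on splicing, and the SpyCatcher (B) and SpyTag (A) modules are treated as retained. The function names are hypothetical, and the in situ cleavage rule follows the precursor design stated earlier in the document (a cleavage site is essential when the first cyclization pair is a polypeptide-protein reaction pair).

```python
# Split intein halves named in the text; assumed to be excised after splicing.
SPLIT_INTEIN_PARTS = {"IntC1", "IntN1", "IntC2", "IntN2"}


def ring_contents(segment):
    """Modules that remain in the ring formed from one precursor segment."""
    return [m for m in segment if m not in SPLIT_INTEIN_PARTS]


def catenane_name(segment1, segment2):
    """Compose a compact product name in the style used in the text."""
    ring1 = "".join(ring_contents(segment1))
    ring2 = "".join(ring_contents(segment2))
    return f"cat-{ring1}-{ring2}"


def requires_in_situ_cleavage(segment1):
    """True when the first ring is closed by a polypeptide-protein pair
    (no split intein halves in the segment), so the chain between the two
    segments must be cut by a co-expressed protease."""
    return not any(m in SPLIT_INTEIN_PARTS for m in segment1)


# Step 1): SpyTag-SpyCatcher pair plus split intein IntC1/IntN1.
print(catenane_name(["B", "X", "A"], ["IntC1", "X", "IntN1"]))
# Step 2): two orthogonal split inteins.
print(catenane_name(["IntC1", "X", "SUMO", "IntN1"], ["IntC2", "X", "GFP", "IntN2"]))
```

The simple concatenation reproduces only the compact names (cat-BXA-X, cat-XSUMO-X, cat-XSUMO-XGFP); it does not reproduce the parenthesized form used for the AffiHER2 construct. In this model the cleavage rule is true for the BXA segment, which matches the co-expression of TVMV protease from pRK1037 described in step 1).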

The prepared protein heterocatenanes were subjected to basic characterization and topology verification by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), ultra-high performance liquid chromatography-mass spectrometry (LC-MS) and TEV protease digestion.
