aminoacyl-tRNA synthetases and uses thereof

Document No.: 54216 | Publication date: 2021-09-28

Abstract: This technology, "Aminoacyl-tRNA synthetases and uses thereof", was designed by 关宏涛 on 2020-02-19. The present invention relates to aminoacyl-tRNA synthetases that aminoacylate tRNAs with 2-aminoisobutyric acid (Aib) to enable Aib to be incorporated into a growing polypeptide chain during translation, e.g., in a eubacterial host cell such as E. coli. For example, but not by way of limitation, the invention relates to novel aminoacyl-tRNA synthetases and uses thereof, as well as methods of producing polypeptides comprising one or more Aib residues.

1. A 2-aminoisobutyric acid-tRNA synthetase (AibRS) comprising the amino acid sequence of SEQ ID NO:7 or a variant thereof, wherein said variant of SEQ ID NO:7 comprises 215 Gly.

2. An AibRS according to any preceding claim; wherein said variant of SEQ ID NO. 7 is a variant of formula I, wherein formula I is:

Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Glu-Xaa-Gly-Xaa-Xaa-Xaa-Lys-Xaa-Cys-Xaa-Xaa-Cys-Gly-Xaa-Xaa-Phe-Trp-Thr-Xaa-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Cys-Gly-Asp-Xaa-Pro-Cys-Xaa-Xaa-Tyr-Xaa-Phe-Ile-Gly-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Glu-Xaa-Arg-Xaa-Xaa-Phe-Xaa-Xaa-Phe-Phe-Glu-Xaa-Xaa-Xaa-His-Xaa-Xaa-Xaa-Xaa-Arg-Tyr-Pro-Val-Xaa-Xaa-Arg-Trp-Arg-Asp-Asp-Val-Xaa-Leu-Val-Gly-Ala-Ser-Ile-Xaa-Asp-Phe-Gln-Pro-Trp-Val-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Pro-Pro-Ala-Asn-Pro-Leu-Xaa-Ile-Ser-Gln-Pro-Xaa-Ile-Arg-Xaa-Xaa-Asp-Xaa-Asp-Xaa-Val-Gly-Xaa-Xaa-Gly-Arg-His-Xaa-Thr-Xaa-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Tyr-Trp-Xaa-Xaa-Glu-Thr-Val-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Thr-Phe-Xaa-Glu-Xaa-Xaa-Trp-Xaa-Gly-Gly-Gly-Asn-Ala-Gly-Xaa-Xaa-Xaa-Glu-Val-Xaa-Xaa-Xaa-Gly-Xaa-Glu-Xaa-Ala-Thr-Leu-Val-Phe-Met-Xaa-Tyr-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Xaa-Xaa-Trp-Xaa-Ser-Xaa-Gly-Xaa-Pro-Thr-Xaa-Tyr-Asp-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Asp-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Glu-Xaa-Xaa-Tyr-Ala-Ile-Ala-Asp-His-Thr-Xaa-Xaa-Leu-Xaa-Phe-Met-Leu-Xaa-Asp-Gly-Val-Xaa-Pro-Ser-Asn-Xaa-Xaa-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Xaa-Ile-Arg-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Xaa-Xaa-Leu-Gly-Xaa-Xaa-Xaa-Pro-Leu-Xaa-Xaa-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Thr-Xaa-Xaa-Arg-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Lys-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Tyr-Xaa-Ser-His-Gly-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Xaa-Glu-Xaa-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Xaa-Xaa-Pro-Asp-Asn-Phe-Tyr-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa

wherein each Xaa of formula I is independently selected and is one or more amino acids, or absent;

wherein the variant of formula I is at least 90% identical to SEQ ID NO 7 at a position not designated as Xaa.

3. The AibRS of any preceding claim, wherein the variant of SEQ ID No. 7 is at least 41% identical to SEQ ID No. 7.

4. The AibRS of any preceding claim, wherein the variant of SEQ ID No. 7 is at least 55% identical to SEQ ID No. 7.

5. The AibRS of any preceding claim, wherein the variant of SEQ ID No. 7 is at least 90% identical to SEQ ID No. 7.

6. The AibRS of any preceding claim, wherein the variant of SEQ ID No. 7 is characterized by:

Position 192 is Trp, His, Val, Ile or Leu,

Position 193 is Ala, Leu, Ile or Gly,

Position 213 is Thr, Ser, Cys or Ala,

Position 216 is Phe or Trp,

Position 217 is Met, Ile or Leu,

Position 249 is Thr, Ser, Val or Phe,

Position 360 is Asn or Ala,

Position 459 is Glu or Ala.

7. An AibRS according to any preceding claim; wherein the variant of SEQ ID NO. 7 comprises: [192 His; 215 Gly], [192 His; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Leu; 360 Ala; 459 Ala], [192 Val; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Ile; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Leu; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 193 Gly; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 213 Ser; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Ser; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Val; 360 Ala; 459 Ala], [192 His; 213 Cys; 215 Gly; 249 Val; 360 Ala; 459 Ala] or [192 His; 213 Ala; 215 Gly; 249 Phe; 360 Ala; 459 Ala].

8. An AibRS according to any preceding claim; wherein said variant of SEQ ID NO. 7 is selected from the list consisting of: SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29 and SEQ ID NO 30.

9. An AibRS according to any preceding claim; wherein the AibRS is configured to perform a function of aminoacylating the tRNA with Aib.

10. The AibRS of claims 7-9; wherein the tRNA is a suppressor tRNA.

11. Use of an AibRS according to any preceding claim in the preparation of a resulting polypeptide containing one or more Aib residues.

12. Use of the AibRS of claim 11; wherein the resulting polypeptide comprises SEQ ID NO 32.

13. A method of preparing a compound comprising the steps of:

i. using the AibRS of any preceding claim to prepare a resulting polypeptide comprising one or more Aib residues,

ii. derivatization of the resulting polypeptide.

14. The method of claim 13; wherein the compound is of formula 2:

chemical formula 2:

Technical Field

The present invention is in the field of translation biochemistry. The present invention relates to aminoacyl-tRNA synthetases that can be used to incorporate unnatural amino acids into polypeptides.

Incorporation by reference of sequence listing

This application is filed with a sequence listing in electronic form. The entire contents of this sequence listing are incorporated herein by reference.

Background

Recombinant expression is an efficient way to produce natural or engineered proteins and peptides (individually and collectively referred to as "polypeptides"). Most known organisms encode the same 20 naturally occurring amino acids, and thus, without further development, recombinant expression is limited to polypeptides consisting only of naturally occurring amino acids. A strategy to overcome this limitation has been developed for the in vivo site-specific incorporation of a variety of unnatural amino acids into polypeptides in both prokaryotic and eukaryotic organisms (Wang L, Schultz PG, A general approach for the generation of orthogonal tRNAs, Chem. Biol., 2001 Sep, 8(9): 883-90; Liu CC, Schultz PG, Adding new chemistry to the genetic code, Annu. Rev. Biochem., 2010, 79: 413-44). These methods utilize an aminoacyl-tRNA synthetase (RS) that aminoacylates a tRNA with a desired unnatural amino acid, which is incorporated into a growing polypeptide chain during translation in response to a selector codon. The translation components can be developed so that they cross-react with endogenous tRNAs, RSs, or amino acids in the host organism only with reduced efficiency.

2-Aminoisobutyric acid (Aib) is a non-protein amino acid of formula H2N-C(CH3)2-COOH. Aib can be incorporated into a polypeptide and impart desired properties to the polypeptide. One example of a polypeptide with desirable properties that includes Aib is semaglutide, a biologically active GLP-1 (glucagon-like peptide-1) analogue [Lau J. et al., Discovery of the once-weekly glucagon-like peptide-1 (GLP-1) analog semaglutide, J. Med. Chem., 2015; 58:7370-7380] that is marketed as an antidiabetic drug. Semaglutide is disclosed in WO 06097537.

No genetic translation component has been reported for Aib, and thus Aib-containing polypeptides are not currently recombinantly expressed.

Disclosure of Invention

The invention relates to aminoacyl-tRNA synthetases that aminoacylate tRNA's with 2-aminoisobutyric acid (Aib) to enable Aib to be incorporated into a growing polypeptide chain during translation. In other words, the invention provides means for making polypeptides having Aib translationally incorporated.

In a first aspect, the invention relates to a 2-aminoisobutyric acid-tRNA synthetase (AibRS) that aminoacylates a tRNA with 2-aminoisobutyric acid. In a second aspect, the invention relates to the use of AibRS in the preparation of a polypeptide comprising Aib. In a third aspect, the invention relates to a method of making a compound comprising an Aib-containing polypeptide, wherein the method comprises the step of using AibRS to make an Aib-containing polypeptide.

Description of the invention

Terms presented in the singular also include the plural unless otherwise indicated in the specification and the appended claims. Thus, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. For example, reference to "a polypeptide" actually includes a plurality of polypeptides.

In a first aspect, the invention relates to a 2-aminoisobutyric acid-tRNA synthetase (AibRS). The 2-aminoisobutyric acid-tRNA synthetase (AibRS) comprises the amino acid sequence of SEQ ID NO:7 or a variant thereof, wherein the variant of SEQ ID NO:7 has Gly at position 215.

In another aspect, the invention relates to the use of an AibRS of the invention in the preparation of an Aib-containing polypeptide. In a further aspect, the invention relates to the use of an AibRS of the invention in the preparation of a resulting polypeptide comprising SEQ ID NO: 32.

In another aspect, the present invention relates to a method of preparing a compound of chemical formula 2, comprising the steps of: (i) preparing the resulting polypeptide by using the AibRS of the invention, and (ii) derivatizing the resulting polypeptide.

2-Aminoisobutyric acid (Aib)

As used herein, the term "2-aminoisobutyric acid" (Aib) refers to an unnatural amino acid represented by chemical formula 1. Aib may also be referred to as "alpha-methylalanine".

Chemical formula 1: H2N-C(CH3)2-COOH

2-Aminoisobutyric acid-tRNA synthetase (AibRS)

As used herein, a "2-aminoisobutyric acid-tRNA synthetase" (AibRS) refers to an aminoacyl-tRNA synthetase (RS) that can aminoacylate a tRNA with Aib. The AibRS can be: (i) identical or substantially similar to a naturally occurring alanine-tRNA synthetase (AlaRS), (ii) derived from a naturally occurring AlaRS by natural or artificial mutagenesis, or (iii) derived by any process that takes into account the wild-type or mutant AlaRS sequences of (i) or (ii), e.g., library screening or rational design. The AibRS can be derived from the AlaRS of Pyrococcus horikoshii or from any other naturally occurring AlaRS.

As used herein, the term "translation" refers to translation operations, or portions of translation operations, that are the same as or similar to those of gene expression known in the fields of molecular biology and genetics. The result of translation is a growing polypeptide chain, also referred to as the "resulting polypeptide". An AibRS of the invention can aminoacylate a tRNA with Aib, making the tRNA available for translation, wherein the result of the translation is a resulting polypeptide that comprises one or more Aib residues. tRNAs that function in this manner are also referred to as "tRNAAib". In other words, when used in conjunction with a translation process, the AibRS can be configured to incorporate Aib into the resulting polypeptide. The function of the AibRS is not limited to any particular resulting polypeptide. The translation system is particularly useful for preparing a resulting polypeptide comprising SEQ ID NO: 32.

Aminoacylation of the tRNA with Aib by AibRS can occur in competition with other amino acids present in the host cell, e.g., a natural amino acid, such as Ala. In other words, the invention is not limited to AibRS specifically designed to aminoacylate tRNA's of the translation system with Aib; the AibRS of the invention can aminoacylate a tRNA with any amino acid. Preferably, the AibRS aminoacylates the tRNA with Aib. The efficiency of AibRS function can be determined by assaying polypeptides obtained after tRNA translation. The efficiency can be expressed as the ratio between the amount of the resulting polypeptide containing Aib and the amount of the resulting polypeptide containing another amino acid, e.g., Ala. Quantification of the resulting polypeptide can be performed using LC-MS.

The function of the AibRS of the present invention is not limited to any particular biological system. The AibRS may operate in the context of in vitro conditions and/or in vivo conditions. The AibRS are suitable for operation in the context of a host cell, wherein translation of the resulting polypeptide comprising Aib is effected by utilizing one or more components of the host cell's translational machinery. The AibRS can interact with endogenous components of a biological system that utilizes the AibRS, such as components of a host cell.

The term "aminoacylation" as used herein refers to the process by which the AibRS (or any other aminoacyl-tRNA synthetase) catalyzes the linkage of an amino acid to a tRNA. The AibRS is said to "aminoacylate" the tRNA. The terms "aminoacylation" and "loading" are used interchangeably.

In one embodiment, the AibRS comprises the amino acid sequence of SEQ ID NO. 7 or a variant thereof, wherein the variant comprises 215 Gly. In one embodiment, the variant of SEQ ID NO. 7 is a variant of formula I. In one embodiment, the variant of formula I is at least 90% identical to SEQ ID NO. 7 at the positions not designated as Xaa. In one embodiment, the variant of formula I is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to formula I at the sequence positions not designated as Xaa. In one embodiment, the variant of SEQ ID NO. 7 is at least 51%, at least 55% or at least 90% identical to SEQ ID NO. 7. In one embodiment, said variant of SEQ ID NO. 7 is at least 50%, preferably at least 60%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, most preferably 99% identical to SEQ ID NO. 7.

In one embodiment, the variant of SEQ ID NO. 7 is characterized in that: position 192 is Trp, His, Val, Ile or Leu, position 193 is Ala, Leu, Ile or Gly, position 213 is Thr, Ser, Cys or Ala, position 216 is Phe or Trp, position 217 is Met, Ile or Leu, position 249 is Thr, Ser, Val or Phe, position 360 is Asn or Ala, and position 459 is Glu or Ala. In one embodiment, the variant of SEQ ID NO. 7 comprises: [192 His; 215 Gly], [192 His; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Leu; 360 Ala; 459 Ala], [192 Val; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Ile; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Leu; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 193 Gly; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 213 Ser; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Ser; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Val; 360 Ala; 459 Ala], [192 His; 213 Cys; 215 Gly; 249 Val; 360 Ala; 459 Ala] or [192 His; 213 Ala; 215 Gly; 249 Phe; 360 Ala; 459 Ala]. In one embodiment, the variant of SEQ ID NO. 7 is selected from the list consisting of SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29 and SEQ ID NO 30.

In one embodiment, said AibRS is derived from an archaebacterium. In one embodiment, the AibRS is derived from Pyrococcus horikoshii. In one embodiment, the AibRS is configured to perform the function of aminoacylating a tRNA with Aib. In one embodiment, the AibRS is capable of aminoacylating a tRNA with Aib. In one embodiment, the AibRS is configured to perform the function of aminoacylating a tRNA with Aib in a host cell. In one embodiment, the AibRS is configured to perform the function of aminoacylating a tRNA with Aib in E. coli.

In one embodiment, the efficiency of the function of aminoacylating a tRNA with Aib is determined by analyzing the expression product, i.e., the polypeptide resulting from translation with the tRNA. In one embodiment, the efficiency is expressed as an incorporation ratio: the amount of the resulting polypeptide containing Aib relative to the amount of the resulting polypeptide containing either Ala or Aib at the position intended for Aib. In one embodiment, the incorporation ratio is determined using LC-MS and is calculated from the mass spectrum as follows:

$$\text{incorporation ratio} = \frac{[\text{peak intensity}]_{\text{Aib-containing polypeptide}}}{[\text{peak intensity}]_{\text{Aib-containing polypeptide}} + [\text{peak intensity}]_{\text{Ala-containing polypeptide}}} \times 100\%$$

In one embodiment, the incorporation ratio is at least 1%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, or at least 70%.
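
The incorporation ratio described above can be illustrated with a minimal Python sketch. The function name and the peak intensities are hypothetical and chosen only for illustration; they are not data from this disclosure, and real LC-MS processing (peak picking, charge-state handling) is outside the scope of the sketch.

```python
def incorporation_ratio(aib_peak_intensity: float, ala_peak_intensity: float) -> float:
    """Return the Aib incorporation ratio in percent.

    ratio = I_Aib / (I_Aib + I_Ala) * 100, where I_Aib and I_Ala are the
    mass-spectrometry peak intensities of the Aib-containing and the
    Ala-containing polypeptide, respectively.
    """
    if aib_peak_intensity < 0 or ala_peak_intensity < 0:
        raise ValueError("peak intensities must be non-negative")
    total = aib_peak_intensity + ala_peak_intensity
    if total == 0:
        raise ValueError("at least one peak intensity must be positive")
    return aib_peak_intensity / total * 100.0


# Hypothetical intensities (arbitrary units), for illustration only.
ratio = incorporation_ratio(3.0e6, 1.0e6)
print(f"incorporation ratio: {ratio:.1f}%")
```

With these assumed intensities, three quarters of the signal comes from the Aib-containing polypeptide, so the sketch reports a ratio of 75%.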

tRNAAib

As used herein, "tRNAAib" refers to a tRNA that can be aminoacylated with Aib by the AibRS and that, during translation, incorporates Aib into the resulting polypeptide. The tRNA can be: (i) identical or substantially similar to a naturally occurring alanine tRNA (tRNAAla), (ii) derived from a naturally occurring tRNAAla by natural or artificial mutagenesis, or (iii) derived by any process that takes into account the wild-type or mutant tRNAAla sequence of (i) or (ii), such as library screening and/or rational design. In some embodiments, the tRNA may be derived from Pyrococcus horikoshii. The tRNAAib can be a suppressor tRNA.

The tRNAAib can be present in a loaded state (i.e., aminoacylated with an amino acid) or an unloaded state (i.e., not aminoacylated with an amino acid). The tRNAAib is homologous to the AibRS of the invention and is aminoacylated with Aib. Aminoacylation of the tRNA with Aib by the AibRS can occur in competition with another amino acid present in the host cell, e.g., a natural amino acid such as Ala. In other words, the tRNAAib is not limited to tRNA molecules that the AibRS aminoacylates with Aib alone; the tRNAAib can also be aminoacylated with another amino acid by the AibRS. The tRNAAib is used to insert an amino acid into the resulting polypeptide in response to a selector codon during translation. Preferably, the amino acid is Aib.

As used herein, the term "responsive to" refers to the process by which a tRNA of the invention recognizes a selector codon and mediates the incorporation of an unnatural amino acid coupled to the tRNA into the resulting polypeptide. The tRNAAib can incorporate Aib in response to a selector codon that is a stop codon, e.g., TAG.

As used herein, the term "encoding" refers to any process that uses information in a molecule or sequence string to direct the production of a second molecule or sequence string that is different from the first molecule or sequence string. In particular, the RNA molecule may encode a polypeptide, in which case the translation process needs to take place before the resulting polypeptide can be obtained. The term "encode" when used to describe the translation process may also be extended to a triplet codon encoding an amino acid or a stop codon. In particular, the DNA molecule may encode the resulting polypeptide, in which case both the transcription and translation processes need to occur before the resulting peptide can be obtained. It is to be understood that the present invention is not limited to working in the context of polynucleotides (e.g., RNA) that only require a translation process to produce the resulting polypeptide. The invention can also work in the context of polynucleotides (e.g., DNA) that require both transcription and translation processes to produce the resulting polypeptide.

In one embodiment, the AibRS is configured to perform the function of aminoacylating a tRNA with Aib. In one embodiment, the tRNA is a suppressor tRNA. In one embodiment, the tRNA comprises one or more anticodons encoding Aib. In one embodiment, the anticodon of the tRNA encoding Aib is complementary to the stop codon. In one embodiment, the selector codon is a nonsense codon such as a stop codon, a four-base codon, a rare codon, or a codon derived from natural or non-natural base pairs. In one embodiment, the stop codon is an amber, ochre and/or opal codon. In one embodiment, the anticodon encoding Aib is CTA. In one embodiment, the tRNA is encoded by SEQ ID NO 3 or a variant thereof. In one embodiment, the variant of SEQ ID NO. 3 that encodes a tRNA comprises the G3A mutation. In one embodiment, the variant of SEQ ID NO. 3 encoding a tRNA is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% identical to SEQ ID NO. 3. In one embodiment, the variant of SEQ ID NO 3 encoding the tRNA is SEQ ID NO 4. In one embodiment, the tRNA is encoded by SEQ ID NO 4. In one embodiment, the tRNA is derived from an archaebacterium. In one embodiment, the tRNA is derived from Pyrococcus horikoshii.
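
The statement that the anticodon CTA is complementary to a stop codon can be checked with a short Python sketch. It assumes the DNA alphabet used in this text (T rather than U) and the usual convention of writing both codon and anticodon 5' to 3', so that the anticodon is the reverse complement of the codon. The helper names are hypothetical.

```python
# Watson-Crick complements in the DNA alphabet used in this text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# The three standard stop codons, written 5'->3' in the DNA alphabet.
STOP_CODONS = {"amber": "TAG", "ochre": "TAA", "opal": "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def anticodon_for(codon: str) -> str:
    """Return the anticodon (5'->3') that base-pairs with the given codon."""
    return reverse_complement(codon)


for name, codon in STOP_CODONS.items():
    print(f"{name}: codon {codon} pairs with anticodon {anticodon_for(codon)}")
```

Under these conventions the anticodon CTA pairs with the amber stop codon TAG, which is consistent with TAG being named as an example selector codon.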

Amino acids

As used herein, the term "natural amino acid" refers to the 20 standard amino acids encoded by the human standard genetic code. Natural amino acids may also be referred to as "protein amino acids". The term "unnatural amino acid" refers to amino acids that can occur in nature (but do not belong to the 20 standard amino acids encoded by the human standard genetic code) or are purely synthetic, e.g., modified amino acids and amino acid analogs. Unnatural amino acids can also be referred to as "non-protein amino acids" or "non-coding amino acids". Non-limiting examples of non-natural amino acids are Aib and the D-isomer of a natural amino acid. If the term "amino acid" is used herein without specifying whether it is natural or unnatural, it is to be construed as including both natural and unnatural amino acids. The term "any amino acid" as used herein is to be construed as including natural and unnatural amino acids.

Suppressor tRNA and selector codon

The term "suppressor tRNA" as used herein refers to a tRNA that alters the reading of an mRNA in a given translation system, thereby allowing translational readthrough of a codon (e.g., a selector codon that is a stop codon) that would otherwise result in translation termination or mis-translation (e.g., frameshifting). Typically, the suppressor tRNA allows the incorporation of an amino acid in response to a stop codon during translation of the polypeptide (a process known as "readthrough").

As used herein, the term "selector codon" refers to a codon in response to which a tRNA (e.g., a suppressor tRNA) incorporates an amino acid into the resulting polypeptide. The part of the tRNA that pairs with the selector codon is referred to as the "anticodon". Selector codons can be nonsense codons, such as stop codons, four-base codons, rare codons, codons derived from natural or unnatural base pairs, and the like. Non-limiting examples of stop codons are the amber, ochre and opal codons.
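
The effect of a suppressor tRNA on a selector codon can be sketched as a toy translation loop in Python. This is an illustration only: the coding sequence, the reduced codon table and the `suppressed` mapping are hypothetical, and the sketch ignores competition with release factors and with other amino acids such as Ala.

```python
# Reduced codon table (standard genetic code, DNA alphabet); None marks a stop codon.
CODON_TABLE = {
    "ATG": "Met", "GCT": "Ala", "GAA": "Glu", "GGT": "Gly",
    "TAA": None, "TAG": None, "TGA": None,
}


def translate(coding_seq, suppressed=None):
    """Translate a coding sequence codon by codon.

    `suppressed` maps selector codons to the residue that a suppressor tRNA
    inserts in response to them; such codons are read through instead of
    terminating translation.
    """
    suppressed = suppressed or {}
    residues = []
    for i in range(0, len(coding_seq) - 2, 3):
        codon = coding_seq[i:i + 3]
        if codon in suppressed:
            residues.append(suppressed[codon])  # readthrough of the selector codon
            continue
        residue = CODON_TABLE[codon]
        if residue is None:  # unsuppressed stop codon ends translation
            break
        residues.append(residue)
    return residues


# Hypothetical coding sequence with an in-frame amber (TAG) selector codon.
seq = "ATGGCTTAGGAAGGTTAA"
print(translate(seq))                 # terminates at TAG
print(translate(seq, {"TAG": "Aib"})) # TAG is read as Aib
```

Without the suppressor, translation stops at the amber codon; with it, the amber codon is read as Aib and translation continues to the next stop codon.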

Mutations and mutants

The term "mutation" as used herein in the context of an amino acid sequence or polypeptide refers to an amino acid that is (i) substituted with another amino acid, (ii) deleted, or (iii) added. Mutations in the context of amino acid sequences may be referred to as "amino acid changes". A particular amino acid at a particular position of an amino acid sequence (or polypeptide) can be described by listing the sequence position in question, followed by the three-letter code of the amino acid present at that position. In the context of an amino acid sequence or polypeptide, one non-limiting example of such nomenclature is "192 His and 215 Gly", which means that His is present at position 192 and Gly is present at position 215. Substitutions in an amino acid sequence (or polypeptide) can be described by listing the three-letter code of the amino acid being substituted, followed by the sequence position in question, followed by the three-letter code of the amino acid after substitution. One non-limiting example of such nomenclature is "Trp192His; Val215Gly", wherein Trp at position 192 is replaced by His and Val at position 215 is replaced by Gly. Substitutions may be provided with respect to a particular sequence described in the sequence listing (typically with respect to the wild-type tRNA synthetase), and the position of the substituted amino acid is determined based on a number that can be deduced from the particular sequence in the sequence listing. In the sequence listing, the first amino acid residue (counted from the N-terminus) (Met) of SEQ ID NO. 1 is designated as number 1; the second amino acid residue (Glu) of SEQ ID NO. 1 is designated as number 2, and so on.

As used herein in the context of a nucleotide sequence or polynucleotide, the term "mutation" refers to a nucleotide that is (i) substituted with another nucleotide, (ii) deleted, or (iii) added. Substitutions in a nucleotide sequence or polynucleotide can be described by listing the single-letter code of the nucleobase of the nucleotide being substituted, followed by the sequence position of that nucleotide, followed by the nucleobase of the nucleotide after substitution. In the context of a nucleotide sequence or polynucleotide, one non-limiting example of substitution nomenclature is "G3A", in which guanine at position 3 has been replaced by adenine. Substitutions may be provided with respect to a particular sequence described in the sequence listing (typically with respect to the wild-type tRNA), and the position of the substituted nucleotide is determined based on a number that can be deduced from the particular sequence in the sequence listing. In the sequence listing, the first nucleotide residue (counted from the 5'-terminus) (G) of SEQ ID NO. 2 is designated as number 1; the second nucleotide residue (G) of SEQ ID NO. 2 is designated as number 2, and so on.
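
The amino acid substitution nomenclature described above can be illustrated with a small Python sketch that parses descriptors such as "Trp192His; Val215Gly" and applies them to a sequence using 1-based numbering. The function names and the five-residue toy sequence are hypothetical; the toy sequence is not taken from the sequence listing.

```python
import re

# Matches e.g. "Trp192His": original residue, 1-based position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z][a-z]{2})\s*(\d+)\s*([A-Z][a-z]{2})$")


def parse_substitutions(text):
    """Parse 'Trp192His; Val215Gly' into [('Trp', 192, 'His'), ('Val', 215, 'Gly')]."""
    parsed = []
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue
        match = SUBSTITUTION.match(item)
        if match is None:
            raise ValueError(f"not a substitution descriptor: {item!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_substitutions(residues, substitutions):
    """Return a mutated copy of `residues` (a list of three-letter codes)."""
    mutated = list(residues)
    for original, position, replacement in substitutions:
        if not 1 <= position <= len(mutated):
            raise ValueError(f"position {position} is outside the sequence")
        if mutated[position - 1] != original:  # position 1 is the N-terminal residue
            raise ValueError(f"expected {original} at position {position}")
        mutated[position - 1] = replacement
    return mutated


print(parse_substitutions("Trp192His; Val215Gly"))

# Toy five-residue sequence, for illustration only.
toy = ["Met", "Glu", "Trp", "Val", "Ala"]
print(apply_substitutions(toy, parse_substitutions("Trp3His; Val4Gly")))
```

Checking the original residue before replacing it catches off-by-one numbering errors, which matter here because positions are counted from the N-terminal residue as number 1.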

The term "mutant" as used herein in the context of an amino acid sequence or polypeptide refers to a sequence having one or more mutations compared to the parent sequence of the mutant.

Variants

As used herein, the term "variant" refers to a component that may have a structural difference compared to the parent component of the variant, but retains a similar function. For example, the AibRS and its variants do not necessarily have the same amino acid sequence, but they all aminoacylate a cognate tRNA with an unnatural amino acid. A variant of an AibRS of the invention can have one or more mutations compared to the AibRS, so long as the variant still has the function of aminoacylating the tRNA with Aib. Mutations may include substitutions, deletions and additions. Mutations in variants may also be referred to as "variations".

Polynucleotides and polypeptides

As used herein, the term "polynucleotide" refers to a sequence of two or more nucleotides. The terms "polynucleotide" and "nucleotide sequence" are used interchangeably. As used herein, the term "polypeptide" refers to a sequence of two or more natural or unnatural amino acids. Thus, the term "polypeptide" includes peptides and proteins. The terms "polypeptide" and "amino acid sequence" are used interchangeably. As used herein, the term "resulting polypeptide" refers to a polypeptide that is the product of a translation process, typically involving an AibRS of the invention. The resulting polypeptide may also be referred to as a "growing polypeptide chain," particularly when the resulting polypeptide is described in the context of a translation process. The AibRS of the invention are particularly useful for preparing the resulting polypeptide comprising SEQ ID NO. 32, SEQ ID NO. 32 being the amino acid sequence of semaglutide [ Lau J. et al, Discovery of the once-weekly glucagon-like peptide-1(GLP-1) analog semaglutide, J.Med.chem., 2015; 58:7370-7380].

In one embodiment, the AibRS is used to prepare a resulting polypeptide comprising one or more Aib residues. In one embodiment, the AibRS is used to prepare a resulting polypeptide comprising one or more Aib residues, wherein the resulting polypeptide comprising one or more Aib residues is a GLP-1 analog. In one embodiment, the AibRS is used to prepare a resulting polypeptide, wherein the resulting polypeptide comprises SEQ ID NO 32. In one embodiment, the AibRS is used to prepare a resulting polypeptide, wherein the resulting polypeptide is SEQ ID NO 32.

Derivatives

As used herein in the context of a polypeptide, the term "derivative" means a chemically modified polypeptide to which one or more substituents have been covalently attached. As used herein, the term "derivatizing" refers to the process of obtaining a derivative from a polypeptide. For example, chemical formula 2 can be obtained from SEQ ID NO:32 by derivatization with the introduction of a substituent (also referred to as a "side chain") of chemical formula 3.

The AibRS of the present invention is particularly useful in a method for preparing semaglutide (chemical formula 2), which comprises the steps of: (i) preparing a resulting polypeptide comprising SEQ ID NO:32 by using the AibRS of the present invention, and (ii) derivatizing the polypeptide comprising SEQ ID NO:32 with the substituent of chemical formula 3.

Chemical formula 2 (semaglutide):

chemical formula 3 (substituent [ or side chain ] of semaglutide, wherein denotes the point of attachment to the polypeptide sequence):

In one embodiment, the present invention relates to a method of producing a compound comprising the steps of: (i) preparing a resulting polypeptide comprising one or more Aib residues using the AibRS of the invention, and (ii) derivatizing the resulting polypeptide. In a preferred embodiment, the compound is of chemical formula 2. In a preferred embodiment, the resulting polypeptide is SEQ ID NO 32. In a preferred embodiment, the resulting polypeptide is derivatized with chemical formula 3. Derivatization may be performed as described, for example, in Lau J. et al., Discovery of the once-weekly glucagon-like peptide-1 (GLP-1) analog semaglutide, J. Med. Chem., 2015; 58:7370-7380.

Homologues

As used herein, "homolog" refers to components that work together or have some aspect of specificity for each other, e.g., a tRNAAib and an AibRS, or an anticodon and a stop codon. These components may also be referred to as "complementary". In one embodiment, the AibRS is homologous to the tRNAAib.

Orthogonal

As used herein, the term "orthogonal" refers to a molecule (e.g., an AibRS and/or a tRNAAib) that functions with endogenous components of a host cell with reduced efficiency, or fails to function with them, as compared to a corresponding molecule endogenous to the host cell. In the context of tRNAs and RSs, orthogonal means that an orthogonal tRNA has a reduced ability or efficiency to function with an endogenous RS compared to an endogenous tRNA functioning with the endogenous RS, or that an orthogonal RS has a reduced ability or efficiency to function with an endogenous tRNA compared to an endogenous RS functioning with the endogenous tRNA. Orthogonal molecules lack endogenous complementary molecules that function normally in the cell. For example, an orthogonal tRNA in a cell is aminoacylated by any endogenous RS of the cell with reduced, or even zero, efficiency as compared to aminoacylation of the endogenous tRNA by the endogenous RS. In another example, an orthogonal RS aminoacylates any endogenous tRNA of a cell of interest with reduced, or even zero, efficiency as compared to aminoacylation of the endogenous tRNA by the endogenous RS. Orthogonality may be expressed as the efficiency of the AibRS function.

Translational components of the invention (e.g., AibRS and/or tRNAAib) can be derived from any organism (or combination of organisms) for use in a host translation system from any other species, provided that the translational components and the host system function in an orthogonal manner. In some embodiments, the translational component is derived from an archaeal gene (i.e., from archaebacteria) for use in a eubacterial host system. For example, the orthogonal tRNAAib can be derived from an archaebacterium, e.g., an archaeon such as Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium (e.g., Haloferax volcanii and Halobacterium species NRC-1), Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Methanosarcina mazei, Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus, Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, and the like. In some embodiments, the translational component is derived from a eukaryotic source, e.g., plants, algae, protists, fungi, yeasts, animals (e.g., mammals, insects, arthropods, etc.), and the like.

Host cell

As used herein, the term "host cell" refers to a cell in which an AibRS of the invention can perform its function. The host cell may be prokaryotic, such as bacterial and archaeal cells, or eukaryotic, such as yeast, algae, filamentous fungi, mammalian cells, plant cells, and insect cells. Microbial host cells may also be referred to as "microorganisms". Non-limiting examples of eubacteria include Escherichia coli, Thermus thermophilus, Bacillus subtilis, Bacillus stearothermophilus and Corynebacterium glutamicum. Non-limiting examples of archaea include Methanococcus jannaschii, Methanosarcina mazei, Methanobacterium thermoautotrophicum, Methanococcus maripaludis, Methanopyrus kandleri, Halobacterium (e.g., Haloferax volcanii and Halobacterium species NRC-1), Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus, Sulfolobus tokodaii, Aeropyrum pernix, Thermoplasma acidophilum, and Thermoplasma volcanium. Non-limiting examples of yeasts and filamentous fungi include Saccharomyces cerevisiae, Pichia pastoris, Pichia pfaffi, Hansenula polymorpha, Aspergillus niger and Trichoderma reesei. Non-limiting examples of mammalian cells include CHO, CHO-K1, CHO-DXB11, CHO-DG44, CHO-S, HEK293, or a derivative of any of these cells.

Host cells are typically genetically engineered (e.g., transformed, transduced or transfected) with a polynucleotide encoding an AibRS of the invention, e.g., using one or more vectors. The coding region of the AibRS of the invention and the polypeptide to be translated can be operably linked to gene expression control elements that function in the desired host cell. The vector may contain transcription and translation terminators, transcription and translation initiation sequences, and promoters useful for regulating the expression of the particular target nucleic acid. The vector may comprise a universal expression cassette containing at least one independent terminator sequence, sequences allowing replication of the cassette in eukaryotes or prokaryotes or both (e.g., shuttle vectors) and selection markers for prokaryotic and eukaryotic systems. Non-limiting examples of vectors are plasmids, bacteria, viruses, naked polynucleotides or conjugated polynucleotides. In one embodiment, the host cell is E.coli.

Consensus sequences

As used herein, the term "consensus sequence" refers to the calculated order of the most common amino acid residues found at each position in a sequence alignment. It represents the result of a multiple sequence alignment in which related sequences (e.g., archaeal AlaRS sequences) are compared to each other and similar sequence motifs are calculated. Thus, a consensus sequence is a model of the putative amino acid positions necessary for the functionality of a biologically active polypeptide (e.g., by participating in folding and/or amino acid binding). In general, amino acid sequence positions not identified as putative active sites have a high degree of freedom, as the amino acids at these positions are typically located in solvent-exposed loops, and thus a high degree of variation can be introduced at these positions while maintaining the function performed by the biologically active polypeptide. In one embodiment, the AibRS is defined as a variant of a consensus sequence. In one embodiment, the AibRS is defined as a variant of formula I. In one embodiment, the AibRS is defined as a variant of formula II. In one embodiment, the AibRS is defined as a variant of formula III.
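As an illustration of how a consensus sequence can be calculated from an alignment, the following minimal Python sketch reports the most common residue in each aligned column and writes `Xaa` where no residue is sufficiently conserved. The function name `consensus`, the `min_fraction` threshold, the use of one-letter residue codes and the toy alignment are assumptions made for this example only; the specification does not state how conserved positions were selected for formulas I to III.

```python
from collections import Counter


def consensus(aligned, min_fraction=1.0, gap="-", wildcard="Xaa"):
    """Return a per-column consensus for equal-length aligned sequences.

    A column is reported as its most common residue only when that residue
    is not a gap and occurs in at least `min_fraction` of the sequences;
    otherwise the column is reported as `wildcard`.
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    result = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        if residue != gap and count / len(aligned) >= min_fraction:
            result.append(residue)
        else:
            result.append(wildcard)
    return result


# Hypothetical toy alignment (one-letter codes, "-" marks a gap).
toy_alignment = ["MFKE-G", "MFRE-G", "LFKEAG"]
strict = consensus(toy_alignment)                      # fully conserved columns only
relaxed = consensus(toy_alignment, min_fraction=0.6)   # majority residue allowed
```

With the strict setting only columns that are identical in every sequence are kept as defined residues, which mirrors the way the formulas above mix defined residues with `Xaa` positions.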

Sequence identity

Sequence identity is the degree to which two (nucleotide or amino acid) sequences have the same residue at the same position in an alignment. Sequence identity is conveniently expressed as a percentage, i.e., if 85 amino acids in 100 aligned positions between two sequences are identical, the degree of identity is 85%. If one of the two sequences is a consensus sequence, only the conserved positions of the consensus sequence are considered in the calculation. That is, if 85 amino acids in 100 aligned and conserved positions are identical, the degree of identity is 85%, even though the sequence may be longer than 100 amino acids. For the purposes of the present invention, sequence identity between two amino acid sequences is determined by simple manual calculation and visual inspection, and/or by a standard protein or peptide alignment program, such as "align", which is based on the Needleman-Wunsch algorithm. The algorithm is described in Needleman, S.B. and Wunsch, C.D. (1970), Journal of Molecular Biology, 48: 443-. For the alignment, the default scoring matrix BLOSUM62 and the default identity matrix can be used, and the penalty for the first residue in a gap can be set to -12, or preferably -10, while the penalty for additional residues in the gap can be set to -2, or preferably -0.5.
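The identity calculation described above, including the rule that only conserved positions of a consensus sequence are counted, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the two sequences are assumed to have been aligned already (for example with a Needleman-Wunsch program), the function name `percent_identity` is hypothetical, and treating a gap in the query as a mismatch is an assumption that the text does not address.

```python
def percent_identity(query, reference, wildcard="Xaa", gap="-"):
    """Percent identity of `query` to `reference` over aligned positions.

    Both arguments are equal-length lists of residue names that have already
    been aligned. Positions where the reference is `wildcard` are skipped, so
    only the conserved positions of a consensus sequence are counted.
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for q, r in zip(query, reference):
        if r == wildcard:
            continue  # non-conserved consensus position: not counted
        compared += 1
        if q == r and q != gap:
            matches += 1

    if compared == 0:
        raise ValueError("no conserved positions to compare")
    return 100.0 * matches / compared


# The worked example from the text: 85 identical residues in 100 positions.
plain = percent_identity(["Ala"] * 85 + ["Gly"] * 15, ["Ala"] * 100)

# Hypothetical consensus fragment: only the three non-Xaa positions count.
fragment = percent_identity(
    ["Met", "Phe", "Lys", "Asp", "Ala", "Gly"],
    ["Xaa", "Phe", "Xaa", "Glu", "Xaa", "Gly"],
)
```

In the second call two of the three conserved positions match, so the residues at the `Xaa` positions of the reference have no effect on the result.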

Detailed description of the preferred embodiments

It is to be understood that the terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting.

1. A 2-aminoisobutyric acid-tRNA synthetase (AibRS) comprising the amino acid sequence of SEQ ID NO. 7 or a variant thereof, wherein the variant comprises 215 Gly.

2. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO. 7 is a variant of formula I; wherein formula I is:

Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Glu-Xaa-Gly-Xaa-Xaa-Xaa-Lys-Xaa-Cys-Xaa-Xaa-Cys-Gly-Xaa-Xaa-Phe-Trp-Thr-Xaa-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Cys-Gly-Asp-Xaa-Pro-Cys-Xaa-Xaa-Tyr-Xaa-Phe-Ile-Gly-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Glu-Xaa-Arg-Xaa-Xaa-Phe-Xaa-Xaa-Phe-Phe-Glu-Xaa-Xaa-Xaa-His-Xaa-Xaa-Xaa-Xaa-Arg-Tyr-Pro-Val-Xaa-Xaa-Arg-Trp-Arg-Asp-Asp-Val-Xaa-Leu-Val-Gly-Ala-Ser-Ile-Xaa-Asp-Phe-Gln-Pro-Trp-Val-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Pro-Pro-Ala-Asn-Pro-Leu-Xaa-Ile-Ser-Gln-Pro-Xaa-Ile-Arg-Xaa-Xaa-Asp-Xaa-Asp-Xaa-Val-Gly-Xaa-Xaa-Gly-Arg-His-Xaa-Thr-Xaa-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Tyr-Trp-Xaa-Xaa-Glu-Thr-Val-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Thr-Phe-Xaa-Glu-Xaa-Xaa-Trp-Xaa-Gly-Gly-Gly-Asn-Ala-Gly-Xaa-Xaa-Xaa-Glu-Val-Xaa-Xaa-Xaa-Gly-Xaa-Glu-Xaa-Ala-Thr-Leu-Val-Phe-Met-Xaa-Tyr-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Xaa-Xaa-Trp-Xaa-Ser-Xaa-Gly-Xaa-Pro-Thr-Xaa-Tyr-Asp-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Asp-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Glu-Xaa-Xaa-Tyr-Ala-Ile-Ala-Asp-His-Thr-Xaa-Xaa-Leu-Xaa-Phe-Met-Leu-Xaa-Asp-Gly-Val-Xaa-Pro-Ser-Asn-Xaa-Xaa-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Xaa-Ile-Arg-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Xaa-Xaa-Leu-Gly-Xaa-Xaa-Xaa-Pro-Leu-Xaa-Xaa-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Thr-Xaa-Xaa-Arg-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Lys-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Tyr-Xaa-Ser-His-Gly-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Xaa-Glu-Xaa-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Xaa-Xaa-Pro-Asp-Asn-Phe-Tyr-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa

wherein each Xaa in formula I is independently selected and is one or more amino acids, or is absent;

wherein the variant of formula I is at least 90% identical to SEQ ID NO 7 at a position not designated as Xaa.

3. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO. 7 is a variant of formula I; wherein formula I corresponds to a consensus sequence of the formula:

Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Glu-Xaa-Gly-Xaa-Xaa-Xaa-Lys-Xaa-Cys-Xaa-Xaa-Cys-Gly-Xaa-Xaa-Phe-Trp-Thr-Xaa-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Cys-Gly-Asp-Xaa-Pro-Cys-Xaa-Xaa-Tyr-Xaa-Phe-Ile-Gly-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Glu-Xaa-Arg-Xaa-Xaa-Phe-Xaa-Xaa-Phe-Phe-Glu-Xaa-Xaa-Xaa-His-Xaa-Xaa-Xaa-Xaa-Arg-Tyr-Pro-Val-Xaa-Xaa-Arg-Trp-Arg-Asp-Asp-Val-Xaa-Leu-Val-Gly-Ala-Ser-Ile-Xaa-Asp-Phe-Gln-Pro-Trp-Val-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Pro-Pro-Ala-Asn-Pro-Leu-Xaa-Ile-Ser-Gln-Pro-Xaa-Ile-Arg-Xaa-Xaa-Asp-Xaa-Asp-Xaa-Val-Gly-Xaa-Xaa-Gly-Arg-His-Xaa-Thr-Xaa-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Tyr-Trp-Xaa-Xaa-Glu-Thr-Val-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Thr-Phe-Xaa-Glu-Xaa-Xaa-Trp-Xaa-Gly-Gly-Gly-Asn-Ala-Gly-Xaa-Xaa-Xaa-Glu-Val-Xaa-Xaa-Xaa-Gly-Xaa-Glu-Xaa-Ala-Thr-Leu-Val-Phe-Met-Xaa-Tyr-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Xaa-Xaa-Trp-Xaa-Ser-Xaa-Gly-Xaa-Pro-Thr-Xaa-Tyr-Asp-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Asp-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Glu-Xaa-Xaa-Tyr-Ala-Ile-Ala-Asp-His-Thr-Xaa-Xaa-Leu-Xaa-Phe-Met-Leu-Xaa-Asp-Gly-Val-Xaa-Pro-Ser-Asn-Xaa-Xaa-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Xaa-Ile-Arg-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Xaa-Xaa-Leu-Gly-Xaa-Xaa-Xaa-Pro-Leu-Xaa-Xaa-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Thr-Xaa-Xaa-Arg-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Lys-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Tyr-Xaa-Ser-His-Gly-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Xaa-Glu-Xaa-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Xaa-Xaa-Pro-Asp-Asn-Phe-Tyr-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa

wherein each Xaa of formula I is independently selected and is one or more amino acids, or absent;

wherein the variant of formula I is at least 90% identical to SEQ ID NO 7 at a position not designated as Xaa.

4. The AibRS of any preceding embodiment; wherein the variant of formula I is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or at least 100% identical to formula I at a sequence position not designated as Xaa.

5. The AibRS of any preceding embodiment, wherein Xaa is one or more of any species of amino acid, or is absent.

6. The AibRS of any preceding embodiment, wherein Xaa of formula I is any amino acid or absent.

7. The AibRS of any preceding embodiment; wherein Xaa of formula I is any amino acid.

8. The AibRS of any preceding embodiment, wherein the variant of SEQ ID No. 7 is at least 41% identical to SEQ ID No. 7.

9. The AibRS of any preceding embodiment, wherein the variant of SEQ ID No. 7 is at least 55% identical to SEQ ID No. 7.

10. The AibRS of any preceding embodiment, wherein the variant of SEQ ID No. 7 is at least 90% identical to SEQ ID No. 7.

11. The AibRS according to any preceding embodiment, wherein the variant of SEQ ID No. 7 is at least 50%, preferably at least 60%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, most preferably 99% identical to SEQ ID No. 7.

12. The AibRS of any preceding embodiment; wherein said variant of SEQ ID NO. 7 is characterized in that:

position 192 is Trp, His, Val, Ile or Leu,

position 193 is Ala, Leu, Ile or Gly,

position 213 is Thr, Ser, Cys or Ala,

position 216 is Phe or Trp,

position 217 is Met, Ile or Leu,

position 249 is Thr, Ser, Val or Phe,

position 360 is Asn or Ala,

position 459 is Glu or Ala.

13. The AibRS of any preceding embodiment; wherein said variant of SEQ ID NO. 7 is characterized in that:

position 192 is His, Val, Ile or Leu,

position 193 is Ala, Leu, Ile or Gly,

position 213 is Thr, Ser, Cys or Ala,

position 216 is Phe or Trp,

position 217 is Met, Ile or Leu,

position 249 is Thr, Ser, Val or Phe,

position 360 is Ala,

position 459 is Ala.

14. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO. 7 comprises: [192 His; 215 Gly], [192 His; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Leu; 360 Ala; 459 Ala], [192 Val; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Ile; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Leu; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 193 Gly; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 213 Ser; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Ser; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Val; 360 Ala; 459 Ala], [192 His; 213 Cys; 215 Gly; 249 Val; 360 Ala; 459 Ala] or [192 His; 213 Ala; 215 Gly; 249 Phe; 360 Ala; 459 Ala].

15. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO. 7 comprises: [192 His; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 216 Trp; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Ile; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Ile; 217 Leu; 360 Ala; 459 Ala], [192 His; 215 Gly; 193 Leu; 217 Leu; 360 Ala; 459 Ala], [192 Val; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Ile; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 Leu; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 193 Gly; 215 Gly; 217 Ile; 360 Ala; 459 Ala], [192 His; 213 Ser; 215 Gly; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Ser; 360 Ala; 459 Ala], [192 His; 215 Gly; 249 Val; 360 Ala; 459 Ala], [192 His; 213 Cys; 215 Gly; 249 Val; 360 Ala; 459 Ala] or [192 His; 213 Ala; 215 Gly; 249 Phe; 360 Ala; 459 Ala].

16. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO 7 is selected from the list consisting of SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29 and SEQ ID NO 30.

17. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO 7 is selected from the list consisting of SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29 and SEQ ID NO 30.

18. The AibRS of any preceding embodiment; wherein the AibRS is derived from archaebacteria.

19. The AibRS of any preceding embodiment; wherein the AibRS is derived from Pyrococcus horikoshii.

20. The AibRS of any preceding embodiment; wherein the AibRS is configured to perform a function of aminoacylating the tRNA with Aib.

21. The AibRS of any preceding embodiment; wherein the AibRS can aminoacylate the tRNA with Aib.

22. The AibRS of any preceding embodiment; wherein the tRNA is a suppressor tRNA.

23. The AibRS of any preceding embodiment; wherein the tRNA comprises one or more anticodons that encode Aib.

24. The AibRS of any preceding embodiment; wherein the anticodon of the tRNA encoding Aib is complementary to a stop codon.

25. The AibRS of any preceding embodiment; wherein the anticodon encoding Aib is CTA.

26. The AibRS of any preceding embodiment; wherein the tRNA is encoded by SEQ ID NO 3 or a variant thereof.

27. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO. 3 encoding a tRNA comprises the G3A mutation.

28. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO. 3 encoding the tRNA is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 3.

29. The AibRS of any preceding embodiment; wherein the variant of SEQ ID NO 3 encoding the tRNA is SEQ ID NO 4.

30. The AibRS of any preceding embodiment; wherein the tRNA is encoded by SEQ ID NO 4.

31. The AibRS of any preceding embodiment; wherein the tRNA is derived from archaebacteria.

32. The AibRS of any preceding embodiment; wherein the tRNA is derived from Pyrococcus horikoshii (Pyrococcus horikoshii).

33. The AibRS of any preceding embodiment; wherein the AibRS is configured to perform a function of aminoacylating the tRNA with Aib in the host cell.

34. The AibRS of any preceding embodiment; wherein the AibRS is configured to perform the function of aminoacylating the tRNA with Aib in E.coli.

35. The AibRS of any preceding embodiment; wherein the efficiency of said function is determined by analyzing the expression product, i.e., the polypeptide resulting from translation with the tRNA.

36. The AibRS of any preceding embodiment; wherein the efficiency is expressed as the incorporation ratio between the amount of the resulting polypeptide containing Aib and the amount of the resulting polypeptide containing Ala or Aib at the position expected for Aib.

37. The AibRS of any preceding embodiment; wherein the incorporation ratio is determined using LC-MS and calculated based on mass spectrometry as follows:

$$\text{incorporation ratio} = \frac{[\text{peak intensity}]_{\text{Aib-containing polypeptide}}}{[\text{peak intensity}]_{\text{Aib-containing polypeptide}} + [\text{peak intensity}]_{\text{Ala-containing polypeptide}}} \times 100\%$$

38. The AibRS of any preceding embodiment; wherein the incorporation ratio is at least 1%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, or at least 70%.

39. Use of an AibRS according to any preceding embodiment in the preparation of a resulting polypeptide comprising one or more Aib residues.

40. The use of an AibRS according to any preceding embodiment, wherein the resulting polypeptide containing one or more Aib residues is a GLP-1 analog.

41. The use of an AibRS according to any preceding embodiment, wherein the resulting polypeptide comprising one or more Aib residues comprises SEQ ID NO: 32.

42. A method of preparing a compound comprising the steps of:

i. using the AibRS of any preceding embodiment to prepare a resulting polypeptide containing one or more Aib residues,

ii. derivatization of the resulting polypeptide.

43. The method according to any one of the preceding embodiments; wherein the compound is

Chemical formula 2:

Further embodiments

It is to be understood that the terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting.

1. A 2-aminoisobutyric acid-tRNA synthetase (AibRS) comprising the amino acid sequence of SEQ ID NO. 7 or a variant thereof, wherein the variant comprises 215 Gly.

2. An AibRS comprising an amino acid sequence of SEQ ID NO. 7 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 7.

3. An AibRS comprising an amino acid sequence of SEQ ID NO. 10 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 10.

4. An AibRS comprising an amino acid sequence of SEQ ID NO. 11 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 11.

5. An AibRS comprising an amino acid sequence of SEQ ID NO. 12 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 12.

6. An AibRS comprising an amino acid sequence of SEQ ID NO. 13 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 13.

7. An AibRS comprising an amino acid sequence of SEQ ID NO. 14 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 14.

8. An AibRS comprising an amino acid sequence of SEQ ID NO. 15 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 15.

9. An AibRS comprising an amino acid sequence of SEQ ID NO 16 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO 16.

10. An AibRS comprising an amino acid sequence of SEQ ID NO 17 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO 17.

11. An AibRS comprising an amino acid sequence of SEQ ID NO. 18 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 18.

12. An AibRS comprising an amino acid sequence of SEQ ID NO 19 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO 19.

13. An AibRS comprising an amino acid sequence of SEQ ID NO. 20 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 20.

14. An AibRS comprising an amino acid sequence of SEQ ID NO. 21 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 21.

15. An AibRS comprising an amino acid sequence of SEQ ID NO. 22, or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 22.

16. An AibRS comprising an amino acid sequence of SEQ ID NO. 23 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 23.

17. An AibRS comprising an amino acid sequence of SEQ ID NO. 24 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 24.

18. An AibRS comprising an amino acid sequence of SEQ ID NO. 25 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 25.

19. An AibRS comprising an amino acid sequence of SEQ ID NO. 26, or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 26.

20. An AibRS comprising an amino acid sequence of SEQ ID NO. 27 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 27.

21. An AibRS comprising an amino acid sequence of SEQ ID NO 28, or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO 28.

22. An AibRS comprising an amino acid sequence of SEQ ID NO. 29 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 29.

23. An AibRS comprising an amino acid sequence of SEQ ID NO. 30 or a variant thereof, wherein the variant is at least 41%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO. 30.

24. The AibRS of any one of embodiments 4-23, wherein the variant comprises 215 Gly.

25. The AibRS of any preceding embodiment; wherein the variant is of formula I or a variant thereof:

formula I:

Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Glu-Xaa-Gly-Xaa-Xaa-Xaa-Lys-Xaa-Cys-Xaa-Xaa-Cys-Gly-Xaa-Xaa-Phe-Trp-Thr-Xaa-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Cys-Gly-Asp-Xaa-Pro-Cys-Xaa-Xaa-Tyr-Xaa-Phe-Ile-Gly-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Glu-Xaa-Arg-Xaa-Xaa-Phe-Xaa-Xaa-Phe-Phe-Glu-Xaa-Xaa-Xaa-His-Xaa-Xaa-Xaa-Xaa-Arg-Tyr-Pro-Val-Xaa-Xaa-Arg-Trp-Arg-Asp-Asp-Val-Xaa-Leu-Val-Gly-Ala-Ser-Ile-Xaa-Asp-Phe-Gln-Pro-Trp-Val-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Pro-Pro-Ala-Asn-Pro-Leu-Xaa-Ile-Ser-Gln-Pro-Xaa-Ile-Arg-Xaa-Xaa-Asp-Xaa-Asp-Xaa-Val-Gly-Xaa-Xaa-Gly-Arg-His-Xaa-Thr-Xaa-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Tyr-Trp-Xaa-Xaa-Glu-Thr-Val-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Thr-Phe-Xaa-Glu-Xaa-Xaa-Trp-Xaa-Gly-Gly-Gly-Asn-Ala-Gly-Xaa-Xaa-Xaa-Glu-Val-Xaa-Xaa-Xaa-Gly-Xaa-Glu-Xaa-Ala-Thr-Leu-Val-Phe-Met-Xaa-Tyr-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Xaa-Xaa-Trp-Xaa-Ser-Xaa-Gly-Xaa-Pro-Thr-Xaa-Tyr-Asp-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Asp-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Glu-Xaa-Xaa-Tyr-Ala-Ile-Ala-Asp-His-Thr-Xaa-Xaa-Leu-Xaa-Phe-Met-Leu-Xaa-Asp-Gly-Val-Xaa-Pro-Ser-Asn-Xaa-Xaa-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Xaa-Ile-Arg-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Xaa-Xaa-Leu-Gly-Xaa-Xaa-Xaa-Pro-Leu-Xaa-Xaa-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Thr-Xaa-Xaa-Arg-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Lys-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Tyr-Xaa-Ser-His-Gly-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Xaa-Glu-Xaa-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Xaa-Xaa-Pro-Asp-Asn-Phe-Tyr-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa,

wherein Xaa is one or more of any species of amino acid, or is absent.

26. The AibRS of any preceding embodiment; wherein the variant of formula I is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or at least 100% identical to formula I at a sequence position not designated as Xaa.

27. The AibRS of any preceding embodiment; wherein Xaa of formula I is any amino acid or is absent.

28. The AibRS of any preceding embodiment; wherein Xaa of formula I is any amino acid.

29. The AibRS of any preceding embodiment; wherein the variant is of formula II or a variant thereof:

formula II:

Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Glu-Xaa-Gly-Xaa-Xaa-Arg-Lys-Xaa-Cys-Xaa-Xaa-Cys-Gly-Xaa-Xaa-Phe-Trp-Thr-Xaa-Asp-Pro-Asp-Arg-Glu-Thr-Cys-Gly-Asp-Xaa-Pro-Cys-Asp-Xaa-Tyr-Xaa-Phe-Ile-Gly-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Tyr-Xaa-Leu-Xaa-Glu-Xaa-Arg-Glu-Xaa-Phe-Leu-Xaa-Phe-Phe-Glu-Xaa-Xaa-Xaa-His-Xaa-Arg-Xaa-Xaa-Arg-Tyr-Pro-Val-Xaa-Xaa-Arg-Trp-Arg-Asp-Asp-Val-Xaa-Leu-Val-Gly-Ala-Ser-Ile-Xaa-Asp-Phe-Gln-Pro-Trp-Val-Xaa-Ser-Gly-Xaa-Xaa-Xaa-Pro-Pro-Ala-Asn-Pro-Leu-Xaa-Ile-Ser-Gln-Pro-Ser-Ile-Arg-Xaa-Thr-Asp-Xaa-Asp-Asn-Val-Gly-Xaa-Thr-Gly-Arg-His-Xaa-Thr-Xaa-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Tyr-Trp-Xaa-Asp-Glu-Thr-Val-Glu-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Thr-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Xaa-Glu-Xaa-Ile-Thr-Phe-Lys-Glu-Xaa-Xaa-Trp-Xaa-Gly-Gly-Gly-Asn-Ala-Gly-Pro-Xaa-Xaa-Glu-Val-Leu-Xaa-Arg-Gly-Leu-Glu-Val-Ala-Thr-Leu-Val-Phe-Met-Gln-Tyr-Lys-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Xaa-Pro-Met-Xaa-Xaa-Xaa-Xaa-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Xaa-Val-Trp-Xaa-Ser-Xaa-Gly-Thr-Pro-Thr-Ala-Tyr-Asp-Ala-Val-Xaa-Xaa-Xaa-Val-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Leu-Xaa-Glu-Asn-Ser-Xaa-Leu-Ala-Gly-Xaa-Xaa-Asp-Ile-Glu-Xaa-Xaa-Xaa-Asp-Leu-Xaa-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Gly-Ile-Xaa-Xaa-Xaa-Glu-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Glu-Xaa-Ile-Tyr-Ala-Ile-Ala-Asp-His-Thr-Xaa-Xaa-Leu-Xaa-Phe-Met-Leu-Xaa-Asp-Gly-Val-Xaa-Pro-Ser-Asn-Val-Lys-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Leu-Ile-Arg-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Xaa-Xaa-Leu-Gly-Leu-Xaa-Xaa-Pro-Leu-Xaa-Glu-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Glu-Xaa-Xaa-Xaa-Xaa-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Glu-Xaa-Xaa-Tyr-Xaa-Xaa-Thr-Leu-Xaa-Arg-Gly-Xaa-Xaa-Leu-Val-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Lys-Lys-Xaa-Gly-Xaa-Xaa-Glu-Xaa-Pro-Leu-Glu-Xaa-Leu-Xaa-Xaa-Xaa-Tyr-Xaa-Ser-His-Gly-Xaa-Xaa-Pro-Glu-Xaa-Val-Xaa-Glu-Xaa-Ala-Xaa-Xaa-Xaa-Gly-Xaa-Xaa-Val-Xaa-Xaa-Pro-Asp-Asn-Phe-Tyr-Xaa-Leu-Val-Ala-Xaa-Xaa-Xaa-Glu-Xaa-Xaa,

wherein Xaa is one or more of any species of amino acid, or is absent.

30. The AibRS of any preceding embodiment; wherein the variant of formula II is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or at least 100% identical to formula II at a sequence position not designated as Xaa.

31. The AibRS of any preceding embodiment; wherein Xaa of formula II is any amino acid or is absent.

32. The AibRS of any preceding embodiment; wherein Xaa of formula II is any amino acid.

33. The AibRS of any preceding embodiment; wherein the variant is of formula III or a variant thereof:

formula III:

Met-Xaa-Met-Asp-Met-Xaa-Thr-Arg-Met-Phe-Lys-Glu-Glu-Gly-Trp-Ile-Arg-Lys-Xaa-Cys-Lys-Xaa-Cys-Gly-Lys-Xaa-Phe-Trp-Thr-Leu-Asp-Pro-Asp-Arg-Glu-Thr-Cys-Gly-Asp-Pro-Pro-Cys-Asp-Glu-Tyr-Xaa-Phe-Ile-Gly-Lys-Pro-Gly-Ile-Pro-Lys-Lys-Tyr-Thr-Leu-Xaa-Glu-Met-Arg-Glu-Lys-Phe-Leu-Ser-Phe-Phe-Glu-Xaa-Xaa-Gly-His-Gly-Arg-Val-Lys-Arg-Tyr-Pro-Val-Leu-Pro-Arg-Trp-Arg-Asp-Asp-Val-Leu-Leu-Val-Gly-Ala-Ser-Ile-Met-Asp-Phe-Gln-Pro-Trp-Val-Ile-Ser-Gly-Glu-Ala-Asp-Pro-Pro-Ala-Asn-Pro-Leu-Thr-Ile-Ser-Gln-Pro-Ser-Ile-Arg-Phe-Thr-Asp-Ile-Asp-Asn-Val-Gly-Ile-Thr-Gly-Arg-His-Phe-Thr-Ile-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Tyr-Pro-Gly-Lys-Pro-Ile-Tyr-Trp-Met-Asp-Glu-Thr-Val-Glu-Leu-Ala-Phe-Glu-Phe-Phe-Thr-Lys-Xaa-Leu-Gly-Met-Lys-Pro-Glu-Asp-Ile-Thr-Phe-Lys-Glu-Asn-Pro-Trp-Ala-Gly-Gly-Gly-Asn-Ala-Gly-Pro-Ala-Phe-Glu-Val-Leu-Tyr-Arg-Gly-Leu-Glu-Val-Ala-Thr-Leu-Val-Phe-Met-Gln-Tyr-Lys-Xaa-Ala-Pro-Xaa-Xaa-Ala-Xaa-Xaa-Xaa-Gln-Val-Val-Xaa-Ile-Lys-Gly-Asp-Xaa-Tyr-Val-Pro-Met-Xaa-Thr-Xaa-Val-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Leu-Val-Trp-Met-Ser-Gln-Gly-Thr-Pro-Thr-Ala-Tyr-Asp-Ala-Val-Leu-Gly-Tyr-Val-Val-Glu-Pro-Leu-Lys-Xaa-Met-Ala-Gly-Xaa-Glu-Lys-Ile-Asp-Xaa-Xaa-Ile-Leu-Met-Glu-Asn-Ser-Arg-Leu-Ala-Gly-Met-Phe-Asp-Ile-Glu-Asp-Met-Gly-Asp-Leu-Arg-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Arg-Val-Gly-Ile-Ser-Val-Glu-Glu-Leu-Glu-Xaa-Xaa-Xaa-Arg-Pro-Tyr-Glu-Leu-Ile-Tyr-Ala-Ile-Ala-Asp-His-Thr-Lys-Ala-Leu-Thr-Phe-Met-Leu-Ala-Asp-Gly-Val-Ile-Pro-Ser-Asn-Val-Lys-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Leu-Ile-Arg-Lys-Ser-Ile-Arg-His-Leu-Arg-Glu-Leu-Gly-Leu-Glu-Xaa-Pro-Leu-Ser-Glu-Ile-Val-Ala-Met-His-Ile-Lys-Glu-Leu-Xaa-Xaa-Thr-Phe-Pro-Glu-Phe-Lys-Glu-Met-Glu-Asp-Val-Ile-Leu-Asp-Ile-Xaa-Xaa-Val-Glu-Glu-Lys-Arg-Tyr-Xaa-Glu-Thr-Leu-Xaa-Arg-Gly-Ser-Xaa-Leu-Val-Xaa-Arg-Glu-Ile-Xaa-Lys-Leu-Lys-Lys-Xaa-Gly-Xaa-Xaa-Glu-Xaa-Pro-Leu-Glu-Lys-Leu-Ile-Leu-Phe-Tyr-Glu-Ser-His-Gly-Leu-Thr-Pro-Glu-Ile-Val-Xaa-Glu-Ile-Ala-Glu-Lys-Glu-Gly-Xaa-Lys-Val-Xaa-Ile-Pro-Asp-Asn-Phe-Tyr-Ser-Leu-Val-Ala-Lys-Xaa-Ala-Glu-Xaa-Xaa,

wherein Xaa is one or more amino acids of any kind, or is absent.

34. The AibRS of any preceding embodiment; wherein the variant of formula III is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or at least 100% identical to formula III at a sequence position not designated as Xaa.

35. The AibRS of any preceding embodiment; wherein Xaa of formula III is any amino acid or absent.

36. The AibRS of any preceding embodiment; wherein Xaa of formula III is any amino acid.

37. The AibRS of any preceding embodiment; wherein the variant comprises at least 197 amino acids selected from the list consisting of: 10Phe, 12Glu, 14Gly, 18Lys, 20Cys, 23Cys, 24Gly, 27Phe, 28Trp, 29Thr, 34Arg, 37Cys, 38Gly, 39Asp, 41Pro, 42Cys, 45Tyr, 47Phe, 48Ile, 49Gly, 51Pro, 59Leu, 61Glu, 63Arg, 66Phe, 69Phe, 70Phe, 71Glu, 75His, 80Arg, 81Tyr, 82Pro, 83Val, 86Arg, 87Trp, 88Arg, 89Asp, 90Asp, 91Val, 93Leu, 94Val, 95Gly, 96Ala, 97Ser, 98Ile, 100Asp, 101Phe, 102Gln, 103Pro, 104Trp, 105Val, 108Gly, 112Pro, 113Pro, 114Ala, 115Asn, 116Pro, Leu 117, Ile, 120Ser, 121Gln, 122Pro, 124Ile, 125Arg, 128Asp, 132, 133Gly, 137 Gly, 136 His, 147, 150 Glu, 150 Glu, 148, 150, 146, 150, 146, 150, Ala, 150, Ala, 150, Phe, 150, Ala, 150, Phe, 150, III, 150, III, 164Val, 170Phe, 182Ile, 183Thr, 184Phe, 186Glu, 189Trp, 191Gly, 192Gly, 193Gly, 194Asn, 195Ala, 196Gly, 200Glu, 201Val, 205Gly, 207Glu, 209Ala, 210Thr, 211Leu, 212Val, 213Phe, 214Met, 216Tyr, 238Pro, 244Val, 245Asp, 246Thr, 247Gly, 248Tyr, 249Gly, 250Leu, 251Glu, Arg, 255Trp, 257Ser, 259Gly, 261Pro, 262Thr, 264Tyr, 265Asp, 266, 275Leu, 279Ala, 280Gly, 289Ile, 290Leu, 297Ala, 298Gly, 301Asp, 308Leu, 311Leu, 312Arg, 315Val, 316Ala, 325Glu, 326Leu, 332Pro, 334Glu, 337Tyr, 338Ala, 339Ile, 340Ala, Asp, 342 Ala, 341, His, 343Thr, Leu, 350, 358, 350 Arg, 354, Asp, 364, Asp, 363, 356, Ty, 363, 361, Ty, 361, 363, Ty, 356, Ty 362, Ty, 361, Ty, 361, 362, 361, 362, 361, 150, 361, 150, 361, 150, 361, 150, 379Leu, 380Gly, 384Pro, 385Leu, 388Ile, 401Pro, 402Glu, 405Glu, 417Glu, 418Glu, 424Thr, 427Arg, 428Gly, 440Lys, 443Gly, 448Pro, 452Leu, 456Tyr, 458Ser, 459His, 460Gly, 463Pro, 464Glu, 468Glu, 470Ala, 477Val, 480Pro, 481Asp, 482Asn, 483Phe, 484Tyr, 487Val and 488 Ala.

38. The AibRS of any preceding embodiment; wherein the variant comprises at least 264 amino acids selected from the list consisting of: 10Phe, 12Glu, 14Gly, 17Arg, 18Lys, 20Cys, 23Cys, 24Gly, 27Phe, 28Trp, 29Thr, 31Asp, 32Pro, 33Asp, 34Arg, 35Glu, 36Thr, 37Cys, 38Gly, 39Asp, 41Pro, 42Cys, 43Asp, 45Tyr, 47Phe, 48Ile, 49Gly, 51Pro, 57Tyr, 59Leu, 61Glu, 63Arg, 64Glu, 66Phe, 67Leu, 69Phe, 70Phe, 71Glu, 75His, 77Arg, 80Arg, 81Tyr, 82Pro, Val 83, 86Arg, 87Trp, 88Arg, 89Asp, 90Asp, 91Val, 93Leu, 94Val, 95Gly, 96Ala, 97Ser, 98Ile, 100Asp, 101Phe, 102Gln, 103Pro, 104Trp, 105, 107Ser, 108Gly, 112Pro, 113Pro, 114, Ala 115, 116Pro, Leu 117, 119, 120 Ile, 120, 122, 123, 122, 123, 122, 123, 122, 123, 122, 123, 122, 123, 122, 23, Ser, 23, Ser, 23, Ser, III, 138His, 140Thr, 142Phe, 143Glu, 144Met, 145Met, 146Ala, 147His, 148His, 149Ala, 150Phe, 151Asn, 153Pro, 158Tyr, 159Trp, 161Asp, 162Glu, 163Thr, 164Val, 165Glu, 170Phe, 172Thr, 175Leu, 180Glu, 182Ile, 183Thr, 184Phe, 185Lys, 186Glu, 189Trp, 191, 192Gly, 193Gly, 194Asn, 195Ala, 196Gly, 197Pro, 200Glu, 201Val, 202Leu, 204Arg, 205Gly, 206Leu, 207Glu, 208Val, 209Ala, 210Thr, 211Leu, 212Val, 213Phe, 214Met, 215 n, 216Tyr, 217Lys, 233, Pro, 239Met, 244Val, 245Asp, Thr, 247, Gly, 249Gly, 250Leu, 251Glu, 271 Glu, 252, 254 Ser, 264 p 255, 257 Lys, 261, 260, Gly, 279, Gly, 293 Val, 260, 293 Val, 150, 293 Val, 265, 150, and 275, 294Ser, 296Leu, 297Ala, 298Gly, 301Asp, 302Ile, 303Glu, 307Asp, 308Leu, 311Leu, 312Arg, 315Val, 316Ala, 320Gly, 321Ile, 325Glu, 326Leu, 332Pro, 334Glu, 336Ile, 337Tyr, 338Ala, 339Ile, 340Ala, 341Asp, 342His, 343Thr, 346Leu, 348Phe, 349Met, 350Leu, 352Asp, 353Gly, 354Val, 356Pro, 357Ser, 358Asn, 359Val, 360Lys, 361Ala, 362Gly, 363, 364Leu, 365Ala, 366Arg, 367Leu, 368Leu, 369Ile, 370Arg, 374Arg, 379Leu, 380Gly, Leu, 384Pro, 385Leu, 387 Lys, Il388E, 396, 401 Leu, Glu 402, 405Glu, 410 e, 417Glu, 418Glu, Gly, Lys, 418Glu, Gly, Tyr, 425, Thr, 448, Gly, Ser 460Gly, 463, Ty 449, Gly, 459, Lys, Gly 449, 443, Gly 449, 443, 
Gly 449, 443, Gly 449, 443, Gly, 466Val, 468Glu, 470Ala, 474Gly, 477Val, 480Pro, 481Asp, 482Asn, 483Phe, 484Tyr, 486Leu, 487Val, 488Ala and 492 Glu.

39. The AibRS of any preceding embodiment; wherein the variant comprises at least 434 amino acids selected from the list consisting of: 1Met, 3Met, 4Asp, 5Met, 7Thr, 8Arg, 9Met, 10Phe, 11Lys, 12Glu, 13Glu, 14Gly, 15Trp, 16Ile, 17Arg, 18Lys, 20Cys, 21Lys, 23Cys, 24Gly, 25Lys, 27Phe, 28Trp, 29Thr, 30Leu, 31Asp, 32Pro, 33Asp, 34Arg, 35Glu, 36Thr, 37Cys, 38Gly, 39Asp, 40Pro, 41Pro, 42Cys, 43Asp, 44Glu, 45Tyr, 47Phe, 48Ile, 49Gly, 50Lys, 51Pro, 52Gly, 53Ile, 54Pro, 55Lys, 56Lys, 57Tyr, 58Thr, 59Leu, 61Glu, 62Met, 63Arg, 64Glu, 65, 66Phe, 67Leu, 68Ser 69Phe, 70Phe, 71Glu, 74Gly, 75 Gly, His 77, 76, 80Arg, 80 Lys, 80 Val, 81, 80 Val, 85, 95, 96Ala, 97Ser, 98Ile, 99Met, 100Asp, 101Phe, 102Gln, 103Pro, 104Trp, 105Val, 106Ile, 107Ser, 108Gly, 109Glu, 110Ala, 111Asp, 112Pro, 113Pro, 114Ala, 115Asn, 116Pro, 117Leu, 118Thr, 119Ile, 120Ser, 121Gln, 122Pro, 123Ser, 124Ile, 125Arg, 126Phe, 127Thr, 128Asp, 129Ile, 130Asp, 131Asn, 132Val, 133Gly, 134Ile, 135Thr, 136Gly, 137Arg, 138His, 139Phe, 140Thr, 141Ile, 142Phe, 143Glu, 144Met, 145Met, 146Ala, 147His, 148His, 149Ala, 150Phe, 151Asn, 152Tyr, 153Pro, 154Gly 157 Lys, 156Pro Ile, 158Tyr, Trp, 160, Asp, 161, 167, Thr, 167, Lys, 150Phe, 175 Glu, 175, 150, 175, 150, 175, 150, III, 150, III, IV, 183Thr, 184Phe, 185Lys, 186Glu, 187Asn, 188Pro, 189Trp, 190Ala, 191Gly, 192Gly, 193Gly, 194Asn, 195Ala, 196Gly, 197Pro, 198Ala, 199Phe, 200Glu, 201Val, 202Leu, 203Tyr, 204Arg, 205Gly, 206Leu, 207Glu, 208Val, 209Ala, 210Thr, 211Leu, 212Val, 213Phe, 214Met, 215Gln, 216Tyr, 217Lys, 219Ala, 220Pro, 223Ala, 227Gln, 228Val, 229Val, 231Ile, 232Lys, 233Gly, 234Asp, Tyr 236 Val, 237Val, 238Pro 239, Met 241, Thr, 243Val, 244, 245Asp, 246Thr, 247Gly, 248Tyr, 249Gly, 250Leu, 251Glu, 252Arg, 253Leu, Leu 254, 255Trp, 257 Met, Ser, 258Gln, 259, Gly 271, Thr 262, 263, Thr, 275 Val, 266, 150 Val, 266, 150 Val, 150, 85, 150, III, 150, III, 150, III, 150, III, 150, III, 150, III, 150, III, 35, III, 35, III, IV, III, IV, III, IV, III, 280Gly, 282Glu, 283Lys, 
284Ile, 285Asp, 288Ile, 289Leu, 290Met, 291Glu, 292Asn, 293Ser, 294Arg, 295Leu, 296Ala, 297Gly, 298Met, 299Phe, 300Asp, 301Ile, 302Glu, 303Asp, 304Met, 305Gly, 306Asp, 307Leu, 308Arg, 310Leu, 311Arg, 314Val, 315Ala, 317Arg, 318Val, 319Gly, 320Ile, 321Ser, 322Val, 323Glu, 324Glu, 325Leu, 326Glu, 330Arg 369, 331, 332Tyr, Glu, Leu 334, Ile, 336Tyr, 337Ala, 338Ile, 339Ala, 340Asp, 341, 342Thr, 343Lys, 344Ala, Leu 345, 346Thr, 347Phe, 348, 349Leu, 350Ala, 351Asp, 352, Gly 353, 354, 355Pro Val, 356, Asp, 375 Asn, 358, 375, Lys, 375, Tyr, 364, Lys, 368, Tyr 366, Tyr, 376, 377Glu, 378Leu, 379Gly, 380Leu, 381Glu, 383Pro, 384Leu, 385Ser, 386Glu, 387Ile, 388Val, 389Ala, 390Met, 391His, 392Ile, 393Lys, 394Glu, 395Leu, Thr398, 399Phe, 400Pro, 401Glu, 402Phe, 403Lys, 404Glu, 405Met, 406Glu, 407Asp, 408Val, 409Ile, 410Leu, 411Asp, 412Ile, 415Val, 416Glu, 417Glu, 473 Lys, 419Arg, 420Tyr, 422Glu, 423Thr, Leu 424, 426Arg, 427Gly, Ser 428, 430Leu, 431Val, 433, 434Glu, 435Ile, 437Lys, 440 Leu 439Lys, 440Lys, 442Gly, Glu, Pro 447, 448Leu, 449Glu, 450Lys, Leu 452 Lys, Leu 453, 454Phe, 455Tyr, 456, His 460Gly, 463, 459, 467, 440, Lys, Gly, Ty, 475, Ty, 467, 475, Ty, 440, Ty, 475, Ty, 440, Ty, 475, Ty, 440, Ty, 475, Ty, 475, Ty, 475, Ty, 475, Ty, 440, Ty, 440, Ty, 430, Ty, 475, 440, 430, 475, Ty, 430, Ty, 440, Ty, 475, Ty, 430, Ty, 430, 440, 430, Ty, 440, Ty, 430, Ty, 430, Ty, 440, Ty, 479Pro, 480Asp, 481Asn, 482Phe, 483Tyr, 484Ser, 485Leu, 486Val, 487Ala, 488Lys, 490Ala and 491 Glu.

40. The AibRS of any preceding embodiment; wherein the variant is further characterized by:

position 192 is Trp, His, Val, Ile or Leu,

position 193 is Ala, Leu, Ile or Gly,

position 213 is Thr, Ser, Cys or Ala,

position 216 is Phe or Trp,

position 217 is Met, Ile or Leu,

position 249 is Thr, Ser, Val or Phe,

position 360 is Asn or Ala,

position 459 is Glu or Ala.

41. The AibRS of any preceding embodiment; wherein the variant is further characterized by:

position 192 is His, Val, Ile or Leu,

position 193 is Ala, Leu, Ile or Gly,

position 213 is Thr, Ser, Cys or Ala,

position 216 is Phe or Trp,

position 217 is Met, Ile or Leu,

position 249 is Thr, Ser, Val or Phe,

position 360 is Ala,

position 459 is Ala.

42. The AibRS of any preceding embodiment; wherein the variant further comprises one or more amino acids selected from the list consisting of: 192Trp, 192His, 192Val, 192Ile, 192Leu, 193Ala, 193Leu, 193Ile, 193Gly, 213Thr, 213Ser, 213Cys, 213Ala, 216Phe, 216Trp, 217Met, 217Ile, 217Leu, 249Thr, 249Ser, 249Val, 249Phe, 360Asn, 360Ala, 459Glu, and 459Ala.

43. The AibRS of any preceding embodiment; wherein the variant further comprises two or more amino acids selected from the list consisting of: 192Trp, 192His, 192Val, 192Ile, 192Leu, 193Ala, 193Leu, 193Ile, 193Gly, 213Thr, 213Ser, 213Cys, 213Ala, 216Phe, 216Trp, 217Met, 217Ile, 217Leu, 249Thr, 249Ser, 249Val, 249Phe, 360Asn, 360Ala, 459Glu, and 459Ala.

44. The AibRS of any preceding embodiment; wherein the variant further comprises three or more amino acids selected from the list consisting of: 192Trp, 192His, 192Val, 192Ile, 192Leu, 193Ala, 193Leu, 193Ile, 193Gly, 213Thr, 213Ser, 213Cys, 213Ala, 216Phe, 216Trp, 217Met, 217Ile, 217Leu, 249Thr, 249Ser, 249Val, 249Phe, 360Asn, 360Ala, 459Glu, and 459Ala.

45. The AibRS of any preceding embodiment; wherein the variant further comprises one or more amino acids selected from the list consisting of: 192His, 192Val, 192Ile, 192Leu, 193Ile, 193Gly, 213Ser, 213Cys, 213Ala, 216Trp, 217Ile, 217Leu, 249Ser, 249Val, 249Phe, 360Ala, and 459Ala.

46. The AibRS of any preceding embodiment; wherein the variant further comprises two or more amino acids selected from the list consisting of: 192His, 192Val, 192Ile, 192Leu, 193Ile, 193Gly, 213Ser, 213Cys, 213Ala, 216Trp, 217Ile, 217Leu, 249Ser, 249Val, 249Phe, 360Ala, and 459Ala.

47. The AibRS of any preceding embodiment; wherein the variant further comprises three or more amino acids selected from the list consisting of: 192His, 192Val, 192Ile, 192Leu, 193Ile, 193Gly, 213Ser, 213Cys, 213Ala, 216Trp, 217Ile, 217Leu, 249Ser, 249Val, 249Phe, 360Ala, and 459Ala.

48. The AibRS of any preceding embodiment; wherein the variant is selected from the list consisting of: SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29 and SEQ ID NO 30.

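49. The AibRS of any preceding embodiment; wherein the variant is selected from the list consisting of: SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29 and SEQ ID NO 30.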

50. The AibRS of any preceding embodiment; wherein the AibRS is derived from archaebacteria.

51. The AibRS of any preceding embodiment; wherein the AibRS is derived from Pyrococcus horikoshii.

52. The AibRS of any preceding embodiment; wherein the AibRS is configured to perform a function of aminoacylating the tRNA with Aib.

53. The AibRS of any preceding embodiment; wherein the tRNA is a suppressor tRNA.

54. The AibRS of any preceding embodiment; wherein the tRNA comprises one or more anticodons that encode Aib.

55. The AibRS of any preceding embodiment; wherein the anticodon of the tRNA encoding Aib is complementary to the stop codon.

56. The AibRS of any preceding embodiment; wherein the anti-codon encoding Aib is CTA.

57. The AibRS of any preceding embodiment; wherein the tRNA is encoded by SEQ ID NO 3 or a variant thereof.

58. The AibRS of any preceding embodiment; wherein the variant encoding tRNA comprises a G3A mutation.

59. The AibRS of any preceding embodiment; wherein the variant encoding a tRNA is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or at least 100% identical to SEQ ID NO 3.

60. The AibRS of any preceding embodiment; wherein the variant encoding tRNA is SEQ ID NO 4.

61. The AibRS of any preceding embodiment; wherein the tRNA is encoded by SEQ ID NO 4.

62. The AibRS of any preceding embodiment; wherein the tRNA is derived from archaebacteria.

63. The AibRS of any preceding embodiment; wherein the tRNA is derived from Pyrococcus horikoshii.

64. The AibRS of any preceding embodiment; wherein the AibRS is configured to perform a function of aminoacylating the tRNA with Aib in the host cell.

65. The AibRS of any preceding embodiment; wherein the AibRS is configured to perform the function of aminoacylating the tRNA with Aib in E. coli.

66. The AibRS of any preceding embodiment; wherein the efficiency of said function is determined by analyzing the expression product, i.e., the polypeptide resulting from translation using the tRNA.

67. The AibRS of any preceding embodiment; wherein the efficiency is expressed as the incorporation ratio between the amount of the resulting polypeptide containing Aib and the amount of the resulting polypeptide containing Ala or Aib at the position expected for Aib.

68. The AibRS of any preceding embodiment; wherein the incorporation ratio is determined using LC-MS and calculated based on mass spectrometry as follows: incorporation ratio = [peak intensity](Aib-containing polypeptide) / ([peak intensity](Aib-containing polypeptide) + [peak intensity](Ala-containing polypeptide)) × 100%.

69. The AibRS of any preceding embodiment; wherein the incorporation ratio is at least 1%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, or at least 70%.

70. Use of an AibRS according to any preceding embodiment in the preparation of a resulting polypeptide comprising one or more Aib residues.

71. The use of an AibRS according to any preceding embodiment, wherein the resulting polypeptide containing one or more Aib residues is a GLP-1 analog.

72. The use of an AibRS according to any preceding embodiment, wherein the resulting polypeptide comprising one or more Aib residues comprises SEQ ID NO: 32.

73. A method of preparing a compound comprising the steps of:

a. preparing the resulting polypeptide containing one or more Aib residues using the AibRS of any of the preceding embodiments; and

b. derivatizing the resulting polypeptide.

74. The method according to any one of the preceding embodiments, wherein the resulting polypeptide comprises SEQ ID NO: 32.

75. The method according to any one of the preceding embodiments, wherein the compound is of formula 2:

76. The method according to any one of the preceding embodiments; wherein the resulting polypeptide comprises SEQ ID NO 32; and wherein the compound is formula 2:

chemical formula 2

77. The method according to any one of the preceding embodiments; wherein the derivatization introduces a substituent of chemical formula 3:

chemical formula 3

Examples

General procedure

SDS-PAGE

Expression samples were analyzed by SDS-PAGE on 4-12% NuPAGE™ Bis-Tris gels, as is standard in the art. Gels and samples were prepared according to the supplier's protocol (Thermo Fisher).

Plasmid construction for expression of tRNAAib, AibRS, and target polypeptide (i.e., the resulting polypeptide)

tRNAAib, the AibRS and the target polypeptide (i.e., the resulting polypeptide) are expressed from two separate plasmids or from a single plasmid. In the case of two separate plasmids, each plasmid carries its own selectable marker, i.e., the amp and kan resistance markers. In the case of a single plasmid, a single marker, i.e., amp, is used. tRNAAib, the AibRS and the target polypeptide are controlled by separate promoter and terminator sequences: the T7 promoter and terminator control expression of the target polypeptide, the lac promoter and terminator control expression of the AibRS, and the lpp promoter and terminator control expression of tRNAAib. Preferably, tRNAAib is expressed from multiple copies of the tDNA, i.e., as two separate 3x concatemers, where expression of the two concatemers is controlled by the lpp promoter and terminator and by the tet promoter and terminator, respectively. The individual tRNAAib genes in the concatemers are separated by a ValU-ValX operon linker and by IleT and AlaT operon linkers, so that they can be processed by RNases to release individual tRNA molecules.

Transformation of E.coli

Transformation of E. coli was performed according to Sambrook et al. (1989) [Sambrook J, Fritsch EF, Maniatis T.; Molecular Cloning, A Laboratory Manual, 2nd edition; Cold Spring Harbor Laboratory Press, New York; 1989] or by electroporation in 2-mm cuvettes using a Bio-Rad Gene Pulser set at 25 μF, 200 ohm and 2.5 kV as described in Dower et al. (1988) [Dower, W.J., Miller, J.F., and Ragsdale, C.W. (1988) Nucleic Acids Res. 16, 6127-6145]. Transformed cells were selected on LB medium supplemented with the appropriate selective antibiotics, i.e., ampicillin and/or kanamycin.
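The electroporation settings quoted above can be turned into the two quantities usually used to describe a pulse. The following is a minimal sketch, assuming an ideal exponential-decay (RC) discharge in which the 200 ohm setting dominates the circuit resistance; the function names are hypothetical and the values are nominal, not measured.

```python
# Electroporation settings quoted in the text.
CAPACITANCE_FARAD = 25e-6   # 25 microfarad
RESISTANCE_OHM = 200.0      # 200 ohm
VOLTAGE_VOLT = 2500.0       # 2.5 kV
GAP_CM = 0.2                # 2-mm cuvette


def time_constant_ms(capacitance_farad, resistance_ohm):
    """Time constant (tau = R * C) of an ideal RC discharge, in milliseconds."""
    return capacitance_farad * resistance_ohm * 1e3


def field_strength_kv_per_cm(voltage_volt, gap_cm):
    """Nominal field strength across the cuvette gap, in kV/cm."""
    return voltage_volt / gap_cm / 1e3


tau_ms = time_constant_ms(CAPACITANCE_FARAD, RESISTANCE_OHM)
field = field_strength_kv_per_cm(VOLTAGE_VOLT, GAP_CM)
print(f"nominal time constant: {tau_ms:.1f} ms")
print(f"nominal field strength: {field:.1f} kV/cm")
```

In practice the observed time constant also depends on the conductivity of the cell suspension, so these figures are only an upper-level sanity check.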

Cultivation of Escherichia coli

E. coli TKO cells transformed with the test plasmid (WO/2010/052335) were taken from frozen stock or directly from fresh transformations on LB plates (with the appropriate antibiotics). The cells were inoculated into a 50 ml spin bioreactor P50 containing 8 ml of LB medium plus antibiotics and grown at 37 ℃ with shaking at 220 rpm for 5 hours. Cells from the preculture were diluted into 20 ml of defined minimal medium (M9) containing the appropriate antibiotic in 125 ml disposable Erlenmeyer flasks. Aib was added to a concentration of 10 mM. Cells were cultured at 37 ℃ to an OD600 of 1.0. Target protein expression was induced by adding 1 mM IPTG, followed by further incubation at 30 ℃ or 37 ℃ for at least 4 hours and up to overnight. Inclusion body fractions were isolated by sonication and centrifugation as described in the next section.

Sample preparation and LC-MS methods

E. coli cell suspensions corresponding to 10 OD were centrifuged at 13000 x g for 5 minutes. The resulting pellet was resuspended in 1 ml of 10 mM potassium phosphate buffer, pH 5.0, and sonicated at 30% amplitude, 5'/5' on/off, for 2 minutes. After a second centrifugation at 13000 x g for 5 minutes, the pellet fraction was dissolved in 8 M urea/100 mM DTT/50 mM CAPS buffer, pH 12, and shaken at 2000 rpm for 1 hour at room temperature. The samples were filtered through a 0.2 μm UPLC filter and analyzed by LC-MS using the following settings:

tRNA synthetases and tRNAs suitable for incorporating 2-aminoisobutyric acid into polypeptides during translation

Methods have previously been reported that allow the systematic addition of unnatural amino acids to the genetic code of E. coli, yeast and mammalian cells [Wals K. and Ovaa H.; current and future applications in the design of therapeutic proteins; Front. Chem. Apr 1; 2:15 (2014)], [Wang Q. and Wang L.; Genetic incorporation of unnatural amino acids into proteins in yeast; Methods Mol. Biol.; 794:199-213 (2012)] and [Schmied WH et al.; Efficient multisite unnatural amino acid incorporation in mammalian cells via optimized pyrrolysyl tRNA synthetase/tRNA expression and engineered eRF1; J. Am. Chem. Soc. Nov 5; 136(44):15577-83 (2014)]. These methods are based on orthogonal tRNA/RS pairs selected from distantly related organisms, which enable a cell to incorporate a given amino acid in response to a unique codon with no or reduced cross-reactivity with the endogenous host tRNAs, RSs, or amino acids.

To specifically incorporate 2-aminoisobutyric acid (Aib) into a polypeptide, an RS capable of charging a tRNA with Aib can be developed and used together with the rest of the translation machinery, such as the endogenous components of a host cell (e.g., E. coli). An RS capable of charging a tRNA with Aib is referred to below as 2-aminoisobutyric acid-tRNA synthetase (AibRS). The tRNA molecule charged with Aib is referred to below as tRNAAib. AibRS and tRNAAib are hereinafter collectively referred to as the AibRS/tRNAAib pair.

AibRS and tRNAAib can conveniently be derived from AlaRS and tRNAAla, respectively, because Ala is structurally similar to Aib. In theory, the AlaRS and tRNAAla can originate from any organism. In the present case, AibRS and tRNAAib were derived from the wild-type AlaRS and wild-type tRNAAla of the archaebacterium Pyrococcus horikoshii (ph), referred to as phAlaRS (wt) (SEQ ID NO:1) and phtRNAAla (wt) (SEQ ID NO:2), respectively. tRNAAib optimization and AibRS development were performed using E. coli as a model system, although in theory any host cell could be used for this purpose.

Optimization of tRNAAib

Example 1

In principle, tRNAAib can be any tRNA that is cognate to the AibRS (e.g., a tRNAAla). However, to increase the efficiency of Aib incorporation during translation, the tRNA can be optimized so that the formation of undesired translation products, e.g., polypeptides containing Ala instead of Aib, is reduced.

To avoid the undesired incorporation of Aib at all Ala-encoding positions, the anticodon of phtRNAAla (wt) (SEQ ID NO:2) was changed to CTA, the reverse complement of the amber stop codon TAG, resulting in the suppressor phtRNAAla (SEQ ID NO:3). In principle, any stop codon can be used for this purpose.
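The relationship between a stop codon and the matching suppressor anticodon, both written as DNA in the 5' to 3' direction, is a reverse complement. The sketch below is illustrative only; the function name and the dictionary of stop codons are not part of the original disclosure.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


# The three standard stop codons, written as DNA.
STOP_CODONS = {"amber": "TAG", "ochre": "TAA", "opal": "TGA"}

for name, codon in STOP_CODONS.items():
    print(f"{name}: codon {codon} -> anticodon {reverse_complement(codon)}")
```

For the amber codon TAG this gives CTA, the anticodon used in the suppressor phtRNAAla.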

The suppressor phtRNAAla (SEQ ID NO:3) is not orthogonal to the endogenous E. coli translation system, which means that it can be recognized by the endogenous AlaRS of E. coli (ecaAlaRS) and aminoacylated with Ala instead of Aib. To favor aminoacylation with Aib, the suppressor phtRNAAla (SEQ ID NO:3) can be mutated to make it orthogonal to ecaAlaRS while maintaining its compatibility with the AibRS. While such mutations may improve efficiency, they are not in principle necessary for the invention to work. Alignment of sequences from the public domain (i.e., GenBank or EMBL) shows that the G3 and T72 DNA nucleotides of phtRNAAla (wt) (SEQ ID NO:2), which form a G-U pair in the secondary structure, are conserved among the tRNAAla of different species. In addition, it has been shown for the tRNAAla of Aquifex aeolicus that the G3A mutation prevents the tRNAAla from being recognized by any AlaRS [MA Swairjo et al.; Alanyl-tRNA Synthetase Crystal Structure and Design for Acceptor-Stem Recognition; Molecular Cell, Vol. 13, 829-841 (2004)]. Inspired by the observations in Aquifex aeolicus, the G3A mutation was introduced into the suppressor phtRNAAla (SEQ ID NO:3), thereby producing tRNAAib (SEQ ID NO:4).

The suppressor phtRNAAla (SEQ ID NO:3) and tRNAAib (SEQ ID NO:4) were tested. In the first sample, polynucleotides encoding the suppressor phtRNAAla (SEQ ID NO:3) and the model polypeptide MS-(Aib)-hsLeptin (SEQ ID NO:5) were introduced into E. coli cells by standard plasmid transformation. In the second sample, polynucleotides encoding tRNAAib (SEQ ID NO:4) and the model polypeptide MS-(Aib)-hsLeptin (SEQ ID NO:5) were introduced into E. coli cells by standard plasmid transformation. In theory, if the endogenous ecaAlaRS recognizes the exogenous suppressor phtRNAAla (SEQ ID NO:3), the full-length model polypeptide will be expressed (although the model polypeptide MS-(Aib)-hsLeptin will then contain Ala instead of Aib). Cells were cultured according to the general procedure and analyzed by SDS-PAGE. The analysis showed that cells containing the suppressor phtRNAAla (SEQ ID NO:3) expressed the full-length model polypeptide, whereas cells containing tRNAAib (SEQ ID NO:4) expressed no (or very little) full-length model polypeptide. It was therefore concluded that tRNAAib (SEQ ID NO:4) is recognized by ecaAlaRS, if at all, only with significantly reduced efficiency.

Development of 2-aminoisobutyric acid-tRNA synthetase (AibRS)

Example 2

In contrast to Ala, 2-aminoisobutyric acid (Aib) (formula 1) contains an additional methyl group at the α-carbon position, which, due to steric hindrance, prevents Aib from interacting with phAlaRS (wt).

Chemical formula 1 (2-aminoisobutyric acid (Aib)):

The AlaRS class II core catalytic domain of phAlaRS (wt), which comprises the Ala binding pocket, is the amino acid sequence spanning positions 1 to 263 of the wild-type sequence. The N-terminal domain comprising the Ala-binding and tRNAAla-binding functions is hereinafter referred to as "aa1-495". The editing and oligomerization domains (aa496-915) were omitted from all constructs in this study, so all of the following RS constructs are based on the aa1-495 part of phAlaRS (wt) (SEQ ID NO:6). The crystal structure of the N-terminal domains (aa1-752) of phAlaRS (wt) has been published by M. Sokabe et al. (2009) [Sokabe M, Ose T, Nakamura A, Tokunaga K, Nureki O, Yao M, Tanaka I; The structure of alanyl-tRNA synthetase with editing domain; PNAS 106(27) 11028-].

The alanine binding pocket of AlaRS was defined as those residues in crystal structure 2ZZG (the crystal structure of alanyl-tRNA synthetase complexed with 5'-O-(N-(L-alanyl)-sulfamoyloxy) adenine, without the oligomerization domain) that have atoms closer than 5 angstroms to the C-alpha atom of the alanine moiety. The PDB file was loaded and the distances were calculated in standard structure visualization software, specifically Accelrys VL Viewer, by selecting the A5A999.CA atom and then selecting the atoms within 5 angstroms. These atoms belong to A99, M147, W192, T213, V215, D248, T249, and G250. In particular, W192 and V215 are very close to the estimated position of the extra methyl group of Aib. It was hypothesized that mutations at these two positions might accommodate the extra methyl group of Aib. Mutants of the aa1-495 part of phAlaRS (wt) (SEQ ID NO:6) were prepared: two mutations at position 192, i.e., W192H and W192F, and one mutation at position 215, i.e., V215G, which was tested alone and in combination with W192H, yielding a total of four mutants. In addition, the wild-type aa1-495 part of phAlaRS (wt) (SEQ ID NO:6) was tested.
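The distance-based pocket definition described above (residues with at least one atom within 5 angstroms of a reference atom) can be sketched in a few lines. This is an illustration of the selection logic only, not the procedure run in the visualization software: the coordinates below are HYPOTHETICAL and are not taken from 2ZZG, residue K300 is an invented distant residue, and the function name is an assumption.

```python
import math

CUTOFF_ANGSTROM = 5.0

# HYPOTHETICAL coordinates (in angstroms) for illustration only.
# The reference atom stands in for the ligand C-alpha atom.
reference_atom = (0.0, 0.0, 0.0)
protein_atoms = [
    # (residue label, atom name, (x, y, z))
    ("W192", "CZ2", (3.0, 2.0, 1.0)),
    ("V215", "CG1", (1.5, -3.5, 2.0)),
    ("K300", "NZ", (8.0, 1.0, 0.0)),    # invented residue, outside the cutoff
    ("W192", "CH2", (4.0, 2.5, 1.0)),
]


def residues_within(center, atoms, cutoff):
    """Return residue labels having at least one atom closer than cutoff."""
    hits = []
    for residue, _atom_name, xyz in atoms:
        if math.dist(center, xyz) < cutoff and residue not in hits:
            hits.append(residue)
    return hits


pocket = residues_within(reference_atom, protein_atoms, CUTOFF_ANGSTROM)
print(pocket)
```

With real data, the atom list would be read from the PDB file rather than typed in by hand.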

Table 1: aminoacyl-tRNA synthetase mutants of aa1-495 part of phAlaRS (wt) (SEQ ID NO:6)

SEQ ID NO Mutations
7 V215G
8 W192F
9 W192H
10 W192H;V215G
6 No mutation (wild type)
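The mutation notation used in Table 1 (wild-type residue, 1-based position, replacement residue) can be applied to a sequence programmatically. The sketch below is a minimal illustration: the helper name is hypothetical and the five-residue sequence is a TOY example, not SEQ ID NO:6.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence, mutations):
    """Apply point mutations such as 'W192H' to a protein sequence.

    Raises ValueError if a mutation is malformed, out of range, or if the
    stated wild-type residue does not match the sequence.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = MUTATION_PATTERN.match(mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation}: position out of range")
        if residues[index] != wild_type:
            raise ValueError(f"{mutation}: found {residues[index]} instead")
        residues[index] = replacement
    return "".join(residues)


# TOY sequence for illustration only (W at position 3, V at position 5).
toy_sequence = "MKWAV"
double_mutant = apply_mutations(toy_sequence, ["W3H", "V5G"])
print(double_mutant)
```

The wild-type check guards against off-by-one numbering errors, which are easy to make when a construct is a truncated part of a longer sequence.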

By standard plasmid transformation methods, the mutants were transformed into E. coli TKO cells together with polynucleotides encoding one copy of the suppressor phtRNAAla (SEQ ID NO:3) and the model polypeptide MS-(Aib)-hsLeptin (SEQ ID NO:5). The E. coli cells were cultured according to the general procedure. The isolated inclusion body fractions were analyzed by SDS-PAGE. For all mutants comprising the V215G and/or W192H mutations (i.e., SEQ ID NO:7, SEQ ID NO:9 and SEQ ID NO:10), bands of different intensity were observed at about 16 kDa (no band was observed for SEQ ID NO:8 or the wild type). The solubilized inclusion bodies were analyzed by LC-MS according to the general procedure, confirming that the band in the sample with SEQ ID NO:7 represents a fraction of the expressed polypeptide MS-(Aib)-hsLeptin (SEQ ID NO:5) (found [m/1] = 16198.6; calculated [m/1] = 16198.4). The efficiency of the mutants was expressed as the incorporation ratio and calculated as described in Example 3. The incorporation ratio of SEQ ID NO:7 was ≤10%.
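The found and calculated masses quoted above can be compared numerically, and the mass expected for the Ala-containing by-product can be estimated. The sketch below is illustrative only: it assumes average masses, and it assumes that replacing Aib by Ala removes exactly one CH2 group (about 14.027 Da), which follows from the structures of the two amino acids. The function names are hypothetical.

```python
# Average mass of one CH2 group: C (12.011) + 2 x H (1.008).
CH2_AVERAGE_MASS_DA = 12.011 + 2 * 1.008


def mass_error_ppm(found_mass, calculated_mass):
    """Relative deviation of the found mass from the calculated mass, in ppm."""
    return (found_mass - calculated_mass) / calculated_mass * 1e6


def ala_variant_mass(aib_polypeptide_mass, substituted_positions=1):
    """Estimated mass if Aib is replaced by Ala at the given number of positions."""
    return aib_polypeptide_mass - substituted_positions * CH2_AVERAGE_MASS_DA


# Values quoted in the text for MS-(Aib)-hsLeptin.
found, calculated = 16198.6, 16198.4
print(f"mass error: {mass_error_ppm(found, calculated):.1f} ppm")
print(f"estimated Ala-containing mass: {ala_variant_mass(calculated):.1f}")
```

The roughly 14 Da gap between the two species is what allows their peak intensities to be read separately from the mass spectrum.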

Example 3

It has been reported that the aminoacylation mechanism of tRNAAla in Archaeoglobus fulgidus may be based on a single G-U pair, corresponding to G3 and T72 in the DNA polynucleotide, and that the key residues of the AlaRS involved in the interaction with this pair are N359 and D450; for the AlaRS of Pyrococcus horikoshii, the two corresponding key residues are N360 and E459. Furthermore, these key residues of AlaRS have been reported to be conserved in the archaeal domain [M. Naganuma et al., Nature 510, pp. 507-511 (2014)].

The aa1-495 mutant of phAlaRS (wt) containing the W192H, V215G, N360A and E459A mutations (SEQ ID NO:11) was transformed, according to the general procedure, into E. coli TKO cells on a plasmid containing 6 copies of tRNAAib (SEQ ID NO:4) and a polynucleotide encoding the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31), which comprises an inclusion body-inducing region from the IL-21 protein and a GLP-1 analog region comprising Aib at positions 1-24. The Aib-containing GLP-1 region is identical to the polypeptide backbone of semaglutide (SEQ ID NO:32) [Lau J. et al., Discovery of the once-weekly glucagon-like peptide-1 (GLP-1) analogue semaglutide, J. Med. Chem., 2015; 58:7370-7380]. The E. coli cells were cultured according to the general procedure and analyzed by SDS-PAGE. A strong band was identified at about 8 kDa, and LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The efficiency of the AibRS function is expressed as the incorporation ratio between the amount of resulting polypeptide containing Aib and the amount of resulting polypeptide containing Aib or Ala at the position encoding Aib, and is calculated from the mass spectrum as follows: incorporation ratio = [peak intensity](Aib-containing polypeptide) / ([peak intensity](Aib-containing polypeptide) + [peak intensity](Ala-containing polypeptide)) × 100%. The incorporation ratio of SEQ ID NO:11 was calculated to be 63%.
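The incorporation ratio defined above is a simple fraction of two peak intensities. The following minimal sketch shows the calculation; the function name is an assumption and the peak intensities are HYPOTHETICAL numbers chosen for illustration, not measured data.

```python
def incorporation_ratio(aib_peak_intensity, ala_peak_intensity):
    """Percentage of polypeptide carrying Aib at the position encoding Aib.

    Both arguments are mass-spectrum peak intensities in the same arbitrary
    units: one for the Aib-containing polypeptide and one for the
    Ala-containing polypeptide.
    """
    total = aib_peak_intensity + ala_peak_intensity
    if total <= 0:
        raise ValueError("the summed peak intensity must be positive")
    return aib_peak_intensity / total * 100.0


# HYPOTHETICAL peak intensities, for illustration only.
ratio = incorporation_ratio(aib_peak_intensity=3.0e5, ala_peak_intensity=1.0e5)
print(f"incorporation ratio: {ratio:.0f}%")
```

Because only the ratio of the two intensities matters, the result does not depend on the absolute intensity scale of the instrument.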

Example 4

A mutant of the aa1-495 part of phAlaRS (wt) comprising the W192H, V215G, M217I, N360A and E459A mutations (SEQ ID NO:12) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 82%.

Example 5

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, M217L, N360A and E459A mutations (SEQ ID NO:13) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 66%.

Example 6

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193L, N360A and E459A mutations (SEQ ID NO:14) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 71%.

Example 7

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, F216W, N360A and E459A mutations (SEQ ID NO:15) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 66%.

Example 8

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193L, F216W, N360A and E459A mutations (SEQ ID NO:16) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 63%.

Example 9

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193I, N360A and E459A mutations (SEQ ID NO:17) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.8; calculated [m/1] = 8324.3). The calculated incorporation ratio was 60%.

Example 10

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193I, M217I, N360A and E459A mutations (SEQ ID NO:18) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 72%.

Example 11

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193L, M217I, N360A and E459A mutations (SEQ ID NO:19) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 82%.

Example 12

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193I, M217L, N360A and E459A mutations (SEQ ID NO:20) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.8; calculated [m/1] = 8324.3). The calculated incorporation ratio was 51%.

Example 13

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, A193L, M217L, N360A and E459A mutations (SEQ ID NO:21) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.8; calculated [m/1] = 8324.3). The calculated incorporation ratio was 57%.

Example 14

A mutant of aa1-495 of phAlaRS (wt) comprising the W192V, V215G, M217I, N360A and E459A mutations (SEQ ID NO:22) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.4; calculated [m/1] = 8324.3). The calculated incorporation ratio was 77%.

Example 15

A mutant of aa1-495 of phAlaRS (wt) comprising the W192I, V215G, M217I, N360A and E459A mutations (SEQ ID NO:23) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.4; calculated [m/1] = 8324.3). The calculated incorporation ratio was 74%.

Example 16

A mutant of aa1-495 of phAlaRS (wt) comprising the W192L, V215G, M217I, N360A and E459A mutations (SEQ ID NO:24) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8325.0; calculated [m/1] = 8324.3). The calculated incorporation ratio was 54%.

Example 17

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, A193G, V215G, M217I, N360A and E459A mutations (SEQ ID NO:25) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.4; calculated [m/1] = 8324.3). The calculated incorporation ratio was 86%.

Example 18

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, T213S, V215G, N360A and E459A mutations (SEQ ID NO:26) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.2; calculated [m/1] = 8324.3). The calculated incorporation ratio was 56%.

Example 19

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, T249S, N360A and E459A mutations (SEQ ID NO:27) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8325.0; calculated [m/1] = 8324.3). The calculated incorporation ratio was 67%.

Example 20

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, V215G, T249V, N360A and E459A mutations (SEQ ID NO:28) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.8; calculated [m/1] = 8324.3). The calculated incorporation ratio was 78%.

Example 21

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, T213C, V215G, T249V, N360A and E459A mutations (SEQ ID NO:29) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.8; calculated [m/1] = 8324.3). The calculated incorporation ratio was 82%.

Example 22

A mutant of aa1-495 of phAlaRS (wt) comprising the W192H, T213A, V215G, T249F, N360A and E459A mutations (SEQ ID NO:30) was prepared and tested according to the procedure described in Example 3. LC-MS confirmed that the model polypeptide IL21-H(Aib)-GLP-1 (SEQ ID NO:31) was expressed (found [m/1] = 8324.6; calculated [m/1] = 8324.3). The calculated incorporation ratio was 83%.
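For orientation, the incorporation ratios reported in Examples 3 to 22 can be collected and ranked with a short Python sketch. The dictionary simply restates the values given in those examples, keyed by the SEQ ID NO of the phAlaRS mutant; the variable names are illustrative assumptions.

```python
# SEQ ID NO of the phAlaRS mutant -> incorporation ratio (%) as reported
# in Examples 3 to 22.
reported_ratios = {
    11: 63, 12: 82, 13: 66, 14: 71, 15: 66,
    16: 63, 17: 60, 18: 72, 19: 82, 20: 51,
    21: 57, 22: 77, 23: 74, 24: 54, 25: 86,
    26: 56, 27: 67, 28: 78, 29: 82, 30: 83,
}

# Rank mutants from highest to lowest incorporation ratio.
ranked = sorted(reported_ratios.items(), key=lambda item: item[1], reverse=True)
best_seq_id, best_ratio = ranked[0]
worst_seq_id, worst_ratio = ranked[-1]

for seq_id, ratio in ranked:
    print(f"SEQ ID NO:{seq_id}  {ratio}%")
```

The ranking is only a bookkeeping aid. It does not account for measurement uncertainty, which the examples do not report.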

Example 23

Consensus sequences for archaeal AlaRS were generated as follows. Using aa1-495 of phAlaRS (wt) (SEQ ID NO:6) as the query sequence against the database "All non-redundant GenBank CDS translations + PDB + SwissProt + PIR + PRF", the 100 sequences most identical to the query were extracted from the NCBI sequence databases with the NCBI online blastp suite. The extracted sequences were trimmed at the N- and C-termini to the same length as SEQ ID NO:6 (to remove the N-terminal extensions and the C-terminal editing domains) and then aligned using standard alignment algorithms, in particular Multiple Align with the BLOSUM62 matrix (10.2.2 software). Three consensus sequences were created by setting the identity threshold across all sequences in the alignment to 85%, 75%, or 50%, respectively, to decide whether an amino acid position is conserved. A given amino acid position in the consensus sequence is then expressed as a specific amino acid if it is conserved above the set threshold, or as Xaa, representing any amino acid, if it is not. Table 2 provides the consensus sequences. Because the NCBI sequence databases are updated regularly, the consensus sequences may change over time.
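The thresholding step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software used in the example: the input is assumed to be an already aligned set of equal-length sequences in one-letter code with `-` for gaps, a position is treated as conserved when its most frequent residue reaches the threshold (the text does not say whether the comparison is strict), and `X` stands in for Xaa. The function name and the toy alignment are hypothetical.

```python
from collections import Counter


def consensus(aligned_seqs: list[str], threshold: float) -> str:
    """Build a consensus string from pre-aligned sequences.

    A column is reported as its most frequent residue when that residue's
    fraction is at least `threshold`; otherwise 'X' (any amino acid).
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    result = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        conserved = residue != "-" and count / len(aligned_seqs) >= threshold
        result.append(residue if conserved else "X")
    return "".join(result)


# Toy alignment for illustration only; not taken from Table 2.
toy = ["MKVG", "MKIG", "MRVG", "MKVA"]
for t in (0.85, 0.75, 0.50):
    print(f"{t:.0%}: {consensus(toy, t)}")
```

Lowering the threshold turns more positions from Xaa into specific residues, which is why the 50% consensus is the most fully specified of the three.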

Table 2: Archaeal AlaRS consensus sequences based on SEQ ID NO:6

1 Xaa represents any amino acid.

The claimed AibRS may be defined in terms of a consensus sequence, e.g., as a variant of a consensus sequence. The claimed AibRS defined as a consensus variant can be further defined by a specified level of sequence identity between a conserved amino acid (i.e., an amino acid not designated as Xaa) of the consensus sequence and a reference sequence. In this context, it is understood that a claimed consensus sequence variant may have some degree of variation at the conserved amino acid positions such that it falls within the limits set by the specified sequence identity between the consensus sequence and the reference sequence. Non-limiting examples of the claimed AibRS defined as consensus sequence variants are provided below:

A. A 2-aminoisobutyric acid-tRNA synthetase (AibRS) comprising the amino acid sequence of SEQ ID NO:7 or a variant thereof, wherein said variant of SEQ ID NO:7 comprises 215 Gly.

B. The AibRS of claim A, wherein said variant of SEQ ID NO:7 is a variant of formula I, wherein formula I is:

Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Glu-Xaa-Gly-Xaa-Xaa-Xaa-Lys-Xaa-Cys-Xaa-Xaa-Cys-Gly-Xaa-Xaa-Phe-Trp-Thr-Xaa-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Cys-Gly-Asp-Xaa-Pro-Cys-Xaa-Xaa-Tyr-Xaa-Phe-Ile-Gly-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Glu-Xaa-Arg-Xaa-Xaa-Phe-Xaa-Xaa-Phe-Phe-Glu-Xaa-Xaa-Xaa-His-Xaa-Xaa-Xaa-Xaa-Arg-Tyr-Pro-Val-Xaa-Xaa-Arg-Trp-Arg-Asp-Asp-Val-Xaa-Leu-Val-Gly-Ala-Ser-Ile-Xaa-Asp-Phe-Gln-Pro-Trp-Val-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Pro-Pro-Ala-Asn-Pro-Leu-Xaa-Ile-Ser-Gln-Pro-Xaa-Ile-Arg-Xaa-Xaa-Asp-Xaa-Asp-Xaa-Val-Gly-Xaa-Xaa-Gly-Arg-His-Xaa-Thr-Xaa-Phe-Glu-Met-Met-Ala-His-His-Ala-Phe-Asn-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Tyr-Trp-Xaa-Xaa-Glu-Thr-Val-Xaa-Xaa-Xaa-Xaa-Xaa-Phe-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Thr-Phe-Xaa-Glu-Xaa-Xaa-Trp-Xaa-Gly-Gly-Gly-Asn-Ala-Gly-Xaa-Xaa-Xaa-Glu-Val-Xaa-Xaa-Xaa-Gly-Xaa-Glu-Xaa-Ala-Thr-Leu-Val-Phe-Met-Xaa-Tyr-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Asp-Thr-Gly-Tyr-Gly-Leu-Glu-Arg-Xaa-Xaa-Trp-Xaa-Ser-Xaa-Gly-Xaa-Pro-Thr-Xaa-Tyr-Asp-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ile-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Ala-Gly-Xaa-Xaa-Asp-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Leu-Arg-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Leu-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Glu-Xaa-Xaa-Tyr-Ala-Ile-Ala-Asp-His-Thr-Xaa-Xaa-Leu-Xaa-Phe-Met-Leu-Xaa-Asp-Gly-Val-Xaa-Pro-Ser-Asn-Xaa-Xaa-Ala-Gly-Tyr-Leu-Ala-Arg-Leu-Xaa-Ile-Arg-Xaa-Xaa-Xaa-Arg-Xaa-Xaa-Xaa-Xaa-Leu-Gly-Xaa-Xaa-Xaa-Pro-Leu-Xaa-Xaa-Ile-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Glu-Glu-Xaa-Xaa-Xaa-Xaa-Xaa-Thr-Xaa-Xaa-Arg-Gly-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Lys-Xaa-Xaa-Gly-Xaa-Xaa-Xaa-Xaa-Pro-Xaa-Xaa-Xaa-Leu-Xaa-Xaa-Xaa-Tyr-Xaa-Ser-His-Gly-Xaa-Xaa-Pro-Glu-Xaa-Xaa-Xaa-Glu-Xaa-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa-Val-Xaa-Xaa-Pro-Asp-Asn-Phe-Tyr-Xaa-Xaa-Val-Ala-Xaa-Xaa-Xaa-Xaa-Xaa-Xaa

wherein each Xaa is independently selected and is one or more amino acids, or absent;

wherein the variant of formula I is at least 90% identical to SEQ ID NO:7 at the positions not designated as Xaa.

In this example, the language of claim B defines the claimed AibRS as a variant of the consensus sequence, i.e., a variant of formula I. Claim B further requires that the variant of formula I be at least 90% identical to SEQ ID NO:7 at the positions not designated as Xaa, meaning that up to 10% variation is allowed at the conserved amino acid positions compared to SEQ ID NO:7. According to claim A (on which claim B depends), position 215 must be Gly, whereas formula I defines position 215 as Val. Because the AibRS of claim B is defined as a variant of formula I (rather than as formula I itself), and variation at conserved amino acid positions is allowed as described above, the claimed variant of formula I may comprise Gly at position 215. There is therefore no conflict between claims A and B.
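The identity requirement discussed above can be illustrated with a small Python sketch. It assumes a simplified setting that the claims do not require: the candidate has already been aligned to the consensus so that both strings have equal length, one-letter codes are used, and `X` marks the Xaa positions. The function name and the toy sequences are hypothetical and are not taken from SEQ ID NO:7.

```python
def conserved_identity(consensus_seq: str, candidate: str) -> float:
    """Percent identity over the positions not designated 'X' (Xaa)."""
    if len(consensus_seq) != len(candidate):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only the conserved (non-Xaa) positions of the consensus.
    conserved = [(c, q) for c, q in zip(consensus_seq, candidate) if c != "X"]
    if not conserved:
        raise ValueError("consensus has no conserved positions")
    matches = sum(c == q for c, q in conserved)
    return 100.0 * matches / len(conserved)


# Toy consensus with 10 conserved positions, one of which is Val (V).
toy_consensus = "FXEVKCXCGFWT"
# Toy candidate carrying Gly (G) where the consensus has Val.
toy_candidate = "FAEGKCLCGFWT"

identity = conserved_identity(toy_consensus, toy_candidate)
print(f"identity at conserved positions: {identity:.0f}%")
print("meets 90% requirement:", identity >= 90.0)
```

In the toy case a single Val-to-Gly change at a conserved position leaves the candidate at 90% identity, which mirrors the reasoning that a variant of formula I can carry Gly at position 215 and still satisfy the identity limit.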

While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true scope of the invention.
