Fundamentally diverse human antibody libraries

Document No.: 863129 Published: 2021-03-16

Reading note: This technology, Fundamentally diverse human antibody libraries, was designed by Jacob Glanville and David Maurer on 2018-12-18. Main content: Disclosed is an antibody library comprising a plurality of antibodies having non-naturally occurring combinations of complementarity determining regions that occur naturally in human memory and naive B cells, wherein the antibody library comprises a large number of functional and non-redundant antibodies. Also disclosed are methods of making antibody libraries with high levels of functional diversity.

1. An antibody library comprising a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises:

a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and

b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence;

wherein:

a) at least one of the VH-CDR3 sequence and the VL-CDR3 sequence is derived from a naive B cell;

b) if only one of the VH-CDR3 sequence and the VL-CDR3 sequence is derived from the naive B cell, then the VH-CDR3 sequence or VL-CDR3 sequence that is not derived from the naive B cell is derived from a memory cell; and

c) the VH-CDR1, VH-CDR2, VL-CDR1 and VL-CDR2 sequences are derived from memory B cells.

2. The antibody library of claim 1, wherein at least one of the VH-CDR3 sequence and the VL-CDR3 sequence derived from naive B cells is a naturally occurring sequence.

3. The antibody library of claim 1, wherein the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence is a naturally occurring sequence.

4. The antibody library of claim 1, wherein the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 sequences derived from memory B cells are naturally occurring sequences.

5. The antibody library of claim 1, wherein at least one of the VH-CDR3 sequence and the VL-CDR3 sequence derived from naive B cells comprises at least 80% sequence homology to a naturally occurring sequence.

6. The antibody library of claim 1, wherein the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence comprises at least 80% sequence homology to a naturally occurring sequence.

7. The antibody library of claim 1, wherein the memory B cell-derived VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 sequences comprise at least 80% sequence homology to naturally occurring sequences.

8. The antibody library of claim 1, wherein the VL domain is a Vκ domain or a Vλ domain.

9. The antibody library of claim 1, wherein the naive B cells are CD27-/IgM+ B cells or CD27-/IgD+ B cells.

10. The antibody library of claim 1, wherein the memory B cells are selected from the group consisting of CD27+/IgG+ B cells, CD27+/IgM+ B cells, IgA+ B cells, and combinations thereof.

11. The antibody library of claim 1, wherein the naive B cell and memory B cell are from a sample comprising a plurality of naive B cells and memory B cells sampled from a plurality of individuals.

12. The antibody library of claim 11, wherein the plurality of individuals is at least 50 individuals.

13. The antibody library of claim 1, wherein the plurality of antibodies are expressed on the surface of a plurality of phage.

14. The antibody library of claim 13, wherein the plurality of phage are bacteriophage or phagemids.

15. The antibody library of claim 13, wherein each bacteriophage of said plurality of bacteriophages comprises a nucleic acid sequence encoding: i) an antibody of the plurality of antibodies, and ii) a gene encoding a phage coat protein.

16. The antibody library of claim 15, wherein the bacteriophage coat protein is protein gIII.

17. The antibody library of claim 15, wherein expression of the nucleic acid sequence of each bacteriophage produces antibodies fused to a bacteriophage coat protein.

18. The antibody library of claim 1, wherein the VH domain further comprises a framework region selected from the group consisting of: IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15 and IGHV3-23.

19. The antibody library of claim 1, wherein the VL domain further comprises a framework region selected from the group consisting of: IGKV1-39, IGKV2-28, IGKV3-15 and IGKV4-1.

20. The antibody library of claim 1, wherein the plurality of antibodies comprises at least 7.6 x 10^10 antibodies.

21. The antibody library of claim 1, wherein at least 95% of the plurality of antibodies are functional.

22. A method of making an antibody library comprising:

a) obtaining sequence information for a plurality of VH-CDR3 and VL-CDR3 sequences from a naive B cell pool and for a plurality of VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences from a memory B cell pool;

b) assembling a plurality of Variable Light (VL) domain sequences, each VL domain sequence comprising: a VL-CDR1 sequence obtained from the sequence information determined in step a from memory B cells, a VL-CDR2 sequence obtained from the sequence information determined in step a from memory B cells, and a VL-CDR3 sequence obtained from the sequence information determined in step a from memory B cells or naive B cells;

c) assembling a plurality of first nucleic acid sequences encoding a plurality of first antibodies, each first antibody comprising:

i. a Variable Light (VL) domain sequence assembled in step b; and

ii. a single fixed heavy chain sequence;

d) inserting the plurality of first nucleic acid sequences into a plurality of bacteriophages;

e) expressing the plurality of first antibodies on the surfaces of the plurality of phage;

f) applying at least one selection pressure to the plurality of bacteriophages to generate a subset of bacteriophages comprising a subset of the first nucleic acid sequences;

g) assembling a plurality of Variable Heavy (VH) domain sequences, each VH domain sequence comprising: a VH-CDR1 sequence obtained from the sequence information determined in step a from memory B cells, a VH-CDR2 sequence obtained from the sequence information determined in step a from memory B cells, and a VH-CDR3 sequence obtained from the sequence information determined in step a from memory B cells or naive B cells, wherein at least one of said VH-CDR3 sequence and said VL-CDR3 sequence is derived from sequence information from naive B cells;

h) replacing the single fixed heavy chain sequence from the subset of the first nucleic acid sequences with the plurality of VH domain sequences assembled in step g to generate a plurality of second nucleic acid sequences, each second nucleic acid sequence comprising:

i. a Variable Light (VL) domain sequence assembled in step b, and

ii. a Variable Heavy (VH) domain sequence assembled in step g, wherein the plurality of second nucleic acid sequences encode a plurality of second antibodies; and

i) transforming a plurality of microorganisms with the plurality of bacteriophages to produce a plurality of transformants.

23. The method of claim 22, wherein the naive B cell pool comprises less than 5% of cells that are not naive B cells.

24. The method of claim 22, wherein the memory B cell pool comprises less than 5% of cells that are not memory B cells.

25. The method of claim 1, wherein at least one of the VH-CDR3 sequence and the VL-CDR3 sequence derived from naive B cells is a naturally occurring sequence.

26. The method of claim 1, wherein the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence is a naturally occurring sequence.

27. The method of claim 1, wherein the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 sequences derived from memory B cells are naturally occurring sequences.

28. The method of claim 1, wherein at least one of the VH-CDR3 sequence and the VL-CDR3 sequence derived from naive B cells comprises at least 80% sequence homology to a naturally occurring sequence.

29. The method of claim 1, wherein the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence comprises at least 80% sequence homology to a naturally occurring sequence.

30. The method of claim 1, wherein the memory B cell-derived VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 sequences comprise at least 80% sequence homology to naturally occurring sequences.

31. The method of claim 22, wherein the naive B cell pool, the memory B cell pool, or a combination thereof is obtained from a plurality of individuals.

32. The method of claim 25, wherein the plurality of individuals is at least 50 individuals.

33. The method of claim 23, further comprising sorting the naive B cells and memory B cells in a sample to produce the naive B cell pool and the memory B cell pool prior to obtaining the sequence information.

34. The method of claim 33, wherein sorting the naive B cell and the memory B cell comprises using flow cytometry.

35. The method of claim 34, wherein the flow cytometry is Fluorescence Activated Cell Sorting (FACS).

36. The method of claim 23, wherein the method further comprises extracting nucleic acids from the naive B cell and the memory B cell.

37. The method of claim 36, wherein the nucleic acid is DNA.

38. The method of claim 36, wherein the nucleic acid is mRNA.

39. The method of claim 38, further comprising reverse transcribing the mRNA into complementary DNA (cDNA).

40. The method of claim 22, wherein assembling each VL domain sequence comprises using overlap extension PCR (OE-PCR).

41. The method of claim 22, wherein assembling each VH domain sequence comprises using overlap extension PCR (OE-PCR).

42. The method of claim 22, wherein the single fixed heavy chain sequence is a germline sequence selected from the group consisting of: IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15 and IGHV3-23.

43. The method of claim 22, wherein applying at least one selection pressure comprises applying thermal stress, selecting with protein A, selecting with protein L, or a combination thereof.

44. The method of claim 43, wherein the thermal stress is a temperature of at least 65 ℃.

45. The method of claim 43, wherein applying thermal stress to the plurality of bacteriophages excludes unstable and aggregation-prone bacteriophages from the subset of bacteriophages.

46. The method of claim 43, wherein applying a selection with protein A or protein L to the plurality of phage excludes from the subset of phage those phage that express antibodies lacking the ability to bind protein A or protein L.

47. The method of claim 22, wherein the phage is a bacteriophage or a phagemid.

48. The method of claim 22, wherein the microorganism is Escherichia coli.

49. The method of claim 22, wherein the transformation is accomplished by electroporation.

50. The method of claim 22, wherein the plurality of transformants comprises at least 7.6 x 10^10 transformants.

51. The method of claim 22, wherein at least 95% of the plurality of antibodies are functional.

52. An antibody library comprising a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises:

a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and

b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein

c) a CDR sequence is selected from: a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence, wherein the CDR sequence is the same for each antibody of the plurality of antibodies; and

d) a unique combination of the remaining CDR sequences is selected from: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences.

53. The antibody library of claim 52, wherein the CDR sequences of (c) are VH-CDR3 sequences.

54. The antibody library of claim 53, wherein the remaining CDR sequences of (d) are VH-CDR1 sequences, VH-CDR2 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences.

55. The antibody library of claim 52, wherein the CDR sequences of (c) are identical to CDR sequences derived from an initial antibody clone.

56. The antibody library of claim 52, wherein each of the remaining CDR sequences of (d) is present in high diversity in the antibody library.

57. The antibody library of claim 56, wherein the high diversity comprises at least 1 x 10^3 different CDR sequences.

58. The antibody library of claim 52, wherein at least one of the VH-CDR1 sequences, the VH-CDR2 sequences, the VH-CDR3 sequences, the VL-CDR1 sequences, the VL-CDR2 sequences, and the VL-CDR3 sequences comprises at least 80% sequence homology to naturally occurring CDR sequences.

59. The antibody library of claim 58, wherein the naturally occurring CDR sequences are derived from a human population.

60. The antibody library of claim 58, wherein the remaining CDR sequences of (d) are present in a non-naturally occurring combination for each antibody of the plurality of antibodies.

61. The antibody library of claim 55, wherein at least one antibody of the plurality of antibodies has at least one of: a higher melting temperature (Tm) than the initial antibody clone, a higher affinity for a target epitope than the initial antibody clone, or a higher cross-reactivity to a target epitope between two or more species than the initial antibody clone.

62. The antibody library of claim 52, wherein at least one antibody of the plurality of antibodies has a melting temperature (Tm) of about 50 ℃ to about 90 ℃.

63. The antibody library of claim 52, wherein at least one antibody of the plurality of antibodies binds to a target epitope with a Kd of 100 nM or less.

64. A method for generating an antibody library, the method comprising:

(a) selecting a CDR sequence, wherein the CDR sequence is selected from the group consisting of: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences;

(b) replacing the CDR sequences of each antibody in the first antibody library with the CDR sequences selected in (a), thereby generating a second antibody library comprising a plurality of antibodies, wherein each antibody in the plurality of antibodies comprises:

(i) the CDR sequences selected in (a); and

(ii) a unique combination of remaining CDR sequences not selected in (a), wherein said remaining CDR sequences are selected from the group consisting of: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences.

65. The method of claim 64, wherein the first antibody library comprises a plurality of antibodies, wherein each antibody in the plurality of antibodies comprises a unique combination of VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences, and VL-CDR3 sequences.

66. The method of claim 64, wherein the CDR sequences selected in (a) are VH-CDR3 sequences.

67. The method of claim 66, wherein the remaining CDR sequences of ii) are a VH-CDR1 sequence, a VH-CDR2 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence.

68. The method of claim 64, wherein each of the remaining CDR sequences of ii) are present in high diversity in the antibody library.

69. The method of claim 68, wherein said high diversity comprises at least 1 x 10^3 different CDR sequences.

70. The method of claim 64, wherein at least one of the VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences, and VL-CDR3 sequences in the antibody library comprises at least 80% sequence homology to naturally occurring CDR sequences.

71. The method of claim 70, wherein the naturally occurring CDR sequences are derived from a human population.

72. The method of claim 64, wherein the remaining CDR sequences of ii) are present in a non-naturally occurring combination for each antibody of the plurality of antibodies.

73. The method of claim 64, wherein the CDR sequences of (a) are derived from an initial antibody clone.

74. The method of claim 64, wherein at least one antibody in the antibody library has at least one of: a higher melting temperature (Tm) than the initial antibody clone, a higher affinity for a target epitope than the initial antibody clone, or a higher cross-reactivity to a target epitope between two or more species than the initial antibody clone.

75. The method of claim 64, wherein at least one antibody in the antibody library has a melting temperature (Tm) of about 50 ℃ to about 90 ℃.

76. The method of claim 64, wherein at least one antibody in the antibody library binds to a target epitope with a Kd of 100 nM or less.

77. The method of claim 64, wherein the first antibody library is an antibody library according to any one of claims 1-21.

78. The method of claim 64, wherein the second antibody library is an antibody library according to any one of claims 52-63.

79. The method according to claim 64, further comprising (c) screening the second antibody library for antibodies having a desired property.
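
Several of the claims above recite "at least 80% sequence homology" to a naturally occurring sequence. The text does not state how homology is to be measured, so the following is only a minimal sketch under one assumption: ungapped percent identity between amino acid sequences of equal length. The function names and the example sequences are hypothetical and are not taken from the disclosure.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(candidate) != len(reference):
        # Assumption: no gapped alignment is attempted in this sketch.
        raise ValueError("this sketch only compares sequences of equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate, natural_sequences, threshold=80.0):
    """True if the candidate reaches the threshold against any
    same-length sequence in the collection of natural sequences."""
    return any(
        len(candidate) == len(ref)
        and percent_identity(candidate, ref) >= threshold
        for ref in natural_sequences
    )
```

A gapped alignment or a similarity matrix could be substituted for the identity count; the choice here is only for brevity.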

Background

Monoclonal antibodies (mAbs) are useful as therapeutics, as research tools, and in diagnostic methods, but finding antibodies with affinity for a desired target can be challenging. Antibody libraries provide an effective tool for screening large numbers of antibodies against a compound of interest. Such libraries are typically based on rearrangement of naturally occurring variable genes or on introduction of synthetic diversity into antibody sequences. However, natural antibody libraries typically have extremely limited diversity, while synthetic libraries may contain many non-functional sequences. Therefore, there is a need to develop antibody libraries with a high degree of functional diversity.

Disclosure of Invention

Provided herein are antibody libraries comprising a plurality of antibodies. Each antibody can comprise a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence, and a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence. At least one of the VH-CDR3 sequence and VL-CDR3 sequence may be derived from naive B cells. In some embodiments, if only one of the VH-CDR3 sequence and VL-CDR3 sequence is derived from naive B cells, the VH-CDR3 sequence or VL-CDR3 sequence that is not derived from naive B cells is derived from memory cells. The VH-CDR1, VH-CDR2, VL-CDR1 and VL-CDR2 sequences can be derived from memory B cells. In some embodiments, at least one of the VH-CDR3 sequence and VL-CDR3 sequence derived from naive B cells is a naturally occurring sequence. In some embodiments, the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence is a naturally occurring sequence. In some embodiments, the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 sequences derived from memory B cells are naturally occurring sequences. In some embodiments, at least one of the VH-CDR3 sequence and VL-CDR3 sequence derived from naive B cells comprises at least 80% sequence homology to a naturally occurring sequence. In some embodiments, the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence comprises at least 80% sequence homology to a naturally occurring sequence. In some embodiments, the VH-CDR1 sequence, VH-CDR2 sequence, VL-CDR1 sequence, and VL-CDR2 sequence derived from memory B cells comprise at least 80% sequence homology to naturally occurring sequences. In some embodiments, the VL domain is a Vκ domain or a Vλ domain. In some embodiments, the naive B cells are CD27-/IgM+ B cells or CD27-/IgD+ B cells. In some embodiments, the memory B cells are selected from the group consisting of CD27+/IgG+ B cells, CD27+/IgM+ B cells, IgA+ B cells, and combinations thereof.
In some embodiments, the naive B cells and memory B cells are from a sample comprising a plurality of naive B cells and memory B cells sampled from a plurality of individuals. In some embodiments, the plurality of individuals is at least 50 individuals. In some embodiments, the plurality of antibodies are expressed on the surface of a plurality of phage. In some embodiments, the phage are bacteriophages or phagemids. In some embodiments, each bacteriophage of the plurality of bacteriophages comprises a nucleic acid sequence encoding: i) an antibody of the plurality of antibodies, and ii) a gene encoding a phage coat protein. In some embodiments, the bacteriophage coat protein is protein gIII. In some embodiments, expression of the nucleic acid sequence of each bacteriophage produces an antibody fused to a bacteriophage coat protein. In some embodiments, the VH domain further comprises a framework region selected from the group consisting of IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15, and IGHV3-23. In some embodiments, the VL domain further comprises a framework region selected from the group consisting of IGKV1-39, IGKV2-28, IGKV3-15, and IGKV4-1. In some embodiments, the plurality of antibodies comprises at least 7.6 x 10^10 antibodies. In some embodiments, at least 95% of the plurality of antibodies are functional.
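
The sourcing rule described above (CDR1 and CDR2 sequences from memory B cells, and at least one CDR3 sequence from naive B cells with any other CDR3 from memory cells) can be expressed as a small check. This is an illustrative sketch only; the `CdrSources` record and the `"naive"`/`"memory"` labels are hypothetical names introduced here, not part of the disclosure.

```python
from dataclasses import dataclass

NAIVE = "naive"
MEMORY = "memory"


@dataclass(frozen=True)
class CdrSources:
    """Cell population each CDR of one antibody was derived from."""
    vh_cdr1: str
    vh_cdr2: str
    vh_cdr3: str
    vl_cdr1: str
    vl_cdr2: str
    vl_cdr3: str


def satisfies_sourcing_rule(src: CdrSources) -> bool:
    cdr12 = (src.vh_cdr1, src.vh_cdr2, src.vl_cdr1, src.vl_cdr2)
    cdr3 = (src.vh_cdr3, src.vl_cdr3)
    # CDR1 and CDR2 of both chains come from memory B cells.
    cdr12_from_memory = all(s == MEMORY for s in cdr12)
    # Each CDR3 comes from either pool, and at least one is naive-derived.
    cdr3_known_pool = all(s in (NAIVE, MEMORY) for s in cdr3)
    at_least_one_naive_cdr3 = NAIVE in cdr3
    return cdr12_from_memory and cdr3_known_pool and at_least_one_naive_cdr3
```

For example, an antibody with a naive-derived VH-CDR3 and all other CDRs from memory cells passes the check, while an antibody whose six CDRs are all memory-derived does not.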

Also provided herein are methods of making an antibody library comprising: a) obtaining sequence information for a plurality of VH-CDR3 and VL-CDR3 sequences from a naive B cell pool and for a plurality of VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences from a memory B cell pool; b) assembling a plurality of Variable Light (VL) domain sequences, each VL domain sequence comprising: a VL-CDR1 sequence obtained from the sequence information determined in step a from memory B cells, a VL-CDR2 sequence obtained from the sequence information determined in step a from memory B cells, and a VL-CDR3 sequence obtained from the sequence information determined in step a from memory B cells or naive B cells; c) assembling a plurality of first nucleic acid sequences encoding a plurality of first antibodies, each first antibody comprising the Variable Light (VL) domain sequence assembled in step b and a single fixed heavy chain sequence; d) inserting the plurality of first nucleic acid sequences into a plurality of bacteriophages; e) expressing the plurality of first antibodies on the surface of the plurality of phage; f) applying at least one selection pressure to the plurality of bacteriophages to generate a subset of bacteriophages comprising a subset of the first nucleic acid sequences; g) assembling a plurality of Variable Heavy (VH) domain sequences, each VH domain sequence comprising: a VH-CDR1 sequence obtained from the sequence information determined in step a from memory B cells, a VH-CDR2 sequence obtained from the sequence information determined in step a from memory B cells, and a VH-CDR3 sequence obtained from the sequence information determined in step a from memory B cells or naive B cells, wherein at least one of the VH-CDR3 sequence and VL-CDR3 sequence is derived from the sequence information from naive B cells; h) replacing the single fixed heavy chain sequence from the subset of first nucleic acid sequences with the plurality of VH domain sequences assembled in step g to generate a plurality of second nucleic acid sequences, each second nucleic acid sequence comprising a Variable Light (VL) domain sequence assembled in step b and a Variable Heavy (VH) domain sequence assembled in step g, wherein the plurality of second nucleic acid sequences encode a plurality of second antibodies; and i) transforming a plurality of microorganisms with the plurality of bacteriophages to produce a plurality of transformants. In some embodiments, the naive B cell pool comprises less than 5% of cells that are not naive B cells. In some embodiments, the memory B cell pool comprises less than 5% of cells that are not memory B cells. In some embodiments, at least one of the VH-CDR3 sequence and VL-CDR3 sequence derived from naive B cells is a naturally occurring sequence. In some embodiments, the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence is a naturally occurring sequence. In some embodiments, the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 sequences derived from memory B cells are naturally occurring sequences. In some embodiments, at least one of the VH-CDR3 sequence and VL-CDR3 sequence derived from naive B cells comprises at least 80% sequence homology to a naturally occurring sequence. In some embodiments, the memory cell-derived VH-CDR3 sequence or VL-CDR3 sequence comprises at least 80% sequence homology to a naturally occurring sequence. In some embodiments, the VH-CDR1 sequence, VH-CDR2 sequence, VL-CDR1 sequence, and VL-CDR2 sequence derived from memory B cells comprise at least 80% sequence homology to naturally occurring sequences. In some embodiments, the naive B cell pool, memory B cell pool, or combination thereof is obtained from a plurality of individuals. In some embodiments, the plurality of individuals is at least 50 individuals.
In some embodiments, the method further comprises sorting the naive B cells and memory B cells in the sample to produce a naive B cell pool and a memory B cell pool prior to obtaining the sequence information. In some embodiments, sorting the naive B cells and the memory B cells comprises using flow cytometry. In some embodiments, the flow cytometry is Fluorescence Activated Cell Sorting (FACS). In some embodiments, the method further comprises extracting nucleic acids from the naive B cells and the memory B cells. In some embodiments, the nucleic acid is DNA. In some embodiments, the nucleic acid is mRNA. In some embodiments, the method further comprises reverse transcribing the mRNA into complementary DNA (cDNA). In some embodiments, assembling each VL domain sequence comprises using overlap extension PCR (OE-PCR). In some embodiments, assembling each VH domain sequence comprises using overlap extension PCR (OE-PCR). In some embodiments, the single fixed heavy chain sequence is a germline sequence selected from the group consisting of IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15, and IGHV3-23. In some embodiments, applying at least one selection pressure comprises applying thermal stress, selecting with protein A, selecting with protein L, or a combination thereof. In some embodiments, the thermal stress is a temperature of at least 65 ℃. In some embodiments, applying thermal stress to the plurality of bacteriophages excludes unstable and aggregation-prone bacteriophages from the subset of bacteriophages.
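
The two-stage order of the method (assemble VL domains, select them against a fixed heavy chain, then introduce the VH diversity) can be outlined in silico as follows. This is a simplified sketch, not the wet-lab procedure: CDR pools are plain lists of placeholder strings, selection is a caller-supplied predicate standing in for thermal stress or protein A/L selection, and every combination is enumerated, which is not feasible at real library sizes. All names are hypothetical. The sketch assumes the VH-CDR3 pool passed in is naive-derived, so the requirement that at least one CDR3 come from naive B cells holds for every pair.

```python
from itertools import product


def assemble_domains(cdr1_pool, cdr2_pool, cdr3_pool):
    """Every CDR1/CDR2/CDR3 combination as a (cdr1, cdr2, cdr3) tuple."""
    return list(product(cdr1_pool, cdr2_pool, cdr3_pool))


def build_library(vl_cdr1, vl_cdr2, vl_cdr3,
                  vh_cdr1, vh_cdr2, vh_cdr3_naive,
                  passes_selection):
    # Steps b-c: VL domains, each notionally paired with one fixed heavy chain.
    vl_domains = assemble_domains(vl_cdr1, vl_cdr2, vl_cdr3)
    # Steps d-f: keep only the VL domains that survive the selection pressure.
    selected_vl = [vl for vl in vl_domains if passes_selection(vl)]
    # Step g: VH domains with memory-derived CDR1/CDR2 and naive-derived CDR3.
    vh_domains = assemble_domains(vh_cdr1, vh_cdr2, vh_cdr3_naive)
    # Step h: replace the fixed heavy chain with every assembled VH domain.
    return [(vh, vl) for vh, vl in product(vh_domains, selected_vl)]
```

Filtering the light chains before the heavy-chain diversity is added means that the final pairing step multiplies only the VL domains that survived selection.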

Also provided herein is an antibody library comprising a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) a CDR sequence is selected from: a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence, wherein the CDR sequence is the same for each of the plurality of antibodies; and (d) a unique combination of the remaining CDR sequences is selected from: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences. In some embodiments, the CDR sequence of (c) is a VH-CDR3 sequence. In some embodiments, the remaining CDR sequences of (d) are a VH-CDR1 sequence, a VH-CDR2 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence. In some embodiments, the CDR sequences of (c) are identical to the CDR sequences derived from an initial antibody clone. In some embodiments, each of the remaining CDR sequences of (d) is present in the antibody library with a high degree of diversity. In some embodiments, the high diversity comprises at least 1 x 10^3 different CDR sequences. In some embodiments, at least one of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences comprises at least 80% sequence homology to naturally occurring sequences. In some embodiments, the naturally occurring CDR sequences are derived from a human population. In some embodiments, the remaining CDR sequences of (d) are present in a non-naturally occurring combination for each of the plurality of antibodies.
In some embodiments, at least one antibody of the plurality of antibodies has at least one of: a higher melting temperature (Tm) than the initial antibody clone, a higher affinity for a target epitope than the initial antibody clone, or a higher cross-reactivity to a target epitope between two or more species than the initial antibody clone. In some embodiments, at least one antibody of the plurality of antibodies has a melting temperature (Tm) of about 50 ℃ to about 90 ℃. In some embodiments, at least one antibody of the plurality of antibodies binds to a target epitope with a Kd of 100 nM or less.
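
The defining property of this second kind of library, one CDR held constant across all antibodies while the combination of the other five is unique to each antibody, can be checked as sketched below. The representation of an antibody as a dictionary keyed by CDR name is an assumption made for illustration, as are the function and key names.

```python
CDR_NAMES = ("vh_cdr1", "vh_cdr2", "vh_cdr3", "vl_cdr1", "vl_cdr2", "vl_cdr3")


def is_fixed_cdr_library(antibodies, fixed_cdr):
    """True if `fixed_cdr` is identical in every antibody and the
    combination of the remaining five CDRs is unique to each antibody."""
    if fixed_cdr not in CDR_NAMES:
        raise ValueError(f"unknown CDR name: {fixed_cdr}")
    if not antibodies:
        return False
    fixed_values = {ab[fixed_cdr] for ab in antibodies}
    remaining = [
        tuple(ab[name] for name in CDR_NAMES if name != fixed_cdr)
        for ab in antibodies
    ]
    return len(fixed_values) == 1 and len(set(remaining)) == len(remaining)
```

With VH-CDR3 as the fixed CDR, the check corresponds to the embodiment in which the remaining sequences are VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3.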

Further provided herein are methods for generating an antibody library, the method comprising: (a) selecting a CDR sequence, wherein the CDR sequence is selected from the group consisting of: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences; (b) replacing the CDR sequences of each antibody in the first antibody library with the CDR sequences selected in (a), thereby generating a second antibody library comprising a plurality of antibodies, wherein each antibody in the plurality of antibodies comprises: (i) the CDR sequences selected in (a); and (ii) a unique combination of remaining CDR sequences not selected in (a), wherein said remaining CDR sequences are selected from the group consisting of: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences. In some embodiments, the first antibody library comprises a plurality of antibodies, wherein each antibody in the plurality of antibodies comprises a unique combination of VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences, and VL-CDR3 sequences. In some embodiments, the CDR sequence selected in (a) is a VH-CDR3 sequence. In some embodiments, the remaining CDR sequences of (ii) are a VH-CDR1 sequence, a VH-CDR2 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence. In some embodiments, each of the remaining CDR sequences of (ii) is present in the antibody library with a high degree of diversity. In some embodiments, the high diversity comprises at least 1 x 10^3 different CDR sequences. In some embodiments, at least one of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences in the antibody library comprises at least 80% sequence homology to naturally occurring CDR sequences. In some embodiments, the naturally occurring CDR sequences are derived from a human population.
In some embodiments, the remaining CDR sequences of (ii) are present in a non-naturally occurring combination for each of the plurality of antibodies. In some embodiments, the CDR sequences of (a) are derived from an initial antibody clone. In some embodiments, at least one antibody in the antibody library has at least one of: a higher melting temperature (Tm) than the initial antibody clone, a higher affinity for a target epitope than the initial antibody clone, or a higher cross-reactivity to a target epitope between two or more species than the initial antibody clone. In some embodiments, at least one antibody in the antibody library has a melting temperature (Tm) of about 50 ℃ to about 90 ℃. In some embodiments, at least one antibody in the antibody library binds to a target epitope with a Kd of 100 nM or less. In some embodiments, the first antibody library is an antibody library according to any of the antibody libraries described above. In some embodiments, the second antibody library is an antibody library according to any one of the preceding. In some embodiments, the method further comprises (c) screening the second antibody library for antibodies having a desired property.
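
The replacement step (b) can be illustrated in silico as overwriting one CDR in every member of a first library with the sequence chosen in step (a). The sketch below assumes antibodies are represented as dictionaries keyed by CDR name; `graft_cdr` and the key names are hypothetical. Members that differed only in the replaced CDR become identical after the replacement, so the sketch drops such duplicates to keep each remaining combination unique.

```python
def graft_cdr(first_library, cdr_name, cdr_sequence):
    """Return a second library in which `cdr_name` is set to `cdr_sequence`
    in every antibody; the first library is left unmodified."""
    second_library = []
    seen = set()
    for antibody in first_library:
        if cdr_name not in antibody:
            raise KeyError(f"antibody has no CDR named {cdr_name!r}")
        grafted = {**antibody, cdr_name: cdr_sequence}
        key = tuple(sorted(grafted.items()))
        # Skip antibodies that collapsed onto an earlier member.
        if key not in seen:
            seen.add(key)
            second_library.append(grafted)
    return second_library
```

For example, grafting the VH-CDR3 of an initial antibody clone into a diverse first library yields candidates that share that VH-CDR3 but vary at the other five CDRs, which can then be screened as in step (c).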

Incorporation by Reference

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

Drawings

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the office upon request and payment of the necessary fee. The novel features believed characteristic of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

FIG. 1 shows the amount of diversity of VH-CDR3 and VL-CDR3 obtained from naive B cells of a single individual compared to the amount of diversity of VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2 and VL-CDR3 obtained from memory B cells.

FIG. 2 shows the length of time for antibody libraries developed using different techniques including SuperHuman + Carterra and SuperHuman Zero-Day.

Figure 3 shows the percentage of clones showing affinity (nM) to PD1.

Figure 4 shows the reactivity of five anti-PD1 clones against human and cynomolgus monkey cell-surface PD1. In the controls, PPE control/parental cells and PPE control/transfected cells are the leftmost peaks in each plot, while positive control antibody/transfected cells are the rightmost peaks. In the selected clone plots, the PPE control/transfected cells are the leftmost peak in each plot, while the PPE positive/transfected cells are the rightmost peaks.

Figure 5 shows the cross-reactivity of two anti-PD1 clones between human, mouse and cynomolgus monkey.

Figure 6 shows a screen for ligand blocking.

Figure 7 shows the beta galactosidase (bGal) ELISA and Sanger screen of 2 plates containing 61 positive and 49 unique clones.

Figure 8 shows the diversity of antibody clone sequences.

Figure 9 shows antibody fusion variants.

FIG. 10 shows variants in VH-CDR1 (CDR-H1) and VH-CDR2 (CDR-H2) of anti-bGal #27.

FIG. 11 depicts factors involved in a framework selection strategy.

Fig. 12A-12B show the framework usage of mAbs from phase I clinical trials. Fig. 12A shows the heavy chain frameworks used in over 400 mAbs from phase I clinical trials. Fig. 12B shows the light chain frameworks used in over 400 mAbs from phase I clinical trials, showing that most phase I mAbs are kappa-derived.

Figure 13 depicts the allele frequencies of 12 frameworks in 14 human populations.

Figure 14 depicts the allele frequencies of 27 frameworks in 14 human populations.

Figure 15 shows the affinity maturation landscape of the human antibody framework.

Fig. 16 shows 3 framework regions: VH-FR1(FW1), VH-FR2(FW2) and VH-FR3(FW3), and 2 CDRs: VH-CDR1(CDR-H1) and VH-CDR2 (CDR-H2).

Figure 17 shows antibody libraries designed and selected in combination to produce functionally diverse VH and VK sequences.

Figure 18 shows heavy chain redundancy during various library preparations.

Figure 19 shows the sequence overlap between clones.

Figure 20 depicts Somatic Hypermutation (SHM) from more than 100 individuals.

Figure 21 depicts the diversity of the framework (also referred to herein as scaffold) and antibody libraries used.

FIG. 22 depicts the properties of the antibody library.

FIG. 23 shows the observed and predicted pairing mutation frequencies in the VH-CDR1(CDR-H1) and VH-CDR2(CDR-H2) regions.

FIG. 24 illustrates the position offset of IGHV3-23.

Fig. 25 shows the frequencies of the gemini.

FIG. 26 depicts IGHV1-3 allelic frequency variation in 14 human populations.

Figure 27 shows that less than 10,000 clones dominate the total number of clones in peripheral samples from human blood.

FIG. 28 shows a phagemid vector expressing an antibody described herein fused to a gIII coat protein.

FIG. 29 depicts a non-limiting example of a method of generating an antibody library described herein.

Figure 30 depicts a non-limiting example of an antibody library described herein.

Fig. 31A and 31B depict non-limiting examples of methods of screening antibody libraries of the present disclosure for antibodies with improvements in various properties.

Fig. 32 depicts a non-limiting exemplary workflow of a method of generating an antibody library as described herein and selecting one or more desired antibodies therefrom.

FIG. 33 depicts a non-limiting example of a method of generating an antibody library described herein.

Figure 34 depicts a non-limiting example of a method of screening an antibody library of the present disclosure for antibodies having improvements in various properties.

Figure 35 depicts a non-limiting example of a method of screening an antibody library of the present disclosure for antibodies with improvements in various properties.

Detailed Description

A desired property of an antibody library may be a high degree of functional diversity. Functional diversity not only ensures that a large number of antibodies are available for testing, but also ensures that the diversity is functionally relevant, thereby enhancing the utility of these libraries in therapeutic, diagnostic and research applications. Increased library diversity can be achieved by using naturally occurring complementarity determining regions (CDRs) in non-naturally occurring combinations, such as mixing CDRs from memory cells and naive cells, which increases the number of possible CDR combinations. Functional diversity can be further increased by selecting for function (e.g., the ability to bind to a protein) during the preparation of the antibody library.

In certain instances, disclosed herein are antibodies with unique properties, antibody libraries comprising a high degree of functional diversity, and methods of making the antibodies and the antibody libraries.

Antibodies

Antibodies can be synthesized in vivo by B cells. Antibody isotypes synthesized by B cells include, but are not limited to, IgA, IgD, IgE, IgG, and IgM. B cells that have not yet encountered an antigen may be referred to as naive B cells, while B cells that have encountered an antigen and been activated by it may be referred to as memory B cells. Naive B cells can express IgM, IgD, or a combination thereof. Memory B cells may express IgE, IgA, IgG, IgM, or a combination thereof. The IgA may be IgA1 or IgA2. The IgG may be IgG1, IgG2, IgG3, or IgG4. The memory B cells may be class-switched memory B cells or unswitched (marginal zone) memory B cells. Unswitched or marginal zone memory B cells can express IgM.

Complementarity determining regions ("CDRs") are part of immunoglobulin (antibody) variable regions, which may be responsible for the antigen binding specificity of an antibody. The heavy chain (HC) variable region may comprise three CDR regions, abbreviated VH-CDR1, VH-CDR2 and VH-CDR3, present in that order on the heavy chain from the N-terminus to the C-terminus; and the light chain (LC) variable region may comprise three CDR regions, abbreviated VL-CDR1, VL-CDR2 and VL-CDR3, present in that order on the light chain from the N-terminus to the C-terminus. Furthermore, the light chain may be a kappa chain (VK) or a lambda chain (Vλ). Surrounding and interspersed between the CDRs are framework regions, which can contribute to structure and can exhibit less variability than the CDR regions.

The heavy chain variable region may comprise four framework regions, abbreviated as VH-FR1, VH-FR2, VH-FR3 and VH-FR4. The heavy chain may comprise, from N-terminus to C-terminus: VH-FR1, VH-CDR1, VH-FR2, VH-CDR2, VH-FR3, VH-CDR3 and VH-FR4. The light chain variable region may comprise four framework regions, abbreviated as VL-FR1, VL-FR2, VL-FR3 and VL-FR4. The light chain may comprise, from N-terminus to C-terminus: VL-FR1, VL-CDR1, VL-FR2, VL-CDR2, VL-FR3, VL-CDR3 and VL-FR4. In some cases, "CDR sequence" as used herein refers to a CDR sequence selected from: VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, VL-CDR3, and any combination thereof.

Fundamentally diverse antibody libraries

The antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (a) at least one of the VH-CDR3 sequence and the VL-CDR3 sequence is derived from naive B cells; (b) if only one of the VH-CDR3 sequence and the VL-CDR3 sequence is derived from naive B cells, then the VH-CDR3 sequence or VL-CDR3 sequence that is not derived from naive B cells is derived from memory cells; and (c) the VH-CDR1 sequence, VH-CDR2 sequence, VL-CDR1 sequence and VL-CDR2 sequence are derived from memory cells. Antibody libraries may also be referred to herein as SuperHuman libraries.

In some cases, the plurality of antibodies in the antibody library has a high degree of functional diversity. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 80%, 85%, 90%, 95%, or 99% of the plurality of antibodies are functional. A functional antibody may be an antibody having the ability to bind to a protein. The ability of an antibody to bind to a protein can be determined by screening the antibody against protein A or protein L. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 80% of the plurality of antibodies are functional. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 85% of the plurality of antibodies are functional. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 90% of the plurality of antibodies are functional. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 95% of the plurality of antibodies are functional. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 99% of the plurality of antibodies are functional.

The antibody library may comprise at least 1.0 × 10^5, 2.0 × 10^5, 3.0 × 10^5, 4.0 × 10^5, 5.0 × 10^5, 6.0 × 10^5, 7.0 × 10^5, 8.0 × 10^5, 9.0 × 10^5, 1.0 × 10^10, 2.0 × 10^10, 3.0 × 10^10, 4.0 × 10^10, 5.0 × 10^10, 6.0 × 10^10, 7.0 × 10^10, 8.0 × 10^10, or 9.0 × 10^10 antibodies. The antibody library may comprise at least 1.0 × 10^5 antibodies. The antibody library may comprise at least 7.0 × 10^10 antibodies. The antibody library may comprise at least 7.1 × 10^10, 7.2 × 10^10, 7.3 × 10^10, 7.4 × 10^10, 7.5 × 10^10, 7.6 × 10^10, 7.7 × 10^10, 7.8 × 10^10, or 7.9 × 10^10 antibodies. The antibody library may comprise at least 7.6 × 10^10 antibodies.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional.

Antibodies of the library can comprise non-naturally occurring combinations of naturally occurring CDRs, such as CDRs derived from memory B cells and naive B cells that each occur naturally but whose combined occurrence on the same antibody does not occur naturally. For example, a non-naturally occurring combination of naturally occurring CDRs can comprise at least one CDR derived from a naive cell, while the remaining CDRs can be derived from memory cells. For example, a non-naturally occurring combination of naturally occurring CDRs may comprise at least one CDR derived from a pool of cells that are primarily of naive B cell origin, while the remaining CDRs may be derived from a pool of cells that are primarily of memory B cell origin. A naturally occurring CDR can refer to a CDR that naturally occurs in a human population.

A non-naturally occurring combination of naturally occurring CDRs can comprise at least one CDR derived from a naive cell, while the remaining CDRs are derived from a memory cell. In some cases, at least the VL-CDR1 is derived from the naive cell. In some cases, at least the VL-CDR2 is derived from the naive cell. In some cases, at least the VL-CDR3 is derived from the naive cell. In some cases, at least the VH-CDR1 is derived from the naive cell. In some cases, at least the VH-CDR2 is derived from the naive cell. In some cases, at least the VH-CDR3 is derived from the naive cell.

Non-naturally occurring combinations of naturally occurring CDRs can comprise two, three, four or five CDRs derived from naive cells, while the remaining CDRs can be derived from memory cells. For example, two CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, three CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, four CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, five CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells.

In another non-limiting example of a non-naturally occurring combination, the VL-CDR3 can be derived from a naive cell, while the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, and VL-CDR2 can be derived from a memory cell. In another non-limiting example of a non-naturally occurring combination, the VH-CDR3 can be derived from a naive cell, while the VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2, and VL-CDR3 can be derived from a memory cell. In another non-limiting example of a non-naturally occurring combination, the VH-CDR3 and VL-CDR3 can be derived from naive cells, while the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 can be derived from memory cells.
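The three example combinations in the preceding paragraph all satisfy the CDR-origin rule stated for the library at the start of this section (at least one CDR3 from naive B cells, any other CDR3 from memory cells, and CDR1 and CDR2 of both chains from memory cells). As a hedged illustration, that rule can be written as a small Python predicate. The function name, the dictionary keys, and the `"naive"`/`"memory"` labels are assumptions made for this sketch.

```python
NAIVE, MEMORY = "naive", "memory"

# CDR positions that the rule requires to come from memory B cells.
MEMORY_ONLY = ("VH-CDR1", "VH-CDR2", "VL-CDR1", "VL-CDR2")


def satisfies_origin_rule(origins):
    """Return True if a mapping of CDR name -> cell-type origin follows the
    rule: at least one CDR3 is naive-derived, a CDR3 that is not
    naive-derived is memory-derived, and CDR1/CDR2 are memory-derived."""
    cdr3_origins = (origins["VH-CDR3"], origins["VL-CDR3"])
    return (
        NAIVE in cdr3_origins
        and all(o in (NAIVE, MEMORY) for o in cdr3_origins)
        and all(origins[name] == MEMORY for name in MEMORY_ONLY)
    )


# The third example above: both CDR3s naive, everything else memory.
both_cdr3_naive = {
    "VH-CDR1": MEMORY, "VH-CDR2": MEMORY, "VH-CDR3": NAIVE,
    "VL-CDR1": MEMORY, "VL-CDR2": MEMORY, "VL-CDR3": NAIVE,
}
print(satisfies_origin_rule(both_cdr3_naive))
```

A combination in which, say, the VH-CDR1 is naive-derived is still a non-naturally occurring combination in the broader sense described earlier, but it falls outside this particular rule and the predicate returns `False` for it.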

Amino acid residues in an antibody sequence, a variable heavy chain sequence of an antibody, or a variable light chain sequence of an antibody may be referred to according to their Kabat positions. As used herein, "Kabat position" may refer to the numbering system described in Kabat et al., 1991, Sequences of Proteins of Immunological Interest, 5th edition, US Department of Health and Human Services, NIH, USA. In some cases, an antibody described herein comprises a variation at Kabat position H93, Kabat position H94, or a combination thereof. In some cases, at least one antibody in the antibody library comprises a variation at Kabat position H93, Kabat position H94, or a combination thereof. In some cases, at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, or 50% of the antibodies in the antibody library comprise a variation at Kabat position H93, Kabat position H94, or a combination thereof. The variation may be a mutation, insertion or deletion.

When used with respect to sequences, "derived from" can refer to any CDR sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% sequence homology to a naturally occurring CDR sequence. "Derived from" can refer to any CDR sequence obtained from sequencing information obtained from a pool of cells that are primarily of naive B cell origin or a pool of cells that are primarily of memory B cell origin. For example, a sequence is "derived from" a cell if (1) the sequence is observed in the cell, and (2) the identical sequence (or a sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% sequence homology to the sequence) is chemically synthesized based on the observed sequence.

The VH-CDR1 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR1 sequence from naive B cells. The VH-CDR1 sequence derived from naive B cells can be a synthetic VH-CDR1 sequence. The VH-CDR1 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR1 sequence from naive B cells. The VH-CDR2 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR2 sequence from naive B cells. The VH-CDR2 sequence derived from naive B cells can be a synthetic VH-CDR2 sequence. The VH-CDR2 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR2 sequence from naive B cells. The VH-CDR3 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR3 sequence from naive B cells. The VH-CDR3 sequence derived from naive B cells can be a synthetic VH-CDR3 sequence. The VH-CDR3 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR3 sequence from naive B cells.

The VL-CDR1 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR1 sequence from naive B cells. The VL-CDR1 sequence derived from naive B cells can be a synthetic VL-CDR1 sequence. The VL-CDR1 sequence derived from naive B cells can comprise 100% sequence homology to the naturally occurring VL-CDR1 sequence from naive B cells. The VL-CDR2 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR2 sequence from naive B cells. The VL-CDR2 sequence derived from naive B cells can be a synthetic VL-CDR2 sequence. The VL-CDR2 sequence derived from naive B cells can comprise 100% sequence homology to the naturally occurring VL-CDR2 sequence from naive B cells. The VL-CDR3 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR3 sequence from naive B cells. The VL-CDR3 sequence derived from naive B cells can be a synthetic VL-CDR3 sequence. The VL-CDR3 sequence derived from naive B cells can comprise 100% sequence homology to the naturally occurring VL-CDR3 sequence from naive B cells.

The VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, VL-CDR3 sequence, or any combination thereof, can be derived from sequence information obtained from a pool of cells that are primarily of naive B cell origin. The VH-CDR3 sequences, VL-CDR3 sequences, or combinations thereof can be derived from sequence information obtained from a pool of cells that are primarily of naive B cell origin. The naive B cell pool can be obtained from multiple individuals. The naive B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of naive B cell origin.

The VH-CDR1 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR1 sequence from memory B cells. The VH-CDR1 sequence derived from memory B cells can be a synthetic VH-CDR1 sequence. The VH-CDR1 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR1 sequence from memory B cells. The VH-CDR2 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR2 sequence from memory B cells. The VH-CDR2 sequence derived from memory B cells can be a synthetic VH-CDR2 sequence. The VH-CDR2 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR2 sequence from memory B cells. The VH-CDR3 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR3 sequence from memory B cells. The VH-CDR3 sequence derived from memory B cells can be a synthetic VH-CDR3 sequence. The VH-CDR3 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR3 sequence from memory B cells.

The memory B cell-derived VL-CDR1 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR1 sequence from a memory B cell. The memory B cell-derived VL-CDR1 sequence can be a synthetic VL-CDR1 sequence. The memory B cell-derived VL-CDR1 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR1 sequence from a memory B cell. The memory B cell-derived VL-CDR2 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR2 sequence from a memory B cell. The memory B cell-derived VL-CDR2 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR2 sequence from a memory B cell. The memory B cell-derived VL-CDR3 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR3 sequence from a memory B cell. The memory B cell-derived VL-CDR3 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR3 sequence from a memory B cell.

The VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, VL-CDR3 sequence, or any combination thereof, can be derived from sequence information obtained from a pool of cells that are primarily of memory B cell origin. The VH-CDR3 sequence, VL-CDR3 sequence, or combinations thereof can be derived from sequence information obtained from a pool of cells that are primarily of memory B cell origin. The memory B cell pool can be obtained from multiple individuals. The memory B cell pool can comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of memory B cell origin. The memory B cells may be CD27+ B cells. The memory B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20% or 30% of cells that are not of CD27+ B cell origin.

The percentage of sequence homology can be calculated as follows: the number of positions at which the identical nucleobase occurs in both sequences is determined to yield the number of matched positions, the number of matched positions is divided by the total number of positions, which may include additions or deletions, and the result is multiplied by 100 to yield the percentage of sequence homology. Percent sequence homology, also referred to as percent sequence identity, can be determined by aligning each sequence in any suitable sequence alignment program, such as Clustal Omega, multiple sequence comparison by log expectation (MUSCLE), multiple alignment using fast fourier transform (MAFFT), MegAlign, and Basic Local Alignment Search Tool (BLAST).
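The calculation just described can be sketched in a few lines of Python. This is a minimal illustration that assumes the two sequences have already been aligned to equal length by one of the programs named above, with `-` marking an insertion or deletion; the function name and the gap symbol are choices made here, not part of the disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Matched positions are those where both sequences carry the same
    non-gap character. The denominator is the total number of aligned
    positions, so gap (insertion/deletion) columns count against identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy nucleotide strings: 3 of 4 positions match.
print(percent_identity("ACGT", "ACGA"))
```

Under this definition, a sequence meeting the "at least 80%" threshold used throughout this section is one for which `percent_identity(...) >= 80.0` against the naturally occurring reference sequence.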

The naive cell may be a naive B cell. The naive B cells can be human naive B cells. The memory cell may be a memory B cell. The memory B cell can be a human memory B cell. In some cases, naive B cells show increased diversity of VH-CDR3 and VL-CDR3 sequences compared to VH-CDR3 and VL-CDR3 sequences from memory B cells (fig. 1). The naive cells and memory cells can be obtained from a biological sample, such as blood, from an individual or multiple individuals. The naive and memory cells can be physically separated from the sample using a marker specific for naive or memory cells.

Markers can be used to identify, isolate or sort B cells, naive B cells and memory B cells from a biological sample. Examples of markers for identifying, isolating or sorting B cells include, but are not limited to, CD19+. Examples of markers for identifying, isolating or sorting naive B cells include, but are not limited to, CD19+, CD27-, IgD+, IgM+, and combinations thereof. Examples of markers for identifying, isolating or sorting memory B cells include, but are not limited to, CD19+, CD27+, and combinations thereof. In some embodiments, memory B cells are sorted using CD27+. Examples of markers for identifying, isolating or sorting class-switched memory B cells include, but are not limited to, CD19+, CD27+, IgD-, IgM-, and combinations thereof. Examples of markers for identifying, isolating or sorting unswitched or marginal zone memory B cells include, but are not limited to, CD19+, CD27+, IgD+, IgM+, and combinations thereof. In some cases, memory B cells can be identified, isolated, or sorted using the following markers: CD19+, CD27+, IgD-, IgM+, and combinations thereof. The naive cell from which the VH-CDR3 is derived may be a CD27-/IgM+ B cell. The memory cells from which the VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3 are derived may be CD27+/IgG+ B cells.
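The marker profiles listed above can be summarized as a simple decision rule. The Python sketch below is a deliberate simplification for illustration: it assumes each marker has already been called positive or negative (real sorting gates on measured fluorescence intensities), it uses the full four-marker panel rather than any subset, and the function name and returned labels are hypothetical.

```python
def classify_b_cell(markers):
    """Assign a B cell subset from boolean marker calls.

    markers maps "CD19", "CD27", "IgD" and "IgM" to True (positive) or
    False (negative); missing markers are treated as negative.
    """
    cd19 = markers.get("CD19", False)
    cd27 = markers.get("CD27", False)
    igd = markers.get("IgD", False)
    igm = markers.get("IgM", False)

    if not cd19:
        return "not a B cell"
    if not cd27:
        # CD19+, CD27-, IgD+, IgM+ is the naive profile listed above.
        return "naive" if (igd and igm) else "unclassified"
    if not igd and not igm:
        return "class-switched memory"
    if igd and igm:
        return "unswitched or marginal zone memory"
    # Remaining CD19+, CD27+ profiles, e.g. IgD-, IgM+.
    return "memory"


print(classify_b_cell({"CD19": True, "CD27": False, "IgD": True, "IgM": True}))
```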

The CDR sequences of the antibody can be those found in naive B cells and memory B cells of a single individual or of multiple individuals. The individual may be a mammal. The mammal can be a human, a non-human primate, a mouse, a rat, a pig, a goat, a rabbit, a horse, a cow, a cat, or a dog. In some cases, the CDR sequences are CDR sequences obtained from publicly available sources. Examples of publicly available sources of CDR sequences include SAbDab (http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/Welcome.php) and PyIgClassify (http://dunbrack2.fccc.edu/PyIgClassify/).

The germline antibody sequence may comprise a germline framework and germline CDR sequences. Each CDR in an antibody found in an antibody library may comprise at least 1, 2, 3, or 4 mutations compared to the corresponding germline CDR region. Each CDR in an antibody library may comprise no more than 4 mutations compared to the corresponding germline CDR region.

The framework of the antibody may be a naturally occurring framework. The naturally occurring framework may be a framework found in a mammal. The mammal may be a primate, mouse, rat, pig, goat, rabbit, horse, cow, cat, or dog. The primate can be a human. The framework may comprise at least one variant as compared to a naturally occurring framework. The variant may be a mutation, insertion or deletion. The variant may be a variant found in the nucleic acid sequence encoding the antibody or a variant found in the amino acid sequence of the antibody. Any suitable framework sequence may be used, such as those previously used in phase I clinical trials (fig. 12A, 12B). As used herein, the framework of an antibody may refer to the framework regions of a variable heavy chain (VH-FR1, VH-FR2, VH-FR3, and VH-FR4), the framework regions of a variable light chain (VL-FR1, VL-FR2, VL-FR3, and VL-FR4), or a combination thereof. The framework regions of the antibodies in the antibody library may be identical to the germline framework regions.

The framework may be a therapeutically optimal framework. A therapeutically optimal framework may have at least one, at least two, at least three, at least four, at least five, at least six or all of the following properties: (a) has safety previously demonstrated in human monoclonal antibodies; (b) is thermostable; (c) is not prone to aggregation; (d) contains a single dominant allele at the amino acid level throughout the human population; (e) contains different canonical topologies of CDRs; (f) expresses well in bacteria; and (g) displays well on phage. A framework with safety previously demonstrated in human monoclonal antibodies can be that of an antibody already used in at least a phase I clinical trial. A thermostable framework may be a framework that is stable at a temperature of at least 20 °C, 30 °C, 40 °C, 50 °C, 60 °C, 70 °C, 80 °C, 90 °C, 100 °C, or above 100 °C. A thermostable framework may be a framework capable of withstanding a temperature increase of at least 3 °C, 4 °C or 5 °C per minute. A framework that expresses well in bacteria can be a framework that produces biologically active antibodies in bacteria. The bacterium may be Escherichia coli (E. coli). The bacterium may be an engineered bacterium. The bacterium may be a bacterium optimized for antibody expression. A framework that displays well on phage may be one that produces a biologically active antibody when displayed on the phage surface.

An example of a strategy for selecting a framework is depicted in fig. 11, where an ideal framework for an antibody can be one that exhibits structural diversity, has been successfully used in phase I clinical trials in humans, has low immunogenicity, exhibits aggregation resistance, exhibits adaptability, and is thermostable. In some cases, an antibody framework is avoided if it has inherent autoreactivity to blood cells (e.g., IGHV4-34), has poor stability characteristics (e.g., IGHV2-5), has a V gene that is not found in at least 50% of individuals (e.g., IGHV4-b), has a V gene that is prone to aggregation (e.g., IGLV6-57), or a combination thereof.

The amino acid sequence of the antibody frameworks herein can comprise more than one dominant allele, with different dominant alleles present in different human populations (fig. 13 and 14). For example, the IGHV1-3 framework comprises 3 alleles: IGHV1-3*01, IGHV1-3*02 and IGHV1-3*03, which are found at different frequencies in different human populations (fig. 26). In some cases, the amino acid sequences of the antibody frameworks described herein have a single dominant allele in at least two human populations. In some cases, the amino acid sequences of the antibody frameworks described herein have a single dominant allele in all human populations. A framework having one dominant allele can be one in which one allele is found in at least 50%, at least 75%, or at least 90% of at least two human populations. A framework having one dominant allele can be one in which one allele is found in at least 50%, at least 75%, or at least 90% of at least twelve human populations. In some cases, the framework regions of the VH domain are those from IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15, or IGHV3-23. In some cases, the framework regions of the VH domain are those from IGHV2-5, IGHV3-7, IGHV4-34, IGHV5-51, IGHV1-24, IGHV2-26, IGHV3-72, IGHV3-74, IGHV3-9, IGHV3-30, IGHV3-33, IGHV3-53, IGHV3-66, IGHV4-30-4, IGHV4-31, IGHV4-59, or IGHV4-61. In some cases, the framework regions of the VH domains of the antibodies in the antibody library are framework regions from IGHV1-46, IGHV3-23, or a combination thereof. In some cases, the framework regions of the VL domains of the antibodies in the antibody library are framework regions from IGKV1-39, IGKV2-28, IGKV3-15, IGKV4-1, IGKV1-5, IGKV1-12, IGKV1-13, IGKV3-11, IGKV3-20, or a combination thereof.
In one example, a subset of antibodies in an antibody library may have framework regions from the VH domain of IGHV1-46 and framework regions from the VL domain of IGKV1-39, while the remaining antibodies in the antibody library have framework regions from the VH domain of IGHV1-46 and framework regions from the VL domain of IGKV2-28.
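The dominant-allele criterion described above can be sketched in code. The following is a minimal illustration, assuming allele frequencies are available per population as fractions; the function name, the data layout, and the frequency values are hypothetical placeholders and are not taken from fig. 26.

```python
def has_single_dominant_allele(freqs_by_population, threshold=0.5, min_populations=2):
    """Return True if one allele reaches `threshold` in at least `min_populations` populations.

    freqs_by_population maps a population name to {allele: frequency}.
    """
    qualifying = {}
    for allele_freqs in freqs_by_population.values():
        for allele, freq in allele_freqs.items():
            if freq >= threshold:
                qualifying[allele] = qualifying.get(allele, 0) + 1
    return any(count >= min_populations for count in qualifying.values())


# Placeholder frequencies for illustration only (not measured data).
example_freqs = {
    "population_a": {"*01": 0.8, "*02": 0.2},
    "population_b": {"*01": 0.6, "*02": 0.4},
    "population_c": {"*01": 0.3, "*02": 0.7},
}
print(has_single_dominant_allele(example_freqs, threshold=0.5, min_populations=2))
```

Raising `threshold` to 0.75 or 0.90, or `min_populations` to twelve, corresponds to the stricter variants listed in the paragraph above.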

In some cases, disclosed herein are nucleic acid sequences encoding the antibodies described herein. The nucleic acid sequence may be a DNA or RNA sequence. The nucleic acid may be inserted into a vector. The vector may be a phage. The phage may be a phagemid or a bacteriophage. The phagemid may be pMID21. The bacteriophage may be DY3F63, M13 bacteriophage, fd filamentous bacteriophage, T4 bacteriophage, T7 bacteriophage, or lambda bacteriophage. In some cases, the phagemid can be introduced into the microorganism in combination with a bacteriophage (i.e., a "helper" phage). The microorganism may be a filamentous bacterium. The filamentous bacterium may be E. coli.

The antibody libraries described herein comprise a plurality of antibodies. The plurality of antibodies can be at least 1.0 × 10^6, 1.0 × 10^7, 1.0 × 10^8, 1.0 × 10^9, 1.0 × 10^10, 2.0 × 10^10, 3.0 × 10^10, 4.0 × 10^10, 5.0 × 10^10, 6.0 × 10^10, 7.0 × 10^10, 8.0 × 10^10, 9.0 × 10^10, or 10.0 × 10^10 antibodies. The plurality of antibodies can be at least 1.0 × 10^11 antibodies. The plurality of antibodies can be at least 7.6 × 10^10 antibodies. Because such libraries are highly diverse, a substantial fraction of their antibodies can be unique. For example, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, or at least 35% of the plurality of antibodies can be unique in any library herein. In some cases, the library has more than 7.0 × 10^10 antibodies, wherein at least 20% of the plurality of antibodies are unique. The unique antibodies may differ by at least one nucleic acid or at least one amino acid residue relative to the other antibodies of the antibody library.

The total repertoire of antibodies found naturally in humans (e.g., antibodies from naive cells and memory cells), as well as antibody libraries produced by other library preparation methods, can contain highly redundant heavy chain sequences (fig. 18). If one heavy chain sequence of an antibody in the library is redundant with another heavy chain sequence of a different antibody, this can mean that the heavy chain sequences are identical. The two or more antibodies with redundant heavy chain sequences may comprise different antibody framework regions, different light chain sequences, or a combination thereof. Antibody libraries produced by the methods described herein may exhibit reduced heavy chain redundancy. Reduction of heavy chain redundancy can increase the diversity of antibody libraries. Redundancy can be measured by the percentage of the library occupied by the top-ranked clone (top clone). The antibody libraries generated herein may have a redundancy of about 2%, about 3%, about 4%, or about 5% (fig. 18). In some cases, the maximum number of heavy chains of a classical natural library is limited to 1.0 × 10^7 antibodies due to limitations of the combination of naturally occurring CDRs. In some cases, the heavy chains of the antibodies in the libraries described herein are not limited by the combination of naturally occurring CDRs and can comprise more than 1.0 × 10^11 antibodies.
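The redundancy measure described above (the percentage of the library occupied by the top clone) can be computed from a list of sequenced heavy chains. The sketch below assumes each clone is represented by its heavy chain sequence string and that identical strings are redundant; the function name and the sample sequences are hypothetical.

```python
from collections import Counter


def top_clone_redundancy(heavy_chains):
    """Percentage of the sampled library occupied by the most frequent heavy chain."""
    if not heavy_chains:
        raise ValueError("at least one sequence is required")
    counts = Counter(heavy_chains)
    top_count = counts.most_common(1)[0][1]
    return 100.0 * top_count / len(heavy_chains)


# Placeholder sample: one heavy chain seen twice among 50 reads.
sample = ["CARDYW"] * 2 + [f"SEQ{i}" for i in range(48)]
print(top_clone_redundancy(sample))
```

In practice the value would be estimated from a sequencing sample rather than from the full library, so it is only as reliable as the sampling depth.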

In some cases, the antibody library is an antibody library as described in figure 21. In some cases, the antibody library is an antibody library as described in figure 22.

Method for generating diverse antibody libraries

In certain instances, described herein are methods of making an antibody library comprising a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (a) at least one of the VH-CDR3 sequence and VL-CDR3 sequence is derived from naive B cells; (b) VH-CDR3 sequence or VL-CDR3 sequence not derived from naive B cells is derived from memory B cells; and (c) the VH-CDR1 sequence, VH-CDR2 sequence, VL-CDR1 sequence and VL-CDR2 sequence are derived from a memory cell.

In some cases, the methods described herein produce antibody libraries with a high degree of functional diversity. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 80%, 85%, 90%, 95%, or 99% of the plurality of antibodies are functional. The functional antibody may be an antibody having the ability to bind to a protein. The ability of an antibody to bind to a protein can be determined by screening the antibody against protein A or protein L. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 90% of the plurality of antibodies are functional. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 95% of the plurality of antibodies are functional. A library of antibodies with high functional diversity may comprise a plurality of antibodies, wherein at least 99% of the plurality of antibodies are functional.

The antibody library may comprise at least 1.0 × 10^5, 2.0 × 10^5, 3.0 × 10^5, 4.0 × 10^5, 5.0 × 10^5, 6.0 × 10^5, 7.0 × 10^5, 8.0 × 10^5, 9.0 × 10^5, 1.0 × 10^10, 2.0 × 10^10, 3.0 × 10^10, 4.0 × 10^10, 5.0 × 10^10, 6.0 × 10^10, 7.0 × 10^10, 8.0 × 10^10, or 9.0 × 10^10 antibodies. The antibody library may comprise at least 1.0 × 10^5 antibodies. The antibody library may comprise at least 7.0 × 10^10 antibodies. The antibody library may comprise at least 7.1 × 10^10, 7.2 × 10^10, 7.3 × 10^10, 7.4 × 10^10, 7.5 × 10^10, 7.6 × 10^10, 7.7 × 10^10, 7.8 × 10^10, or 7.9 × 10^10 antibodies. The antibody library may comprise at least 7.6 × 10^10 antibodies.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional.

Methods for preparing antibody libraries can include:

(a) obtaining sequence information for a plurality of VH-CDR3 and VL-CDR3 sequences from a naive B cell pool and sequence information for a plurality of VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences from a memory B cell pool;

(b) assembling a plurality of Variable Light (VL) domain sequences, each VL domain sequence comprising: a VL-CDR1 sequence obtained from the sequence information determined in step a from memory B cells, a VL-CDR2 sequence obtained from the sequence information determined in step a from memory B cells, and a VL-CDR3 sequence obtained from the sequence information determined in step a from memory B cells or naive B cells;

(c) assembling a plurality of first nucleic acid sequences encoding a plurality of first antibodies, each first antibody comprising: (i) a Variable Light (VL) domain sequence assembled in step b; and (ii) a single fixed heavy chain sequence;

(d) inserting the plurality of first nucleic acid sequences into a plurality of bacteriophages;

(e) expressing the plurality of first antibodies on the surface of the plurality of bacteriophages;

(f) applying at least one selection pressure to the plurality of bacteriophages to generate a subset of bacteriophages comprising a subset of the first nucleic acid sequences;

(g) assembling a plurality of Variable Heavy (VH) domain sequences, each VH domain sequence comprising: a VH-CDR1 sequence obtained from the sequence information determined in step a from memory B cells, a VH-CDR2 sequence obtained from the sequence information determined in step a from memory B cells, and a VH-CDR3 sequence obtained from the sequence information determined in step a from memory B cells or naive B cells, wherein at least one of the VH-CDR3 sequence and VL-CDR3 sequence is derived from the sequence information from naive B cells;

(h) replacing the single fixed heavy chain sequence in the subset of first nucleic acid sequences with the plurality of VH domain sequences assembled in step g to generate a plurality of second nucleic acid sequences, each second nucleic acid sequence comprising: (i) a Variable Light (VL) domain sequence assembled in step b, and (ii) a Variable Heavy (VH) domain sequence assembled in step g, wherein the plurality of second nucleic acid sequences encode a plurality of second antibodies; and

(i) transforming a plurality of microorganisms with the plurality of bacteriophages to produce a plurality of transformants.
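The combinatorial logic of the assembly steps (b), (g) and (h) can be illustrated in silico. The sketch below is only a bookkeeping model under stated assumptions: it ignores the wet-lab steps (phage display, selection pressure, transformation), and the function names, dictionary keys, and placeholder CDR labels are hypothetical.

```python
from itertools import product


def assemble_domains(cdr1_memory, cdr2_memory, cdr3_memory, cdr3_naive):
    """Combine memory-derived CDR1/CDR2 with a CDR3 from either the memory or naive pool."""
    cdr3_pool = [(seq, "memory") for seq in cdr3_memory]
    cdr3_pool += [(seq, "naive") for seq in cdr3_naive]
    return [
        {"cdr1": c1, "cdr2": c2, "cdr3": c3, "cdr3_source": source}
        for c1, c2, (c3, source) in product(cdr1_memory, cdr2_memory, cdr3_pool)
    ]


def pair_domains(vl_domains, vh_domains):
    """Keep VL/VH pairs in which at least one of the two CDR3 sequences is naive-derived."""
    return [
        (vl, vh)
        for vl, vh in product(vl_domains, vh_domains)
        if "naive" in (vl["cdr3_source"], vh["cdr3_source"])
    ]


# Placeholder labels stand in for real CDR sequences.
vl = assemble_domains(["L1a"], ["L2a"], ["L3mem"], ["L3naive"])
vh = assemble_domains(["H1a"], ["H2a"], ["H3mem"], ["H3naive"])
print(len(pair_domains(vl, vh)))
```

The filter in `pair_domains` encodes the requirement that at least one of the VH-CDR3 and VL-CDR3 sequences is derived from naive B cells; pairs with two memory-derived CDR3 sequences are dropped.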

The method can include obtaining samples comprising naive B cells and memory B cells from a plurality of individuals. The sample may be blood, plasma or serum. A peripheral sample of human blood may contain hundreds of thousands of memory clones and plasmablasts, with fewer than 10,000 of them dominating the pool in a sample (fig. 27). The plurality of individuals may be a plurality of mammals. The plurality of mammals may be a plurality of primates, mice, rats, pigs, goats, rabbits, horses, cows, cats, or dogs. The plurality of mammals may be a plurality of humans. The plurality of individuals can be at least 25, 50, 75, 100, 125 or 150 individuals. The plurality of individuals may be 50-100 individuals. The plurality of individuals may be 50-140 individuals. The plurality of individuals may be at least 50 individuals. The plurality of individuals may be at least 140 individuals. A sample comprising naive B cells and memory B cells from an individual can comprise at least about 5 × 10^7 naive B cell clones and 5 × 10^5 memory B cell clones.

The method may include sorting or separating the naive B cells from the memory B cells in the sample prior to obtaining the sequence information. The memory B cells may be CD27+ B cells. Thus, the sequence information may include separate sequence information for the naive B cells and the memory B cells. Sorting the naive B cells and the memory B cells can comprise using flow cytometry. In some cases, the flow cytometry is Fluorescence Activated Cell Sorting (FACS). Sorting naive B cells and memory B cells can include an immunomagnetic cell separation procedure based on markers present on the surface of naive B cells or memory B cells. Sorting naive B cells and memory B cells from the sample can produce a naive B cell pool and a memory B cell pool. Sorting naive B cells and memory B cells from a sample can produce a plurality of naive B cell pools and a plurality of memory B cell pools. The naive B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of naive B cell origin. A naive B cell pool comprising less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of naive B cell origin may also be referred to herein as a pool of predominantly naive B cell origin. The memory B cell pool can comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of memory B cell origin. A memory B cell pool comprising less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of memory B cell origin may also be referred to herein as a pool of predominantly memory B cell origin.
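The purity thresholds above can be expressed as a simple check on a sorted pool. This is a minimal sketch, assuming each cell in the pool has already been labeled by origin (for example from a post-sort purity measurement); the function name and the labels are hypothetical.

```python
def is_predominantly(pool_labels, target, max_contaminant_fraction=0.10):
    """True if fewer than `max_contaminant_fraction` of the cells are not of `target` origin."""
    if not pool_labels:
        raise ValueError("pool must contain at least one cell")
    contaminants = sum(1 for label in pool_labels if label != target)
    return contaminants / len(pool_labels) < max_contaminant_fraction


# Placeholder pool: 95 naive cells and 5 memory cells.
pool = ["naive"] * 95 + ["memory"] * 5
print(is_predominantly(pool, "naive", max_contaminant_fraction=0.10))
```

The `max_contaminant_fraction` argument can be set to 0.001, 0.01, 0.05, 0.10, 0.20, or 0.30 to mirror the thresholds listed in the text.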

In some cases, the quality of the memory B cell pool or the naive B cell pool is checked using Next Generation Sequencing (NGS). Pools with problematic diversity or biochemical liabilities may be discarded. Examples of biochemical liabilities include, but are not limited to, N-linked glycosylation, deamidation, acid hydrolysis, positively charged intein cleavage, free cysteine, free methionine, alternative stop codons, cryptic splice sites, TEV cleavage sites, and overly positively charged CDRs. In some cases, sequence data from at least one individual is removed from the pool. If the sequence data has problematic diversity or biochemical liabilities, the sequence data from the individual can be removed from the pool.
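Some of the sequence-level checks named above lend themselves to simple pattern matching on CDR amino acid sequences. The sketch below is an illustration only: the motif definitions are assumptions (the N-X-S/T sequon with X not proline for N-linked glycosylation, any cysteine or methionine residue treated as "free", and `*` as a stop-codon marker), the source does not specify how these properties are detected, and the remaining listed properties are not covered.

```python
import re

# Assumed motif definitions; not specified in the source text.
LIABILITY_PATTERNS = {
    "n_linked_glycosylation": re.compile(r"N[^P][ST]"),  # N-X-S/T sequon, X != P
    "cysteine": re.compile(r"C"),
    "methionine": re.compile(r"M"),
    "stop_codon": re.compile(r"\*"),
}


def find_liabilities(cdr_sequence):
    """Return the names of all assumed liability motifs present in a CDR sequence."""
    return [
        name
        for name, pattern in LIABILITY_PATTERNS.items()
        if pattern.search(cdr_sequence)
    ]


def filter_clean(cdr_sequences):
    """Keep only CDR sequences with none of the assumed liability motifs."""
    return [seq for seq in cdr_sequences if not find_liabilities(seq)]


# Placeholder sequences for illustration.
print(find_liabilities("ARNGSY"))
print(filter_clean(["GFTFSSYA", "ARNGSY", "ARCDY"]))
```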

The method can include extracting nucleic acid from the naive B cells and extracting nucleic acid from the memory B cells. After sorting or isolating the naive cells and memory cells from the sample, nucleic acids can be extracted from each of the naive cells and memory cells. The nucleic acid may be DNA or messenger RNA (mRNA). If the nucleic acid is mRNA, the method may further comprise reverse transcribing the mRNA into complementary DNA (cDNA).

Methods of making antibody libraries can include obtaining sequence information for a plurality of VH-CDR3 and VL-CDR3 sequences from naive B cells from a plurality of individuals and obtaining sequence information for a plurality of VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences from memory B cells from a plurality of individuals. Sequence information from the naive B cells can be obtained from the naive B cell pool. Sequence information from memory B cells can be obtained from a pool of memory B cells. Obtaining sequence information for the CDR sequences can include sequencing the CDR sequences. Sequencing the plurality of CDR sequences can include any suitable sequencing technique, such as Next Generation Sequencing (NGS) or Sanger sequencing. Examples of next generation sequencing include, but are not limited to, pyrosequencing, sequencing by synthesis, sequencing by ligation, and single molecule sequencing. Sequencing multiple CDR sequences can yield sequence information.

Sequencing the plurality of VH-CDR and VL-CDR sequences can include sequencing nucleic acids extracted from naive B cells and nucleic acids extracted from memory B cells from the sample, respectively.

Assembling or synthesizing the VH sequence or VL sequence may comprise using overlap extension PCR (OE-PCR). In some cases, overlapping fragments are generated that comprise a portion of a CDR of a VH domain or a portion of a CDR of a VL domain. Multiple overlapping fragments comprising a portion of a CDR of a VH domain or a CDR of a VL domain may cover the entirety of a CDR of a VH domain sequence or a CDR of a VL domain sequence. The overlapping fragments may be dsDNA fragments. OE-PCR can involve the assembly of overlapping fragments to produce the entire CDRs of the VH domain or the entire CDRs of the VL domain. The CDRs of the VH domain may be VH-CDR1, VH-CDR2, VH-CDR3 or a combination thereof. The CDRs of the VL domain can be VL-CDR1, VL-CDR2, VL-CDR3 or a combination thereof. The VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2, VL-CDR3 or combinations thereof can be synthesized to include at least one of the following properties: (a) is a CDR sequence derived from a human germline sequence; (b) contains no more than 4 amino acid mutations compared to the germline sequence; (c) is (i) identified as naturally occurring in at least 2 individuals and enriched without an adaptive disadvantage, or (ii) enriched in bulk during panning; (d) does not contain any biochemical liability; or (e) a combination thereof. The germline sequence may be IGHJ4, IGHV1-69, IGHV1-46, IGHV3-23, IGKV1-39, IGKV2-28, IGKV3-15, or IGKV4-1, or a combination thereof.
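Two of the listed properties, the limit of 4 amino acid mutations relative to germline and occurrence in at least 2 individuals, can be checked computationally. The sketch below applies both together, which is one possible combination under property (e). It assumes the CDR and its germline counterpart are already aligned and of equal length, so that mutations can be counted position by position; the function names and sequences are hypothetical placeholders.

```python
def count_mutations(cdr, germline):
    """Count amino acid differences between two aligned, equal-length sequences."""
    if len(cdr) != len(germline):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(cdr, germline))


def passes_germline_filter(cdr, germline, individuals_seen,
                           max_mutations=4, min_individuals=2):
    """True if the CDR is close to germline and was observed in enough distinct individuals."""
    close_to_germline = count_mutations(cdr, germline) <= max_mutations
    recurrent = len(set(individuals_seen)) >= min_individuals
    return close_to_germline and recurrent


# Placeholder germline CDR and variant with one substitution, seen in two donors.
print(passes_germline_filter("GFTFSDYA", "GFTFSSYA", ["donor_1", "donor_2"]))
```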

VH sequences, VL sequences assembled using OE-PCR, or a combination thereof, may be cloned into a vector. The vector may be a bacteriophage. The phage may be a bacteriophage or a phagemid. The vector may be a HuCAL phage. The vector may further comprise a gene encoding a surface coat protein. The surface coat protein may be a pIII, pVIII, pVI, pVII, pIX or gIII protein. The surface coat protein may be a gIII protein. In some cases, expression of the antibody encoded by the vector comprises expression of the antibody fused to a surface coat protein of the vector. The expressed antibody may be displayed on the surface of a vector.

The method of making an antibody library can comprise assembling a plurality of VL domain sequences, each VL domain sequence of the plurality of VL domain sequences comprising: a VL-CDR1 sequence derived from sequence information from memory B cells, a VL-CDR2 sequence derived from sequence information from memory B cells, and a VL-CDR3 sequence derived from sequence information from memory B cells or naive B cells. The VL domain sequence may be cloned into the vector in combination with a single fixed heavy chain sequence. The single fixed heavy chain sequence may be IGHV3-23 or IGHJ4. A single fixed heavy chain sequence may be referred to as a stuffer sequence.

A method of making an antibody library can comprise assembling a plurality of first nucleic acid sequences encoding a plurality of first antibodies, each first antibody of the plurality of first antibodies comprising: a Variable Light (VL) domain sequence; and a single fixed heavy chain sequence. The method of making an antibody library can comprise inserting a plurality of first nucleic acid sequences into a plurality of vectors. The vector may be a bacteriophage. Antibodies encoded by vectors comprising an assembled VL domain and a single fixed heavy chain sequence can be expressed in the vector. The antibody may be expressed on the surface of a vector.

The method of making an antibody library can comprise applying at least one selection pressure to a plurality of vectors, wherein each vector of the plurality of vectors expresses an antibody. After the selection pressure is applied, a subset of phage capable of withstanding the selection pressure can be generated. The selection pressure may be the application of thermal stress, selection with protein A, selection with protein L, or a combination thereof. The thermal stress may be a temperature of about 65 ℃. The thermal stress may be a temperature of at least 30 ℃, 40 ℃, 50 ℃, 60 ℃, 70 ℃ or 80 ℃. In some cases, applying thermal stress to the vector excludes the vector if it is unstable or prone to aggregation. In some cases, applying a selection with protein A or protein L excludes a vector if the vector expresses an antibody that does not have the ability to bind to the protein. Applying a selection with protein A or protein L may allow for the selection of antibodies that bind to the protein. The antibody bound to the protein may be thermostable. In some cases, after selecting an antibody that binds to a protein, the nucleotide sequence corresponding to the antibody that binds to the protein can be determined.

Methods of making antibody libraries can include assembling a plurality of VH domain sequences, wherein each VH domain sequence comprises: VH-CDR1 sequence derived from sequence information from memory B cells, VH-CDR2 sequence derived from sequence information from memory B cells, and VH-CDR3 sequence derived from sequence information from memory B cells or naive B cells. The VH-CDR1 and VH-CDR2 sequences may be sequences obtained from memory B cells, while the VH-CDR3 sequences may be sequences from naive B cells. If the vector is able to successfully withstand the applied selection pressure, the assembled VH domain sequences may be substituted for the single fixed heavy chain sequences. For each of the plurality of antibodies, at least one of the VH-CDR3 sequence and VL-CDR3 sequence can be derived from sequence information from naive B cells.

Vectors comprising an assembled VL domain and an assembled VH domain may be transformed into a microorganism. The microorganism may be a bacterium. The bacteria may be filamentous bacteria. The filamentous bacterium may be E. coli. The microorganism may be any suitable commercially available strain.

The vector can be transformed into the microorganism using electroporation, chemical transformation, heat shock transformation, or a combination thereof.

Electroporation may comprise applying a high voltage electric field to a ligation mixture comprising the microorganism to be transformed and the vector. The high voltage may range from 1 to 25 kV/cm. The high voltage may range from 3 to 24 kV/cm. Examples of high voltages that may be applied to the microorganisms to induce transformation include, but are not limited to, 10 kV/cm, 15 kV/cm, 20 kV/cm, and 25 kV/cm. The high voltage may be applied as one pulse or multiple pulses. The plurality of pulses may be high voltage pulses applied every 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 500, or 1000 microseconds (μs). The plurality of pulses may be high voltage pulses applied every 10, 20, 30, 50, 60, 70, 80, 90, or 100 milliseconds (msec). Electroporation can be applied to the microorganisms at room temperature or 4 ℃.
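Because the values above are given in kV/cm, they describe a field strength rather than a voltage. As a rough illustration, assuming a parallel-plate cuvette with a uniform field (an assumption not stated in the source), the field strength is the applied voltage divided by the electrode gap. The function names and the example voltage and gap are hypothetical.

```python
def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Uniform-field approximation: applied voltage divided by electrode gap."""
    if gap_cm <= 0:
        raise ValueError("electrode gap must be positive")
    return voltage_kv / gap_cm


def within_range(strength, low=1.0, high=25.0):
    """Check a field strength against the 1 to 25 kV/cm range given in the text."""
    return low <= strength <= high


# Hypothetical setting: 2.0 kV applied across a 0.2 cm gap.
strength = field_strength_kv_per_cm(2.0, 0.2)
print(strength, within_range(strength))
```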

The vector may be purified and resuspended in water or TE prior to addition to the ligation mix. The ligation mixture may comprise a buffer. Examples of buffers include, but are not limited to, Phosphate Buffered Saline (PBS), Hepes Buffer (HBSS), or culture media. The buffer may be a hypotonic buffer. The buffer may be a high resistance buffer. In some cases, after electroporation, recovery medium is added to the ligation mixture.

The chemical transformation may comprise incubating the microorganism and the vector with a cation. The cation may be Mg2+, Mn2+, Rb+, or Ca2+. The chemical transformation may comprise incubating the microorganism and the vector together with CaCl2, MgCl2, MnCl2, or RbCl.

Heat shock transformation may include subjecting the microorganism and vector to an elevated temperature to induce transformation. The elevated temperature may be 42 ℃. The temperature may be applied for 10 seconds, 20 seconds, 30 seconds, 40 seconds, 50 seconds, or 1 minute. Heat shock may be applied before, during or after electroporation or chemical transformation.

The vector may comprise a selectable marker. The selectable marker may be an antibiotic resistance gene or an optical selectable marker, such as green fluorescent protein. The antibiotic resistance gene can confer resistance to an antibiotic selected from the group consisting of: kanamycin, spectinomycin, streptomycin, ampicillin, carbenicillin, bleomycin, erythromycin, polymyxin B, tetracycline, chloramphenicol, and combinations thereof. The selectable marker may allow exclusion of microorganisms that are not transformed by the vector.

The microorganism transformed with the vector may be referred to as a transformant. Production of multiple antibodies may include production of multiple transformants. In some cases, a plurality of transformants comprises at least 7.6 × 10^10 transformants. The at least 7.6 × 10^10 transformants may contain 0.5 × 10^10 individual VH-CDR3 sequences. In some cases, at least 20% of the plurality of transformants are unique. For example, if the plurality of transformants includes 7.6 × 10^10 transformants, then in some embodiments, at least 1.52 × 10^10 transformants are unique. The unique transformant may be a transformant having a unique or non-redundant sequence compared to other transformants in the plurality of transformants. In some cases, antibody libraries described herein can be screened to obtain antibodies specific for any desired target. Examples of targets include, but are not limited to, PD1, LAG3, OX40, CTLA4, SIRPA, CD47, VISTA, 41BB, TIM3, GITR, ICOS, TIGIT, GHR, HGH, amyloid beta, alpha synuclein, Tau, and beta secretase. The length of time to develop the antibody libraries described herein can be less than 2 months, less than 1 month, or less than 2 weeks. In one example, the development of an antibody library can take less than 2 months, including the time involved in panning, screening, and optimization (fig. 2). In another example, the length of time may be less than 2 weeks (fig. 2).
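The unique-transformant figure in the paragraph above follows from simple arithmetic: 20% of 7.6 × 10^10 is 1.52 × 10^10. A minimal sketch using integer arithmetic to avoid floating-point rounding (the function name is hypothetical):

```python
def min_unique_transformants(total_transformants, unique_percent):
    """Minimum number of unique transformants for a given whole-number percentage."""
    return total_transformants * unique_percent // 100


total = 76_000_000_000  # 7.6 x 10^10 transformants
print(min_unique_transformants(total, 20))  # 15200000000, i.e. 1.52 x 10^10
```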

Antibody optimization and resulting library (Tumbler library)

In some aspects, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) one CDR sequence, selected from the group consisting of a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence, is the same in each of the plurality of antibodies; and (d) the remaining CDR sequences, selected from the VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, and VL-CDR3 sequence, are present in unique combinations. Such an antibody library may also be referred to herein as a Tumbler library.

In one example, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VH-CDR1 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VH-CDR1 sequence is derived from an initial antibody clone.

In another example, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VH-CDR2 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VH-CDR2 sequence is derived from an initial antibody clone.

In another example, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VH-CDR3 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VH-CDR3 sequence is derived from an initial antibody clone.

In another example, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VL-CDR1 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VL-CDR1 sequence is derived from an initial antibody clone.

In another example, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VL-CDR2 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VL-CDR2 sequence is derived from an initial antibody clone.

In another example, an antibody library described herein can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VL-CDR3 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, and a VL-CDR2 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VL-CDR3 sequence is derived from an initial antibody clone.

In various aspects, each antibody in the antibody library comprises a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, or a VL-CDR3 sequence derived from an initial antibody clone. As used herein, the term "initial antibody clone" may refer to any antibody or antibody fragment having a desired property, such as affinity for a desired epitope, an amino acid sequence of said antibody or antibody fragment, a nucleotide sequence encoding said antibody or antibody fragment, or any in silico amino acid or nucleotide sequence corresponding to said antibody or antibody fragment. In various aspects, each antibody in the antibody library can comprise the same CDR sequence derived from the initial antibody clone. In addition, each antibody in the antibody library may comprise a different combination of the remaining CDR sequences that are not derived from the initial antibody clone. In some cases, the remaining CDR sequences can be derived from a highly diverse antibody library (e.g., a SuperHuman antibody library as described herein). In some cases, one of the CDRs may be derived from an initial antibody clone, while the remaining CDRs may be derived from a highly diverse antibody library (e.g., a SuperHuman antibody library). In some cases, highly diverse antibody libraries may have a high diversity in each CDR sequence not derived from the initial antibody clone. In a non-limiting example, as depicted in fig. 29, each antibody in the antibody library may comprise the same VH-CDR3 sequence derived from the initial antibody clone ("initial clone" in fig. 29). In addition, each antibody in the antibody library may comprise a different combination of the remaining CDR sequences (in this example VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3) that were not derived from the initial antibody clone.
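The combinatorial structure of such a library, one CDR held fixed and the other five varied, can be sketched as follows. This is an in silico illustration under stated assumptions, not the construction method itself: the function name, the dictionary layout, and the placeholder CDR labels are hypothetical, and a real library would be far too large to enumerate this way.

```python
from itertools import product

CDR_NAMES = ("VH-CDR1", "VH-CDR2", "VH-CDR3", "VL-CDR1", "VL-CDR2", "VL-CDR3")


def tumbler_library(initial_clone, fixed_cdr, diversity_pools):
    """Hold `fixed_cdr` from the initial clone constant and vary the other five CDRs."""
    varied = [name for name in CDR_NAMES if name != fixed_cdr]
    library = []
    for combo in product(*(diversity_pools[name] for name in varied)):
        antibody = dict(zip(varied, combo))
        antibody[fixed_cdr] = initial_clone[fixed_cdr]
        library.append(antibody)
    return library


# Placeholder labels stand in for real CDR sequences.
clone = {name: f"{name}-clone" for name in CDR_NAMES}
pools = {name: [f"{name}-a", f"{name}-b"] for name in CDR_NAMES}
library = tumbler_library(clone, "VH-CDR3", pools)
print(len(library))  # 2**5 = 32 combinations of the five varied CDRs
```

With two candidates for each of the five varied positions, the example enumerates 32 antibodies, each carrying the same VH-CDR3 from the initial clone.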

In various aspects, one or more of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences are naturally occurring. In various aspects, each of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences are naturally occurring, but are present in each antibody in non-naturally occurring combinations. In various aspects, one or more of the VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences, and VL-CDR3 sequences are naturally found in a human population or derived from human CDR sequences. In various aspects, each of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences are naturally occurring in a human population or derived from human CDR sequences.

In various aspects, the antibodies of the library can comprise non-naturally occurring combinations of naturally occurring CDRs, such as CDRs derived from memory B cells and naive B cells that each occur naturally but do not naturally co-occur on the same antibody. For example, a non-naturally occurring combination of naturally occurring CDRs can comprise at least one CDR derived from a naive cell, while the remaining CDRs can be derived from memory cells. As a further example, a non-naturally occurring combination of naturally occurring CDRs may comprise at least one CDR derived from a pool of cells that is primarily of naive B cell origin, while the remaining CDRs may be derived from a pool of cells that is primarily of memory B cell origin.

A non-naturally occurring combination of naturally occurring CDRs can comprise at least one CDR derived from a naive cell, while the remaining CDRs are derived from a memory cell. In some cases, at least the VL-CDR1 is derived from the naive cell. In some cases, at least the VL-CDR2 is derived from the naive cell. In some cases, at least the VL-CDR3 is derived from the naive cell. In some cases, at least the VH-CDR1 is derived from the naive cell. In some cases, at least the VH-CDR2 is derived from the naive cell. In some cases, at least the VH-CDR3 is derived from the naive cell.

Non-naturally occurring combinations of naturally occurring CDRs can comprise two, three, four or five CDRs derived from naive cells, while the remaining CDRs can be derived from memory cells. For example, two CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, three CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, four CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, five CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDR may be derived from a memory cell.

In another non-limiting example of a non-naturally occurring combination, the VL-CDR3 can be derived from a naive cell, while the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, and VL-CDR2 can be derived from a memory cell. In another non-limiting example of a non-naturally occurring combination, the VH-CDR3 can be derived from a naive cell, while the VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2, and VL-CDR3 can be derived from a memory cell. In another non-limiting example of a non-naturally occurring combination, the VH-CDR3 and VL-CDR3 can be derived from naive cells, while the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 can be derived from memory cells.
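The naive/memory assignments described above can be enumerated mechanically. The following is a minimal illustrative sketch, not part of the disclosed methods; the function name `origin_assignments` and the string labels are hypothetical and chosen only for this example.

```python
from itertools import combinations

# The six CDR positions named in the text.
CDRS = ("VH-CDR1", "VH-CDR2", "VH-CDR3", "VL-CDR1", "VL-CDR2", "VL-CDR3")


def origin_assignments(n_naive):
    """Yield every assignment in which exactly n_naive CDRs are naive-derived
    and all remaining CDRs are memory-derived (hypothetical helper)."""
    for naive in combinations(CDRS, n_naive):
        yield {cdr: ("naive" if cdr in naive else "memory") for cdr in CDRS}


# Number of distinct assignments for one to five naive-derived CDRs.
counts = {k: sum(1 for _ in origin_assignments(k)) for k in range(1, 6)}
print(counts)
```

The example in which only the VH-CDR3 and VL-CDR3 are naive-derived corresponds to exactly one of the assignments produced by `origin_assignments(2)`.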

The VH-CDR1 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR1 sequence from naive B cells. The VH-CDR1 sequence derived from naive B cells can be a synthetic VH-CDR1 sequence. The VH-CDR1 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR1 sequence from naive B cells. The VH-CDR2 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR2 sequence from naive B cells. The VH-CDR2 sequence derived from naive B cells can be a synthetic VH-CDR2 sequence. The VH-CDR2 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR2 sequence from naive B cells. The VH-CDR3 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR3 sequence from naive B cells. The VH-CDR3 sequence derived from naive B cells can be a synthetic VH-CDR3 sequence. The VH-CDR3 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR3 sequence from naive B cells.

The VL-CDR1 sequence derived from the naive B cell can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR1 sequence from the naive B cell. The VL-CDR1 sequence derived from the naive B cell can be a synthetic VL-CDR1 sequence. The VL-CDR1 sequence derived from the naive B cell can comprise 100% sequence homology to the naturally occurring VL-CDR1 sequence from the naive B cell. The VL-CDR2 sequence derived from the naive B cell can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR2 sequence from the naive B cell. The VL-CDR2 sequence derived from the naive B cell can be a synthetic VL-CDR2 sequence. The VL-CDR2 sequence derived from the naive B cell can comprise 100% sequence homology to the naturally occurring VL-CDR2 sequence from the naive B cell. The VL-CDR3 sequence derived from the naive B cell can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR3 sequence from the naive B cell. The VL-CDR3 sequence derived from the naive B cell can be a synthetic VL-CDR3 sequence. The VL-CDR3 sequence derived from the naive B cell can comprise 100% sequence homology to the naturally occurring VL-CDR3 sequence from the naive B cell.

The VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, VL-CDR3 sequence, or any combination thereof, can be derived from sequence information obtained from a pool of cells that are primarily of naive B cell origin. The VH-CDR3 sequence, VL-CDR3 sequence, or a combination thereof can be derived from sequence information obtained from a pool of cells that are primarily of naive B cell origin. The naive B cell pool can be obtained from multiple individuals. The naive B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of naive B cell origin.

The VH-CDR1 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR1 sequence from memory B cells. The VH-CDR1 sequence derived from memory B cells can be a synthetic VH-CDR1 sequence. The VH-CDR1 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR1 sequence from memory B cells. The VH-CDR2 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR2 sequence from memory B cells. The VH-CDR2 sequence derived from memory B cells can be a synthetic VH-CDR2 sequence. The VH-CDR2 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR2 sequence from memory B cells. The VH-CDR3 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR3 sequence from memory B cells. The VH-CDR3 sequence derived from memory B cells can be a synthetic VH-CDR3 sequence. The VH-CDR3 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR3 sequence from memory B cells.

The memory B cell-derived VL-CDR1 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR1 sequence from a memory B cell. The memory B cell-derived VL-CDR1 sequence can be a synthetic VL-CDR1 sequence. The memory B cell-derived VL-CDR1 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR1 sequence from a memory B cell. The memory B cell-derived VL-CDR2 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR2 sequence from a memory B cell. The memory B cell-derived VL-CDR2 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR2 sequence from a memory B cell. The memory B cell-derived VL-CDR3 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR3 sequence from a memory B cell. The memory B cell-derived VL-CDR3 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR3 sequence from a memory B cell.
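The percentage thresholds above can be illustrated with a simple calculation. The sketch below assumes, for illustration only, that sequence homology is scored as position-by-position identity between two CDR sequences of equal length; the text does not specify a scoring method, and sequences of different lengths would require an alignment step that is not shown. The function names and the example sequences are hypothetical.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if not candidate or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate: str, reference: str, threshold: float = 80.0) -> bool:
    """True if the candidate reaches the given percent identity to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical 10-residue CDR sequences differing at one position.
reference_cdr = "ARDYYGSSYW"
candidate_cdr = "ARDYYGSSFW"
print(percent_identity(candidate_cdr, reference_cdr))
print(meets_threshold(candidate_cdr, reference_cdr, threshold=95.0))
```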

The VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, VL-CDR3 sequence, or any combination thereof, can be derived from sequence information obtained from a pool of cells that are primarily of memory B cell origin. The VH-CDR3 sequence, VL-CDR3 sequence, or combinations thereof can be derived from sequence information obtained from a pool of cells that are primarily of memory B cell origin. A pool of memory B cells can be obtained from multiple individuals. The memory B cell pool can comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not the source of memory B cells. The memory B cells may be CD27+ B cells. The memory B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20% or 30% of cells that are not CD27+ B cell derived.

The naive cell may be a naive B cell. The naive B cell can be a human naive B cell. The memory cell may be a memory B cell. The memory B cell can be a human memory B cell. In some cases, naive B cells show increased diversity of VH-CDR3 and VL-CDR3 sequences compared to VH-CDR3 and VL-CDR3 sequences from memory B cells (fig. 1). The naive cells and memory cells can be obtained from a biological sample, such as blood, from an individual or multiple individuals. The naive and memory cells can be physically separated from the sample using a marker specific for naive or memory cells.

Markers can be used to identify, isolate or sort B cells, naive B cells and memory B cells from a biological sample. Examples of markers for identifying, isolating or sorting B cells include, but are not limited to, CD19+. Examples of markers for identifying, isolating or sorting naive B cells include, but are not limited to, CD19+, CD27-, IgD+, IgM+, and combinations thereof. Examples of markers for identifying, isolating or sorting memory B cells include, but are not limited to, CD19+, CD27+, and combinations thereof. In some embodiments, CD27+ is used to sort memory B cells. Examples of markers for identifying, isolating or sorting class-switched memory B cells include, but are not limited to, CD19+, CD27+, IgD-, IgM-, and combinations thereof. Examples of markers for identifying, isolating or sorting non-switched or marginal zone memory B cells include, but are not limited to, CD19+, CD27+, IgD+, IgM+, and combinations thereof. In some cases, memory B cells can be identified, isolated, or sorted using the following markers: CD19+, CD27+, IgD-, IgM+, and combinations thereof. The naive cell from which the VH-CDR3 is derived may be a CD27-/IgM+ B cell. The memory cells from which the VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3 are derived may be CD27+/IgG+ B cells.
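The marker combinations listed above can be read as a simple decision rule. The following sketch is illustrative only: it assumes that each marker has already been called positive or negative for a cell, and the function name `classify_b_cell`, the dictionary layout, and the returned labels are hypothetical. Real sorting depends on gating thresholds that are not described here.

```python
def classify_b_cell(markers: dict) -> str:
    """Classify a cell from boolean marker calls (True = positive).

    Expected keys: "CD19", "CD27", "IgD", "IgM". Missing keys count as negative.
    """
    cd19 = markers.get("CD19", False)
    cd27 = markers.get("CD27", False)
    igd = markers.get("IgD", False)
    igm = markers.get("IgM", False)

    if not cd19:
        return "not a B cell"
    if not cd27:
        # CD19+, CD27-, IgD+, IgM+ is the naive profile listed in the text.
        return "naive B cell" if (igd and igm) else "unclassified B cell"
    # All CD19+, CD27+ cells are treated as memory B cells; subtypes follow.
    if not igd and not igm:
        return "class-switched memory B cell"
    if igd and igm:
        return "non-switched or marginal zone memory B cell"
    return "memory B cell"


print(classify_b_cell({"CD19": True, "CD27": False, "IgD": True, "IgM": True}))
print(classify_b_cell({"CD19": True, "CD27": True, "IgD": False, "IgM": False}))
```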

The CDR sequences of the antibody can be those found in naive B cells and memory B cells of a single individual or of multiple individuals. The individual may be a mammal. The mammal can be a human, a non-human primate, a mouse, a rat, a pig, a goat, a rabbit, a horse, a cow, a cat, or a dog. In some cases, the CDR sequences are CDR sequences obtained from publicly available sources. Examples of publicly available sources of CDR sequences include SAbDab (http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/Welcome.php) and PyIgClassify (http://dunbrack2.fccc.edu/PyIgClassify/).

In various aspects, each antibody in the antibody library can comprise the same scaffold, e.g., the same combination of framework sequences (see fig. 30). In various aspects, the VH domain of each antibody in the antibody library comprises a VH-FR1 sequence, a VH-FR2 sequence, a VH-FR3 sequence, and a VH-FR4 sequence. In some cases, each of the VH-FR1 sequence, VH-FR2 sequence, VH-FR3 sequence, and VH-FR4 sequence is identical in every antibody in the antibody library. In some cases, each of the VH-FR1 sequence, VH-FR2 sequence, VH-FR3 sequence, and VH-FR4 sequence is derived from the initial antibody clone from which the CDR sequence is derived (see fig. 30). In some cases, the VH-FR1 sequence may be the same VH-FR1 sequence as that of the initial antibody clone. In some cases, the VH-FR2 sequence may be the same VH-FR2 sequence as that of the initial antibody clone. In some cases, the VH-FR3 sequence may be the same VH-FR3 sequence as that of the initial antibody clone. In some cases, the VH-FR4 sequence may be the same VH-FR4 sequence as that of the initial antibody clone. In various aspects, the VL domain of each antibody in the antibody library comprises a VL-FR1 sequence, a VL-FR2 sequence, a VL-FR3 sequence, and a VL-FR4 sequence. In some cases, each of the VL-FR1 sequence, VL-FR2 sequence, VL-FR3 sequence and VL-FR4 sequence is identical in every antibody in the antibody library. In some cases, each of the VL-FR1 sequence, VL-FR2 sequence, VL-FR3 sequence, and VL-FR4 sequence can be derived from the initial antibody clone from which the CDR sequence is derived (see fig. 30). In some cases, the VL-FR1 sequence may be the same VL-FR1 sequence as that of the initial antibody clone. In some cases, the VL-FR2 sequence may be the same VL-FR2 sequence as that of the initial antibody clone. In some cases, the VL-FR3 sequence may be the same VL-FR3 sequence as that of the initial antibody clone. In some cases, the VL-FR4 sequence may be the same VL-FR4 sequence as that of the initial antibody clone.

The framework of the antibody may be a naturally occurring framework. The naturally occurring framework may be a framework found in a mammal. The mammal may be a primate, mouse, rat, pig, goat, rabbit, horse, cow, cat, or dog. The primate can be a human. The framework may comprise at least one variant as compared to a naturally occurring framework. The variant may be a mutation, insertion or deletion. The variant may be a variant found in the nucleic acid sequence encoding the antibody or a variant found in the amino acid sequence of the antibody. Any suitable framework sequence may be used, such as those previously used in phase I clinical trials (fig. 12A, 12B). As used herein, the framework of an antibody may refer to the framework regions of a variable heavy chain (VH-FR1, VH-FR2, VH-FR3, and VH-FR4), the framework regions of a variable light chain (VL-FR1, VL-FR2, VL-FR3, and VL-FR4), or a combination thereof. The framework regions of the antibodies in the antibody library may be identical to the germline framework regions.

The framework may be a therapeutically optimal framework. A therapeutically optimal framework may have at least one, at least two, at least three, at least four, at least five or all of the following properties: a) safety previously demonstrated in human monoclonal antibodies; b) thermostability; c) low propensity to aggregate; d) a single dominant allele at the amino acid level throughout the human population; e) different canonical topologies of the CDRs; f) good expression in bacteria; and g) good display on phage. A framework with safety previously demonstrated in human monoclonal antibodies can be that of an antibody already used in at least a phase I clinical trial. A thermostable framework may be a framework that is stable at a temperature of at least 20 ℃, 30 ℃, 40 ℃, 50 ℃, 60 ℃, 70 ℃, 80 ℃, 90 ℃, 100 ℃ or above 100 ℃. A thermostable framework may be a framework capable of withstanding a temperature increase of at least 3 ℃, 4 ℃ or 5 ℃ per minute. A framework that expresses well in bacteria can be a framework that produces biologically active antibodies in bacteria. The bacterium may be Escherichia coli. The bacterium may be an engineered bacterium. The bacterium may be a bacterium optimized for antibody expression. A framework that displays well on phage may be one that produces a biologically active antibody when displayed on the phage surface.

An example of a strategy for selecting a framework is depicted in fig. 11, where an ideal framework for an antibody can be one that exhibits structural diversity, has been successfully used in phase I clinical trials in humans, has low immunogenicity, exhibits aggregation resistance, exhibits adaptability, and is thermostable. In some cases, an antibody framework is avoided if it has an inherent autoreactivity to blood cells (e.g., IGHV4-34), has poor stability characteristics (e.g., IGHV2-5), is derived from a V gene that is not found in at least 50% of individuals (e.g., IGHV4-b), is derived from a V gene that is prone to aggregation (e.g., IGLV6-57), or a combination thereof.

The amino acid sequence of the antibody frameworks herein can comprise more than one dominant allele, with different dominant alleles present in different human populations (fig. 13 and 14). For example, the IGHV1-3 framework comprises 3 alleles: IGHV1-3*01, IGHV1-3*02 and IGHV1-3*03, which are found at different frequencies in different human populations (fig. 26). In some cases, the amino acid sequences of the antibody frameworks described herein have a single dominant allele in at least two human populations. In some cases, the amino acid sequences of the antibody frameworks described herein have a single dominant allele in all human populations. A framework having one dominant allele can be one in which one allele is found in at least 50%, at least 75%, or at least 90% of at least two human populations. A framework having one dominant allele can be one in which one allele is found in at least 50%, at least 75%, or at least 90% of at least twelve human populations. In some cases, the framework regions of the VH domain are those from IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15, or IGHV3-23. In some cases, the framework regions of the VH domain are those from IGHV2-5, IGHV3-7, IGHV4-34, IGHV5-51, IGHV1-24, IGHV2-26, IGHV3-72, IGHV3-74, IGHV3-9, IGHV3-30, IGHV3-33, IGHV3-53, IGHV3-66, IGHV4-30-4, IGHV4-31, IGHV4-59, or IGHV4-61. In some cases, the framework regions of the VH domains of the antibodies in the antibody library are framework regions from IGHV1-46, IGHV3-23, or a combination thereof. In some cases, the framework regions of the VL domains of the antibodies in the antibody library are framework regions from IGKV1-39, IGKV2-28, IGKV3-15, IGKV4-1, IGKV1-5, IGKV1-12, IGKV1-13, IGKV3-11, IGKV3-20, or a combination thereof.
In one example, a subset of antibodies in an antibody library may have VH domain framework regions from IGHV1-46 and VL domain framework regions from IGKV1-39, while the remaining antibodies in the antibody library have VH domain framework regions from IGHV1-46 and VL domain framework regions from IGKV2-28.
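The dominant-allele criterion can be expressed as a short check. The sketch below is an illustration under stated assumptions: it reads the criterion as "the same allele reaches the chosen frequency in at least the chosen number of populations", which is one possible interpretation of the text, and the population names and allele frequencies are invented placeholders rather than data from the figures.

```python
def has_dominant_allele(freqs_by_population, min_fraction=0.5, min_populations=2):
    """Return True if a single allele reaches min_fraction in at least
    min_populations populations.

    freqs_by_population maps population name -> {allele name: frequency}.
    """
    alleles = {a for freqs in freqs_by_population.values() for a in freqs}
    for allele in alleles:
        n_populations = sum(
            1
            for freqs in freqs_by_population.values()
            if freqs.get(allele, 0.0) >= min_fraction
        )
        if n_populations >= min_populations:
            return True
    return False


# Placeholder frequencies for a hypothetical framework gene with two alleles.
example_frequencies = {
    "population_1": {"*01": 0.8, "*02": 0.2},
    "population_2": {"*01": 0.6, "*02": 0.4},
    "population_3": {"*01": 0.3, "*02": 0.7},
}
print(has_dominant_allele(example_frequencies))
print(has_dominant_allele(example_frequencies, min_populations=3))
```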

In some cases, disclosed herein are nucleic acid sequences encoding the antibodies described herein. The nucleic acid sequence may be a DNA or RNA sequence. The nucleic acid may be inserted into a vector. The vector may be a phage vector. The phage vector may be a phagemid or a bacteriophage. The phagemid may be pMID21. The bacteriophage may be DY3F63, M13 bacteriophage, fd filamentous bacteriophage, T4 bacteriophage, T7 bacteriophage, or lambda bacteriophage. In some cases, the phagemid can be introduced into the microorganism in combination with a bacteriophage (e.g., a "helper" phage). The microorganism may be a filamentous bacterium. The filamentous bacterium may be E. coli.

The antibody libraries described herein comprise a plurality of antibodies. The plurality of antibodies can be at least 1.0 × 10^6, 1.0 × 10^7, 1.0 × 10^8, 1.0 × 10^9, 1.0 × 10^10, 2.0 × 10^10, 3.0 × 10^10, 4.0 × 10^10, 5.0 × 10^10, 6.0 × 10^10, 7.0 × 10^10, 8.0 × 10^10, 9.0 × 10^10, or 10.0 × 10^10 antibodies. The plurality of antibodies can be at least 1.0 × 10^11 antibodies. The plurality of antibodies can be at least 7.6 × 10^10 antibodies. Owing to the high diversity of such libraries, the antibodies in them can be unique. For example, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, or at least 35% of the plurality of antibodies can be unique in any library herein. In some cases, the library has more than 7.0 × 10^10 antibodies, wherein at least 20% of the plurality of antibodies are unique. A unique antibody may differ from the other antibodies in the antibody library by at least one nucleic acid or at least one amino acid residue.
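As an illustration of the uniqueness percentages, the sketch below computes the fraction of distinct sequences in a sample of library members. It assumes, for this example only, that "unique" is counted as the number of distinct sequences divided by the number of members sampled; the text does not prescribe a counting method, and the function name and the short placeholder sequences are hypothetical.

```python
def unique_fraction(sequences):
    """Fraction of distinct sequences among the sampled library members."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    return len(set(sequences)) / len(sequences)


# Placeholder sample: four members, three distinct sequences.
sample = ["SEQ_A", "SEQ_A", "SEQ_B", "SEQ_C"]
fraction = unique_fraction(sample)
print(fraction)
print(fraction >= 0.20)  # compare against the "at least 20%" level in the text
```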

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 80% of the plurality of antibodies are functional (e.g., bind to a desired antigen with a Kd of less than 100 nM). In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional.

In various aspects, the antibody library can have a high degree of diversity in one or more CDR sequences. In some cases, antibody libraries can have a high diversity in VH-CDR1 sequences. In some cases, antibody libraries can have a high diversity in VH-CDR2 sequences. In some cases, antibody libraries can have a high diversity in VH-CDR3 sequences. In some cases, antibody libraries can have a high diversity in VL-CDR1 sequences. In some cases, antibody libraries can have a high diversity in VL-CDR2 sequences. In some cases, antibody libraries can have a high diversity in VL-CDR3 sequences. In some cases, the antibody library may have a high diversity in CDR sequences not derived from the initial antibody clone, but a low or no diversity in CDR sequences derived from the initial antibody clone. In some cases, antibody libraries with high diversity can comprise at least 1 × 10^3, 5 × 10^3, 1 × 10^4, 5 × 10^4, 1 × 10^5, 5 × 10^5, 1 × 10^6, 5 × 10^6, or more unique CDR sequences. In some cases, the antibody library may comprise a high degree of diversity in five of the six CDR sequences, e.g., the antibody library may comprise a high degree of diversity in five CDR sequences selected from: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences. In such cases, the remaining CDR sequence may have low or no diversity. In some cases, the remaining CDR sequence is the same in each antibody. In some cases, the remaining CDR sequence is derived from the initial antibody clone.

In various aspects, at least one antibody in the antibody library can exhibit an improvement in at least one property compared to the initial antibody clone. In some cases, at least one antibody in the antibody library may exhibit an improvement in thermostability (e.g., a higher Tm) as compared to the initial antibody clone. For example, at least one antibody in the antibody library can have a melting temperature (Tm) at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 46 ℃, 47 ℃, 48 ℃, 49 ℃, 50 ℃ or greater than 50 ℃ higher than that of the initial antibody clone. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library can have a higher Tm than the initial antibody clone.

In some cases, at least one antibody in the antibody library may exhibit greater affinity (e.g., a lower dissociation constant (Kd)) for the target epitope than the initial antibody clone. For example, at least one antibody in the antibody library may have a Kd for the target epitope that is at least 5X, at least 10X, at least 20X, at least 30X, at least 40X, at least 50X, at least 60X, at least 70X, at least 80X, at least 90X, at least 100X, at least 200X, at least 300X, at least 400X, at least 500X, at least 600X, at least 700X, at least 800X, at least 900X, or at least 1000X lower than the Kd of the initial antibody clone. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library may have a lower Kd than the initial antibody clone.

In some cases, at least one antibody in the antibody library may exhibit increased species selectivity compared to the initial antibody clone. For example, at least one antibody in the antibody library may have a Kd for a target epitope of a particular species that is at least 5X, at least 10X, at least 20X, at least 30X, at least 40X, at least 50X, at least 60X, at least 70X, at least 80X, at least 90X, at least 100X, at least 200X, at least 300X, at least 400X, at least 500X, at least 600X, at least 700X, at least 800X, at least 900X, or at least 1000X lower than the Kd of the initial antibody clone. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library may have a lower Kd for an epitope from a particular species than the initial antibody clone.

In some cases, at least one antibody in the antibody library may exhibit increased species cross-reactivity (e.g., across primate species) compared to the initial antibody clone. For example, at least one antibody in the antibody library may have a lower Kd for epitope A of species A (e.g., cynomolgus monkey) and may also have a higher affinity for an analogous epitope A' of species B (e.g., human). In some cases, at least one antibody in the antibody library may have a Kd for a target epitope of a first species that is at least 5X, at least 10X, at least 20X, at least 30X, at least 40X, at least 50X, at least 60X, at least 70X, at least 80X, at least 90X, at least 100X, at least 200X, at least 300X, at least 400X, at least 500X, at least 600X, at least 700X, at least 800X, at least 900X, or at least 1000X lower than the Kd of the initial antibody clone, and may also have a Kd for an analogous epitope of a second species that is at least 5X, at least 10X, at least 20X, at least 30X, at least 40X, at least 50X, at least 60X, at least 70X, at least 80X, at least 90X, at least 100X, at least 200X, at least 300X, at least 400X, at least 500X, at least 600X, at least 700X, at least 800X, at least 900X, or at least 1000X lower than the Kd of the initial antibody clone. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library may have a lower Kd than the initial antibody clone both for an epitope from the first species and for an analogous epitope from the second species.
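The fold-improvement comparisons above reduce to a ratio of dissociation constants. The following sketch is illustrative only; the function names are hypothetical and the Kd values are invented placeholders (in nM), not measurements. A lower Kd means higher affinity, so the fold improvement is the Kd of the initial clone divided by the Kd of the library antibody.

```python
def fold_improvement(kd_initial: float, kd_variant: float) -> float:
    """Fold decrease in Kd relative to the initial clone (same units for both)."""
    if kd_initial <= 0 or kd_variant <= 0:
        raise ValueError("Kd values must be positive")
    return kd_initial / kd_variant


def improved_cross_reactivity(kd_initial: dict, kd_variant: dict, min_fold: float = 5.0) -> bool:
    """True if the variant's Kd is at least min_fold lower for every species listed."""
    return all(
        fold_improvement(kd_initial[species], kd_variant[species]) >= min_fold
        for species in kd_initial
    )


# Placeholder Kd values in nM for the same epitope in two species.
initial_clone = {"human": 50.0, "cynomolgus": 200.0}
library_antibody = {"human": 5.0, "cynomolgus": 10.0}

print(fold_improvement(initial_clone["human"], library_antibody["human"]))
print(improved_cross_reactivity(initial_clone, library_antibody, min_fold=10.0))
```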

In various aspects, an antibody library can comprise one or more antibodies that exhibit an improvement in more than one property compared to the initial antibody clone. In some cases, the improvement is selected from: improved thermostability, improved affinity for target epitopes, improved selectivity for target epitopes of a particular species, and improved cross-reactivity between species. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library exhibit an improvement in two of: thermostability, affinity for a target epitope, selectivity for a target epitope of a particular species, or cross-reactivity between species. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library exhibit an improvement in three of: thermostability, affinity for a target epitope, selectivity for a target epitope of a particular species, or cross-reactivity between species. Fig. 31A and 31B depict non-limiting examples of selecting antibody clones that exhibit improved thermostability, improved affinity for epitope A from cynomolgus monkey, and improved affinity for epitope A from human.

In various aspects, the antibodies in the antibody library can exhibit thermal stability. In some cases, the antibodies in the antibody library can have a melting temperature (Tm) between about 50 ℃ to about 90 ℃. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library have a melting temperature (Tm) between about 50 ℃ to about 90 ℃. For example, the antibodies in the antibody library can have a melting temperature (Tm) of at least 50 ℃, 55 ℃, 60 ℃, 65 ℃, 70 ℃,75 ℃, 80 ℃, 85 ℃, or 90 ℃.

In various aspects, the antibodies in the antibody library can exhibit high affinity for the target epitope. In some cases, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or greater than 95% of the antibodies in the antibody library can exhibit high affinity for the target epitope. For example, antibodies in an antibody library can bind to a target epitope with a dissociation constant (Kd) of less than about 50 nM, 25 nM, 10 nM, 5 nM, 1 nM, 900 pM, 800 pM, 700 pM, 600 pM, 500 pM, 400 pM, 300 pM, 200 pM, 100 pM, 50 pM, 25 pM, 10 pM, 5 pM, 1 pM, 900 fM, 800 fM, 700 fM, 600 fM, 500 fM, 400 fM, 300 fM, 200 fM, 100 fM, 50 fM, 25 fM, 10 fM, 5 fM, 1 fM, or less.

Methods of generating antibody libraries using Tumbler

In one aspect, methods are provided for generating antibody libraries, such as those described above. In some cases, the method comprises: (a) selecting a CDR sequence, wherein the CDR sequence is selected from the group consisting of: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences; (b) replacing the CDR sequences of each antibody in the first antibody library with the CDR sequences selected in (a), thereby generating a second antibody library comprising a plurality of antibodies, wherein each antibody in the plurality of antibodies comprises: (i) the CDR sequences selected in (a); and (ii) a unique combination of remaining CDR sequences not selected in (a), wherein said remaining CDR sequences are selected from the group consisting of: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences.

Fig. 32 and 33 depict non-limiting exemplary workflows of methods of generating antibody libraries and obtaining one or more desired antibodies therefrom. In some cases, an initial antibody clone is obtained (fig. 32, 3201). In some cases, the initial antibody clone may be obtained from a third party (such as a customer or client). In other cases, the initial antibody clone may be obtained from a highly diverse antibody library (e.g., the SuperHuman antibody library described herein). In some cases, the initial antibody clone may have a desired property, such as affinity for a particular epitope. In some cases, it may be desirable to improve one or more properties of the initial antibody clone. For example, it may be desirable to improve the thermal stability of an antibody (e.g., increase the melting temperature (Tm)), the binding properties of an antibody (e.g., affinity), or the species cross-reactivity of an antibody between two or more species. The initial antibody clone can then be used to generate an antibody library (fig. 32, 3203). In some cases, a CDR sequence from the initial antibody clone may be selected. The CDR sequence may be any of the following: a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence or a VL-CDR3 sequence. In a particular aspect, the CDR sequence is a VH-CDR3 sequence (see fig. 33). Typically, the CDR sequence selected from the initial antibody clone is one that is important for a desired property, such as binding affinity for a target epitope. In various aspects, the CDR sequence selected from the initial antibody clone can be cloned into a highly diverse antibody library. In some cases, the highly diverse antibody library can have high diversity in five of the six CDR sequences, with little diversity in the one CDR sequence being replaced (see, e.g., fig. 30 and 33). In some cases, the highly diverse antibody library may be a SuperHuman antibody library or a modified SuperHuman antibody library (e.g., having the same scaffold as the SuperHuman antibody library, but without diversity in the replaced CDR sequence; see fig. 33). In some cases, that CDR sequence in each antibody of the highly diverse antibody library may be replaced with the CDR sequence selected from the initial antibody clone, such that each antibody in the resulting antibody library has the same sequence at that CDR. For example, the VH-CDR3 sequence from the initial antibody clone can be cloned into a highly diverse antibody library such that each VH-CDR3 sequence in the highly diverse antibody library is replaced with the same VH-CDR3 sequence from the initial antibody clone (see fig. 33). In this case, the remaining CDR sequences (in this example VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3) are those present in the highly diverse antibody library. In some cases, the CDR sequence can be cloned into the highly diverse antibody library using methods that introduce mutations into the CDR sequence (e.g., to introduce more diversity into the CDR sequence). In some examples, the CDR sequence can be cloned by performing an error-prone PCR method to introduce one or more mutations into the CDR sequence. In some cases, each antibody in the highly diverse antibody library may have a unique combination of CDR sequences. Thus, such methods can generate antibody libraries with high diversity in VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3, but little diversity in VH-CDR3 (see fig. 33). Additional selection and screening steps may be performed on the resulting antibody library to select antibody clones with the desired properties (fig. 32, 3205). Finally, the optimal antibody sequence can be determined by computational methods (fig. 32, 3207). Fig. 34 and 35 depict methods of screening and selecting antibody clones with improved properties, such as increased thermostability, increased binding affinity for a target epitope, and/or increased species cross-reactivity, as compared to the initial antibody clone.
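The CDR replacement step of the workflow above can be pictured in silico with the following minimal Python sketch. It is an illustration under stated assumptions, not the cloning procedure itself: the `Antibody` class, the `tumble` function, and all sequences are hypothetical placeholders, and framework regions are omitted.

```python
from dataclasses import dataclass, replace

CDR_FIELDS = ("vh_cdr1", "vh_cdr2", "vh_cdr3", "vl_cdr1", "vl_cdr2", "vl_cdr3")


@dataclass(frozen=True)
class Antibody:
    """Six CDR sequences; framework regions are omitted for brevity."""
    vh_cdr1: str
    vh_cdr2: str
    vh_cdr3: str
    vl_cdr1: str
    vl_cdr2: str
    vl_cdr3: str


def tumble(library, fixed_cdr, fixed_sequence):
    """Replace one CDR in every library member with the same sequence.

    Members that become identical after the replacement are collapsed, so each
    remaining member carries a unique combination of the other five CDRs.
    """
    if fixed_cdr not in CDR_FIELDS:
        raise ValueError(f"unknown CDR field: {fixed_cdr}")
    grafted = (replace(ab, **{fixed_cdr: fixed_sequence}) for ab in library)
    return list(dict.fromkeys(grafted))  # order-preserving de-duplication


# Hypothetical placeholder sequences, not taken from the disclosure.
diverse_library = [
    Antibody("GYTFTSYY", "INPSGGST", "ARDGYSSGWY", "QSISSY", "AAS", "QQSYSTPLT"),
    Antibody("GFTFSSYA", "ISGSGGST", "AKDRGYFDY", "QSVSSN", "GAS", "QQYNNWPLT"),
    Antibody("GFTFSSYA", "ISGSGGST", "ARVLRFLEWL", "QSVSSN", "GAS", "QQYNNWPLT"),
]
tumbler_library = tumble(diverse_library, "vh_cdr3", "ARHYDYDGFAY")
print(len(tumbler_library))
```

In this sketch the second and third members differ only in the VH-CDR3, so they collapse to a single member once the VH-CDR3 is fixed.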

In various aspects, methods for generating antibody libraries are provided. In some cases, the antibody library can comprise a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) one CDR sequence, selected from the group consisting of a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence, is the same for each antibody of the plurality of antibodies; and (d) the remaining CDR sequences, selected from the group consisting of VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences, are present in each antibody in a unique combination. Such antibody libraries may also be referred to herein as Tumbler libraries.

In one example, a method is provided for generating an antibody library comprising a plurality of antibodies, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VH-CDR1 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 are present in each of the plurality of antibodies in different combinations. In some cases, the VH-CDR1 sequences were derived from the original antibody clone.

In another example, a method for generating an antibody library comprising a plurality of antibodies is provided, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VH-CDR2 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 are present in each of the plurality of antibodies in different combinations. In some cases, the VH-CDR2 sequences were derived from the original antibody clone.

In another example, a method for generating an antibody library comprising a plurality of antibodies is provided, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VH-CDR3 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 are present in each of the plurality of antibodies in different combinations. In some cases, the VH-CDR3 sequences were derived from the original antibody clone.

In another example, a method for generating an antibody library comprising a plurality of antibodies is provided, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VL-CDR1 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VL-CDR1 sequences were derived from the original antibody clone.

In another example, a method for generating an antibody library comprising a plurality of antibodies is provided, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VL-CDR2 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, and a VL-CDR3 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VL-CDR2 sequences were derived from the original antibody clone.

In another example, a method for generating an antibody library comprising a plurality of antibodies is provided, wherein each antibody of the plurality of antibodies comprises: (a) a VH domain comprising a VH-CDR1 sequence, a VH-CDR2 sequence, and a VH-CDR3 sequence; and (b) a VL domain comprising a VL-CDR1 sequence, a VL-CDR2 sequence, and a VL-CDR3 sequence; wherein (c) the VL-CDR3 sequence is the same for each antibody of the plurality of antibodies; and (d) a VH-CDR1 sequence, a VH-CDR2 sequence, a VH-CDR3 sequence, a VL-CDR1 sequence, and a VL-CDR2 sequence are present in each of the plurality of antibodies in different combinations. In some cases, the VL-CDR3 sequences were derived from the original antibody clone.

In various aspects, one or more of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences are naturally occurring. In various aspects, each of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences are naturally occurring, but are present in each antibody in non-naturally occurring combinations. In various aspects, one or more of the VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences, and VL-CDR3 sequences are naturally found in a human population or derived from human CDR sequences. In various aspects, each of the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, VL-CDR2, and VL-CDR3 sequences are naturally occurring in a human population or derived from human CDR sequences.

In various aspects, the antibodies of the library can comprise non-naturally occurring combinations of naturally occurring CDRs, such as CDRs derived from memory B cells and naive B cells that each occur naturally but do not naturally co-occur on the same antibody. For example, a non-naturally occurring combination of naturally occurring CDRs can comprise at least one CDR derived from a naive cell, while the remaining CDRs can be derived from memory cells. For example, a non-naturally occurring combination of naturally occurring CDRs may comprise at least one CDR derived from cells that are primarily of naive B cell origin, while the remaining CDRs may be derived from cells that are primarily of memory B cell origin.

A non-naturally occurring combination of naturally occurring CDRs can comprise at least one CDR derived from a naive cell, while the remaining CDRs are derived from memory cells. In some cases, at least the VL-CDR1 is derived from the naive cell. In some cases, at least the VL-CDR2 is derived from the naive cell. In some cases, at least the VL-CDR3 is derived from the naive cell. In some cases, at least the VH-CDR1 is derived from the naive cell. In some cases, at least the VH-CDR2 is derived from the naive cell. In some cases, at least the VH-CDR3 is derived from the naive cell.

Non-naturally occurring combinations of naturally occurring CDRs can comprise two, three, four or five CDRs derived from naive cells, while the remaining CDRs can be derived from memory cells. For example, two CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, three CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, four CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells. In another example, five CDRs from the following group may be derived from naive cells: VL-CDR1, VL-CDR2, VL-CDR3, VH-CDR1, VH-CDR2 and VH-CDR3, while the remaining CDRs may be derived from memory cells.

In another non-limiting example of a non-naturally occurring combination, the VL-CDR3 can be derived from a naive cell, while the VH-CDR1, VH-CDR2, VH-CDR3, VL-CDR1, and VL-CDR2 can be derived from a memory cell. In another non-limiting example of a non-naturally occurring combination, the VH-CDR3 can be derived from a naive cell, while the VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2, and VL-CDR3 can be derived from a memory cell. In another non-limiting example of a non-naturally occurring combination, the VH-CDR3 and VL-CDR3 can be derived from naive cells, while the VH-CDR1, VH-CDR2, VL-CDR1, and VL-CDR2 can be derived from memory cells.
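The combinations described above can be enumerated with the following minimal Python sketch, offered as an assumption-laden illustration. The function `combine_cdrs`, the dictionary layout of the pools, and the short placeholder sequences are all hypothetical.

```python
from itertools import product

CDRS = ("VH-CDR1", "VH-CDR2", "VH-CDR3", "VL-CDR1", "VL-CDR2", "VL-CDR3")


def combine_cdrs(naive, memory, naive_positions):
    """Yield one dict per antibody, mapping each CDR position to a sequence.

    Positions listed in naive_positions draw from the naive pool; every other
    position draws from the memory pool.
    """
    pools = [naive[c] if c in naive_positions else memory[c] for c in CDRS]
    for combo in product(*pools):
        yield dict(zip(CDRS, combo))


# Hypothetical placeholder pools (sequence lists per CDR position).
naive_pool = {"VH-CDR3": ["ARA", "ARB"], "VL-CDR3": ["QQA"]}
memory_pool = {
    "VH-CDR1": ["GYT", "GFT"],
    "VH-CDR2": ["INP"],
    "VH-CDR3": ["ARM"],
    "VL-CDR1": ["QSI"],
    "VL-CDR2": ["AAS"],
    "VL-CDR3": ["QQM"],
}
# VH-CDR3 and VL-CDR3 from naive cells, remaining CDRs from memory cells.
combos = list(combine_cdrs(naive_pool, memory_pool, {"VH-CDR3", "VL-CDR3"}))
print(len(combos))
```

Changing `naive_positions` to `{"VH-CDR3"}` or `{"VL-CDR3"}` gives the other two example combinations in the paragraph above.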

The VH-CDR1 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR1 sequence from naive B cells. The VH-CDR1 sequence derived from naive B cells can be a synthetic VH-CDR1 sequence. The VH-CDR1 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR1 sequence from naive B cells. The VH-CDR2 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR2 sequence from naive B cells. The VH-CDR2 sequence derived from naive B cells can be a synthetic VH-CDR2 sequence. The VH-CDR2 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR2 sequence from naive B cells. The VH-CDR3 sequence derived from naive B cells can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VH-CDR3 sequence from naive B cells. The VH-CDR3 sequence derived from naive B cells can be a synthetic VH-CDR3 sequence. The VH-CDR3 sequence derived from naive B cells can comprise 100% sequence homology with the naturally occurring VH-CDR3 sequence from naive B cells.

The VL-CDR1 sequence derived from the naive B cell can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR1 sequence from the naive B cell. The naive B cell-derived VL-CDR1 sequence can be a synthetic VL-CDR1 sequence. The VL-CDR1 sequence derived from the naive B cell can comprise 100% sequence homology to the naturally occurring VL-CDR1 sequence from the naive B cell. The VL-CDR2 sequence derived from the naive B cell can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR2 sequence from the naive B cell. The naive B cell-derived VL-CDR2 sequence can be a synthetic VL-CDR2 sequence. The VL-CDR2 sequence derived from the naive B cell can comprise 100% sequence homology to the naturally occurring VL-CDR2 sequence from the naive B cell. The VL-CDR3 sequence derived from the naive B cell can comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology to the naturally occurring VL-CDR3 sequence from the naive B cell. The naive B cell-derived VL-CDR3 sequence can be a synthetic VL-CDR3 sequence. The VL-CDR3 sequence derived from the naive B cell can comprise 100% sequence homology to the naturally occurring VL-CDR3 sequence from the naive B cell.

The VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, VL-CDR3 sequence, or any combination thereof, can be derived from sequence information obtained from a pool of cells that are primarily of naive B cell origin. The VH-CDR3 sequences, VL-CDR3 sequences, or combinations thereof can be derived from sequence information obtained from a pool of cells that are primarily of naive B cell origin. The naive B cell pool can be obtained from multiple individuals. The naive B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of naive B cell origin.

The VH-CDR1 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR1 sequence from memory B cells. The VH-CDR1 sequence derived from memory B cells can be a synthetic VH-CDR1 sequence. The VH-CDR1 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR1 sequence from memory B cells. The VH-CDR2 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR2 sequence from memory B cells. The VH-CDR2 sequence derived from memory B cells can be a synthetic VH-CDR2 sequence. The VH-CDR2 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR2 sequence from memory B cells. The VH-CDR3 sequence derived from memory B cells may comprise at least 80%, 85%, 90%, 95% or 99% sequence homology with the naturally occurring VH-CDR3 sequence from memory B cells. The VH-CDR3 sequence derived from memory B cells can be a synthetic VH-CDR3 sequence. The VH-CDR3 sequence derived from memory B cells may comprise 100% sequence homology with the naturally occurring VH-CDR3 sequence from memory B cells.

The memory B cell-derived VL-CDR1 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR1 sequence from a memory B cell. The memory B cell-derived VL-CDR1 sequence can be a synthetic VL-CDR1 sequence. The memory B cell-derived VL-CDR1 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR1 sequence from a memory B cell. The memory B cell-derived VL-CDR2 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR2 sequence from a memory B cell. The memory B cell-derived VL-CDR2 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR2 sequence from a memory B cell. The memory B cell-derived VL-CDR3 sequence may comprise at least 80%, 85%, 90%, 95%, or 99% sequence homology with the naturally occurring VL-CDR3 sequence from a memory B cell. The memory B cell-derived VL-CDR3 sequence may comprise 100% sequence homology with the naturally occurring VL-CDR3 sequence from a memory B cell.
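As a simplified illustration of the percentage thresholds above, the following Python sketch computes percent identity between two ungapped sequences of equal length. This is an assumption made for the example: the disclosure speaks of sequence homology and does not specify an alignment method, and the function name and sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 10-residue CDR sequences that differ at two positions.
natural = "ARDYWGQGTL"
variant = "ARDFWGQGSL"
print(percent_identity(natural, variant) >= 80.0)
```

Sequences of different length would need a proper alignment step, which this sketch deliberately leaves out.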

The VH-CDR1 sequence, VH-CDR2 sequence, VH-CDR3 sequence, VL-CDR1 sequence, VL-CDR2 sequence, VL-CDR3 sequence, or any combination thereof, can be derived from sequence information obtained from a pool of cells that are primarily of memory B cell origin. The VH-CDR3 sequence, VL-CDR3 sequence, or combinations thereof can be derived from sequence information obtained from a pool of cells that are primarily of memory B cell origin. The memory B cell pool can be obtained from multiple individuals. The memory B cell pool can comprise less than 0.1%, 1%, 5%, 10%, 20%, or 30% of cells that are not of memory B cell origin. The memory B cells may be CD27+ B cells. The memory B cell pool may comprise less than 0.1%, 1%, 5%, 10%, 20% or 30% of cells that are not of CD27+ B cell origin.

The naive cell may be a naive B cell. The naive B cell can be a human naive B cell. The memory cell may be a memory B cell. The memory B cell can be a human memory B cell. In some cases, naive B cells show increased diversity of VH-CDR3 and VL-CDR3 sequences compared to VH-CDR3 and VL-CDR3 sequences from memory B cells (fig. 1). The naive cells and memory cells can be obtained from a biological sample, such as blood, from an individual or multiple individuals. The naive and memory cells can be physically separated from the sample using a marker specific for naive or memory cells.

Markers can be used to identify, isolate or sort B cells, naive B cells and memory B cells from a biological sample. Examples of markers for identifying, isolating or sorting B cells include, but are not limited to, CD19+. Examples of markers for identifying, isolating or sorting naive B cells include, but are not limited to, CD19+, CD27-, IgD+, IgM+, and combinations thereof. Examples of markers for identifying, isolating or sorting memory B cells include, but are not limited to, CD19+, CD27+, and combinations thereof. In some embodiments, CD27+ is used to sort memory B cells. Examples of markers for identifying, isolating or sorting class-switched memory B cells include, but are not limited to, CD19+, CD27+, IgD-, IgM-, and combinations thereof. Examples of markers for identifying, isolating or sorting non-class-switched or marginal zone memory B cells include, but are not limited to, CD19+, CD27+, IgD+, IgM+, and combinations thereof. In some cases, memory B cells can be identified, isolated, or sorted using the following markers: CD19+, CD27+, IgD-, IgM+, and combinations thereof. The naive cell from which the VH-CDR3 is derived may be a CD27-/IgM+ B cell. The memory cells from which the VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3 are derived may be CD27+/IgG+ B cells.
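The marker combinations listed above can be read as a simple gating rule. The following Python sketch is a hedged illustration of that reading only; the function `classify_b_cell`, its labels, and the boolean encoding of marker status are hypothetical, and real sorting gates depend on instrument-specific thresholds that are not described here.

```python
def classify_b_cell(markers: dict) -> str:
    """Assign a coarse label from marker status (True = positive, False = negative).

    Missing markers are treated as negative, which is an assumption of this sketch.
    """
    if not markers.get("CD19", False):
        return "not a B cell"
    cd27 = markers.get("CD27", False)
    igd = markers.get("IgD", False)
    igm = markers.get("IgM", False)
    if not cd27 and igd and igm:
        return "naive B cell"
    if cd27 and not igd and not igm:
        return "class-switched memory B cell"
    if cd27 and igd and igm:
        return "non-class-switched or marginal zone memory B cell"
    if cd27:
        return "memory B cell"
    return "unclassified B cell"


print(classify_b_cell({"CD19": True, "CD27": False, "IgD": True, "IgM": True}))
print(classify_b_cell({"CD19": True, "CD27": True, "IgD": False, "IgM": True}))
```

The final `cd27` branch covers the CD19+, CD27+, IgD-, IgM+ combination named in the text as well as any other CD27+ B cell.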

The CDR sequences of the antibody can be those found in naive B cells and memory B cells of a single individual or of multiple individuals. The individual may be a mammal. The mammal can be a human, a non-human primate, a mouse, a rat, a pig, a goat, a rabbit, a horse, a cow, a cat, or a dog. In some cases, the CDR sequences are CDR sequences obtained from publicly available sources. Examples of publicly available sources of CDR sequences include SAbDab (http://opig.stats.ox.ac.uk/webapps/SAbDab-sabpred/Welcome.php) and PyIgClassify (http://dunbrack2.fccc.edu/PyIgClassify/).

In various aspects, each antibody in the antibody library can comprise the same scaffold, e.g., the same combination of framework sequences (see fig. 30). In various aspects, the VH domain of each antibody in the antibody library comprises a VH-FR1 sequence, a VH-FR2 sequence, a VH-FR3 sequence, and a VH-FR4 sequence. In some cases, each antibody in the antibody library has each of the VH-FR1 sequence, VH-FR2 sequence, VH-FR3 sequence, and VH-FR4 sequence identical. In some cases, each of the VH-FR1 sequence, VH-FR2 sequence, VH-FR3 sequence, and VH-FR4 sequence was derived from the same original antibody clone from which the CDR sequences were derived (see fig. 30). In some cases, the VH-FR1 sequence may be the same VH-FR1 sequence as the original antibody clone. In some cases, the VH-FR2 sequence may be the same VH-FR2 sequence as the original antibody clone. In some cases, the VH-FR3 sequence may be the same VH-FR3 sequence as the original antibody clone. In some cases, the VH-FR4 sequence may be the same VH-FR4 sequence as the original antibody clone. In various aspects, the VL domain of each antibody in the antibody library comprises a VL-FR1 sequence, a VL-FR2 sequence, a VL-FR3 sequence, and a VL-FR4 sequence. In some cases, each antibody in the antibody library has each of the VL-FR1 sequence, VL-FR2 sequence, VL-FR3 sequence, and VL-FR4 sequence that is the same. In some cases, each of the VL-FR1 sequence, VL-FR2 sequence, VL-FR3 sequence, and VL-FR4 sequence can be derived from the same original antibody clone from which the CDR sequences were derived (see fig. 30). In some cases, the VL-FR1 sequence may be the same VL-FR1 sequence as the original antibody clone. In some cases, the VL-FR2 sequence may be the same VL-FR2 sequence as the original antibody clone. In some cases, the VL-FR3 sequence may be the same VL-FR3 sequence as the original antibody clone. In some cases, the VL-FR4 sequence may be the same VL-FR4 sequence as the original antibody clone.

The framework of the antibody may be a naturally occurring framework. The naturally occurring framework may be a framework found in a mammal. The mammal may be a primate, mouse, rat, pig, goat, rabbit, horse, cow, cat, or dog. The primate can be a human. The framework may comprise at least one variant as compared to a naturally occurring framework. The variant may be a mutation, insertion or deletion. The variant may be a variant found in the nucleic acid sequence encoding the antibody or a variant found in the amino acid sequence of the antibody. Any suitable framework sequence may be used, such as those previously used in phase I clinical trials (fig. 12A, 12B). As used herein, the framework of an antibody may refer to the framework regions of a variable heavy chain (VH-FR1, VH-FR2, VH-FR3, and VH-FR4), the framework regions of a variable light chain (VL-FR1, VL-FR2, VL-FR3, and VL-FR4), or a combination thereof. The framework regions of the antibodies in the antibody library may be identical to the germline framework regions.

The framework may be a therapeutically optimal framework. A therapeutically optimal framework may comprise at least one, at least two, at least three, at least four, at least five or all of the following properties: a) safety previously demonstrated in human monoclonal antibodies; b) thermostability; c) low propensity to aggregate; d) a single dominant allele at the amino acid level throughout the human population; e) distinct canonical topologies of the CDRs; f) good expression in bacteria; and g) good display on phage. A framework with safety previously demonstrated in human monoclonal antibodies can be a framework of an antibody that has already been used in at least a phase I clinical trial. A thermostable framework may be a framework that is stable at a temperature of at least 20 ℃, 30 ℃, 40 ℃, 50 ℃, 60 ℃, 70 ℃, 80 ℃, 90 ℃, 100 ℃ or above 100 ℃. A thermostable framework may be a framework capable of withstanding a temperature increase of at least 3 ℃, 4 ℃ or 5 ℃ per minute. A framework that expresses well in bacteria can be a framework that produces biologically active antibodies in bacteria. The bacterium may be Escherichia coli. The bacterium may be an engineered bacterium. The bacterium may be a bacterium optimized for antibody expression. A framework that displays well on phage may be one that produces a biologically active antibody when displayed on the phage surface.

An example of a strategy for selecting a framework is depicted in fig. 11, where an ideal antibody framework can be one that exhibits structural diversity, has been used successfully in phase I clinical trials in humans, has low immunogenicity, exhibits aggregation resistance, exhibits adaptability, and is thermostable. In some cases, an antibody framework is avoided if it has inherent autoreactivity to blood cells (e.g., IGHV4-34), has poor stability characteristics (e.g., IGHV2-5), is encoded by a V gene that is not found in at least 50% of individuals (e.g., IGHV4-b), is encoded by a V gene that is prone to aggregation (e.g., IGLV6-57), or a combination thereof.

The amino acid sequence of the antibody frameworks herein can comprise more than one dominant allele, with different dominant alleles present in different human populations (fig. 13 and 14). For example, the IGHV1-3 framework comprises 3 alleles: IGHV1-3*01, IGHV1-3*02 and IGHV1-3*03, which are found at different frequencies in different human populations (fig. 26). In some cases, the amino acid sequences of the antibody frameworks described herein have a single dominant allele in at least two human populations. In some cases, the amino acid sequences of the antibody frameworks described herein have a single dominant allele in all human populations. A framework having one dominant allele can be one in which one allele is found in at least 50%, at least 75%, or at least 90% of at least two human populations. A framework having one dominant allele can be one in which one allele is found in at least 50%, at least 75%, or at least 90% of at least twelve human populations. In some cases, the framework regions of the VH domain are those from IGHJ4, IGHV1-46, IGHV1-69, IGHV3-15, or IGHV3-23. In some cases, the framework regions of the VH domain are those from IGHV2-5, IGHV3-7, IGHV4-34, IGHV5-51, IGHV1-24, IGHV2-26, IGHV3-72, IGHV3-74, IGHV3-9, IGHV3-30, IGHV3-33, IGHV3-53, IGHV3-66, IGHV4-30-4, IGHV4-31, IGHV4-59, IGHV4-61, or IGHV5-51. In some cases, the framework regions of the VH domains of the antibodies in the antibody library are framework regions from IGHV1-46, IGHV3-23, or a combination thereof. In some cases, the framework regions of the VL domains of the antibodies in the antibody library are framework regions from IGKV1-39, IGKV2-28, IGKV3-15, IGKV4-1, IGKV1-5, IGKV1-12, IGKV1-13, IGKV3-11, IGKV3-20, or a combination thereof. In one example, a subset of antibodies in an antibody library may have framework regions from the VH domain of IGHV1-46 and framework regions from the VL domain of IGKV1-39, while the remaining antibodies in the antibody library have framework regions from the VH domain of IGHV1-46 and framework regions from the VL domain of IGKV2-28.
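One possible reading of the dominant-allele criterion above is that an allele reaches a frequency threshold within each of a minimum number of populations. The following Python sketch implements that reading only as an illustration; the function `dominant_alleles`, the population names, the allele labels, and the frequencies are hypothetical and are not data from the figures.

```python
def dominant_alleles(freqs_by_population, threshold=0.5, min_populations=2):
    """Return alleles whose frequency is >= threshold in at least min_populations populations."""
    counts = {}
    for allele_freqs in freqs_by_population.values():
        for allele, freq in allele_freqs.items():
            if freq >= threshold:
                counts[allele] = counts.get(allele, 0) + 1
    return sorted(a for a, n in counts.items() if n >= min_populations)


# Hypothetical allele frequencies for one framework gene in three populations.
example_freqs = {
    "population_A": {"01": 0.80, "02": 0.15, "03": 0.05},
    "population_B": {"01": 0.60, "02": 0.40},
    "population_C": {"01": 0.30, "02": 0.70},
}
print(dominant_alleles(example_freqs))
```

A framework for which this function returns exactly one allele would have a single dominant allele under the chosen threshold and population count.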

In some cases, disclosed herein are nucleic acid sequences encoding the antibodies described herein. The nucleic acid sequence may be a DNA or RNA sequence. The nucleic acid may be inserted into a vector. The vector may be a phage. The phage may be a phagemid or a bacteriophage. The phagemid may be pMID21. The bacteriophage may be DY3F63, M13 bacteriophage, fd filamentous bacteriophage, T4 bacteriophage, T7 bacteriophage, or lambda bacteriophage. In some cases, the phagemid can be introduced into the microorganism in combination with a bacteriophage (i.e., a "helper" phage). The microorganism may be a filamentous bacterium. The filamentous bacterium may be E. coli.

The antibody libraries described herein comprise a plurality of antibodies. The plurality of antibodies can be at least 1.0 × 10⁶, 1.0 × 10⁷, 1.0 × 10⁸, 1.0 × 10⁹, 1.0 × 10¹⁰, 2.0 × 10¹⁰, 3.0 × 10¹⁰, 4.0 × 10¹⁰, 5.0 × 10¹⁰, 6.0 × 10¹⁰, 7.0 × 10¹⁰, 8.0 × 10¹⁰, 9.0 × 10¹⁰, or 10.0 × 10¹⁰ antibodies. The plurality of antibodies can be at least 1.0 × 10¹¹ antibodies. The plurality of antibodies can be at least 7.6 × 10¹⁰ antibodies. Owing to the high diversity of such libraries, the antibodies in them can be unique. For example, in any library herein, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, or at least 35% of the plurality of antibodies may be unique. In some cases, the library has more than 7.0 × 10¹⁰ antibodies, wherein at least 20% of the plurality of antibodies are unique. The unique antibodies may differ by at least one nucleotide or at least one amino acid residue relative to the other antibodies of the antibody library.
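For a sequenced sample of a library, the unique fraction could be estimated as in the following minimal Python sketch. It takes one reading of "unique", namely the number of distinct sequences divided by the number of sequences sampled, which is an assumption of this example; the function name and the toy sequences are hypothetical.

```python
def unique_fraction(sequences) -> float:
    """Return distinct sequences divided by total sequences in the sample."""
    seqs = list(sequences)
    if not seqs:
        raise ValueError("no sequences supplied")
    return len(set(seqs)) / len(seqs)


# Hypothetical sample of four clones, two of which are identical.
sample = ["EVQLVESGG", "EVQLVESGG", "QVQLVQSGA", "DIQMTQSPS"]
print(unique_fraction(sample))
```

Sequences that differ by a single residue count as distinct here, consistent with the last sentence of the paragraph above.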

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 80% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 85% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 90% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 95% of the plurality of antibodies are functional.

In some cases, the antibody library comprises at least 1.0 × 10^5 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.0 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional. In some cases, the antibody library comprises at least 7.6 × 10^10 antibodies, wherein at least 99% of the plurality of antibodies are functional.

In various aspects, the methods provided herein can generate antibody libraries with a high degree of diversity in one or more CDR sequences. In some cases, antibody libraries can have a high diversity in VH-CDR1 sequences. In some cases, antibody libraries can have a high diversity in VH-CDR2 sequences. In some cases, antibody libraries can have a high diversity in VH-CDR3 sequences. In some cases, antibody libraries can have a high diversity in VL-CDR1 sequences. In some cases, antibody libraries can have a high diversity in VL-CDR2 sequences. In some cases, antibody libraries can have a high diversity in VL-CDR3 sequences. In some cases, the antibody library may have a high diversity in CDR sequences not derived from the original antibody clone, but a low or no diversity in CDR sequences derived from the original antibody clone. In some cases, antibody libraries of high diversity can comprise at least 1 × 10^3, 5 × 10^3, 1 × 10^4, 5 × 10^4, 1 × 10^5, 5 × 10^5, 1 × 10^6, 5 × 10^6, or more unique sequences in a particular CDR. In some cases, the antibody library may comprise a high degree of diversity in five of the six CDR sequences, e.g., the antibody library may comprise a high degree of diversity in five CDR sequences selected from: VH-CDR1 sequences, VH-CDR2 sequences, VH-CDR3 sequences, VL-CDR1 sequences, VL-CDR2 sequences and VL-CDR3 sequences. In such cases, the remaining CDR sequence may have low or no diversity. In some cases, the remaining CDR sequence is the same in each antibody. In some cases, the remaining CDR sequence is derived from the original antibody clone.

In various aspects, the methods provided herein can generate at least one antibody having an improvement in at least one property as compared to the initial antibody clone. In some cases, at least one antibody in the antibody library may exhibit an improvement in thermostability as compared to the original antibody clone. For example, at least one antibody in the antibody library can exhibit a thermostability of at least 2X, 3X, 4X, 5X, 6X, 7X, 8X, 9X, 10X, 15X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 55X, 60X, 65X, 70X, 75X, 80X, 85X, 90X, 95X, 100X, or greater than 100X as compared to the initial antibody clone. In some cases, at least one antibody in the antibody library may exhibit greater affinity for the target epitope than the original antibody clone. For example, at least one antibody in the antibody library can exhibit an affinity for a target epitope of at least 2X, 3X, 4X, 5X, 6X, 7X, 8X, 9X, 10X, 15X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 55X, 60X, 65X, 70X, 75X, 80X, 85X, 90X, 95X, 100X, or greater than 100X, as compared to the original antibody clone. In some cases, at least one antibody in the antibody library may exhibit an improvement in species selectivity. For example, at least one antibody in the antibody library can exhibit an affinity for a target epitope of a particular species of at least 2X, 3X, 4X, 5X, 6X, 7X, 8X, 9X, 10X, 15X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 55X, 60X, 65X, 70X, 75X, 80X, 85X, 90X, 95X, 100X, or greater than 100X as compared to the initial antibody clone. In some cases, at least one antibody in the antibody library may exhibit increased species cross-reactivity compared to the original antibody clone. For example, at least one antibody in the antibody library can have high affinity for epitope a from species a (e.g., cynomolgus monkey), and can have high affinity for epitope a from species B (e.g., human). 
In some cases, at least one antibody in the antibody library can exhibit an affinity for epitope A from species A of at least 2X, 3X, 4X, 5X, 6X, 7X, 8X, 9X, 10X, 15X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 55X, 60X, 65X, 70X, 75X, 80X, 85X, 90X, 95X, 100X, or greater than 100X as compared to the initial antibody clone and an affinity for epitope A from species B of at least 2X, 3X, 4X, 5X, 6X, 7X, 8X, 9X, 10X, 15X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 55X, 60X, 65X, 70X, 75X, 80X, 85X, 90X, 95X, 100X, or greater than 100X as compared to the initial antibody clone. FIGS. 31A and 31B depict non-limiting examples of selecting antibody clones that exhibit improved thermostability, improved affinity for epitope A from cynomolgus monkey, and improved affinity for epitope A from human.

In various aspects, the methods described herein can generate antibodies with thermostability. In some cases, the antibodies in the antibody library can be thermostable at a temperature of about 50 °C to about 90 °C. For example, the antibodies in the antibody library can be thermostable at a temperature of at least 50 °C, 55 °C, 60 °C, 65 °C, 70 °C, 75 °C, 80 °C, 85 °C or 90 °C.

In various aspects, the methods described herein can generate antibodies with high affinity for a target epitope. For example, antibodies in an antibody library can bind to a target epitope with a dissociation constant (Kd) of less than about 50 nM, 25 nM, 10 nM, 5 nM, 1 nM, 900 pM, 800 pM, 700 pM, 600 pM, 500 pM, 400 pM, 300 pM, 200 pM, 100 pM, 50 pM, 25 pM, 10 pM, 5 pM, 1 pM, 900 fM, 800 fM, 700 fM, 600 fM, 500 fM, 400 fM, 300 fM, 200 fM, 100 fM, 50 fM, 25 fM, 10 fM, 5 fM, 1 fM, or less.

Certain terms

The terminology used herein is for the purpose of describing particular situations only and is not intended to be limiting. In addition to the understanding of these terms by those skilled in the art, the following terms are discussed to illustrate the meaning of the terms used in this specification. As used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. It is also to be noted that the claims may be drafted to exclude any optional element. Accordingly, this statement is intended to serve as antecedent basis for use of such exclusive terminology as "solely," "only," and the like in connection with the recitation of claim elements, or use of a "negative" limitation.

Certain ranges are given herein wherein a numerical value is preceded by the term "about". The term "about" is used herein to provide literal support for the exact number following it, as well as numbers that are near or similar to the number following the term. In determining whether a number is near or approximate to a specifically recited number, a near or approximate non-recited number can be a number that provides a substantial equivalent of the specifically recited number in the context of the given number. Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the methods and compositions described herein. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the methods and compositions described herein, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the methods and compositions described herein.

The terms "individual", "patient" or "subject" are used interchangeably. None of these terms require or are limited to situations characterized by supervision (e.g., on a continuous or intermittent basis) by a health care worker (e.g., a doctor, a registered nurse, a nurse practitioner, a physician's assistant, a caregiver, or an attending care worker). Furthermore, these terms refer to a human or animal subject.

"Treating" or "treatment" refers to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to prevent or slow down (lessen) the targeted pathological condition or disorder. Subjects in need of treatment include subjects already having the disorder, as well as subjects susceptible to the disorder, or subjects in whom the disorder is to be prevented.

The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that immunospecifically binds an antigen. The term also refers to antibodies consisting of two immunoglobulin heavy chains and two immunoglobulin light chains, as well as forms including full-length antibodies and portions thereof; including, for example, immunoglobulin molecules, polyclonal antibodies, monoclonal antibodies, recombinant antibodies, chimeric antibodies, humanized antibodies, CDR-grafted antibodies, F(ab)2, Fv, scFv, IgGΔCH2, F(ab')2, scFv2CH3, F(ab), VL, VH, scFv4, scFv3, scFv2, dsFv, scFv-Fc, (scFv)2, disulfide-linked Fv, single domain antibodies (dAb), diabodies, multispecific antibodies, bispecific antibodies, anti-idiotypic antibodies, antibodies of any isotype (including but not limited to IgA, IgD, IgE, IgG, or IgM), modified antibodies, and synthetic antibodies (including but not limited to non-depleting IgG antibodies, T-bodies, or other Fc or Fab variants of antibodies).

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the methods and compositions described herein belong. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the methods and compositions described herein, representative exemplary methods and materials are now described.

Examples

Example 1: Generation of SuperHuman Library (SHL) 2.0

The following procedure was used to generate the SuperHuman library:

1. The four best VH frameworks and the four best VK frameworks were selected from the 3500 combinations in the human repertoire (IGHV1-46, IGHV1-69, IGHV3-15 and IGHV3-23 for the heavy chain, and IGKV1-39, IGKV2-28, IGKV3-15 and IGKV4-1 for the light chain) based on a combination of the following: 1) safety previously demonstrated in human mAbs; 2) thermostability; 3) low propensity to aggregate; 4) a single dominant allele in the framework at the amino acid level in all human populations (i.e., not a population-specific drug); 5) different canonical topologies of CDRs; 6) good expression in bacteria and good display on phage.

2. Blood was obtained from 140 subjects.

3. Naive (CD27-/IgM+) cells and memory (CD27+/IgG+) cells were sorted from the blood.

4. Pools were checked for quality using Next Generation Sequencing (NGS), and pools with problematic diversity or biochemical liabilities were discarded.

5. The VH-CDR3 sequences from the naive cells were PCR amplified using universal primers.

6. VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3 sequences from memory cells were PCR amplified using framework-specific primers.

7. The frameworks were defined as synthetically produced germline segments.

8. The nucleic acid library was assembled using overlap-extension PCR (PCR-OE).

9. NGS sequencing was used to check the quality of the assembly from step 8.

10. The light chains were cloned into a vector containing a stuffer VH.

11. In-frame material was selected using protein A or protein L after thermal stress.

12. Heavy chains were cloned into the vector to replace the stuffer VH.

13. The vector generated at the end of step 12 was used to transform the microorganism by electroporation.
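The combinatorial idea behind steps 5 to 8 (CDR1 and CDR2 drawn from memory-cell pools, VH-CDR3 drawn from a naive-cell pool, all placed into a fixed germline framework) can be sketched in silico as follows. This is a simplified illustration, not the laboratory procedure: the class and function names are hypothetical, the framework regions and CDRs are placeholder strings rather than real sequences, and uniform random sampling is an assumption.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Framework:
    """A germline framework split into its four framework regions."""
    name: str
    fr1: str
    fr2: str
    fr3: str
    fr4: str


def assemble_domain(fw: Framework, cdr1: str, cdr2: str, cdr3: str) -> str:
    """Concatenate FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 into one variable domain."""
    return fw.fr1 + cdr1 + fw.fr2 + cdr2 + fw.fr3 + cdr3 + fw.fr4


def theoretical_diversity(*pools: Sequence[str]) -> int:
    """Upper bound on distinct combinations: product of distinct pool sizes."""
    return math.prod(len(set(pool)) for pool in pools)


def sample_vh_library(
    fw: Framework,
    memory_cdr1: Sequence[str],
    memory_cdr2: Sequence[str],
    naive_cdr3: Sequence[str],
    n: int,
    seed: int = 0,
) -> List[str]:
    """Draw n VH domains by picking each CDR independently from its pool."""
    rng = random.Random(seed)
    return [
        assemble_domain(
            fw,
            rng.choice(memory_cdr1),
            rng.choice(memory_cdr2),
            rng.choice(naive_cdr3),
        )
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Placeholder tokens only; these are not real framework or CDR sequences.
    fw = Framework("VH-demo", "[FR1]", "[FR2]", "[FR3]", "[FR4]")
    cdr1_pool, cdr2_pool, cdr3_pool = ["h1a", "h1b"], ["h2a"], ["h3a", "h3b", "h3c"]
    print(theoretical_diversity(cdr1_pool, cdr2_pool, cdr3_pool))
    print(sample_vh_library(fw, cdr1_pool, cdr2_pool, cdr3_pool, n=3))
```

Because each CDR position is sampled independently, the number of possible combinations grows as the product of the pool sizes, which is why shuffling naturally occurring CDRs can yield combinations that do not occur naturally.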

Example 2: Screening for Affinity to PD1

Primary screening was performed on clones from two 96-well plates, picked at random after the fourth round of SuperHuman panning against PD1.

Bypassing the ELISA screen, samples were assayed directly on a Carterra high-throughput kinetics instrument (FIG. 3). Most samples were positive (hits), and 98 out of 184 were unique.

Clones were shown to have affinity for both human and cynomolgus PD1 (FIG. 4 and FIG. 5).

Example 3: bGal ELISA and Sanger Screening of Two Plates of Antibody Clones

Antibody clones from two plates were screened by ELISA against bGal (FIG. 2).

Sanger sequencing was also performed on these clones (FIG. 8). The extreme diversity of the round-three output ensures that a number of hits for any epitope can be recovered by screening a few 96-well plates of clones.

Diversity was found not only in the VH-CDR3 (CDR-H3) sequences, but also in the VH-CDR1 (CDR-H1) and VH-CDR2 (CDR-H2) sequences (FIG. 10).

Example 4: Combinatorial Design and Selection Methods to Generate Antibody Libraries with Diverse VH and VK Sequences

Functional selection for expression and thermostability was applied during construction to generate a library with more than 95% functional diversity among 4000 million light chains. The antibody library was created using 7.6 × 10^10 transformants.

First, a VK (kappa light chain) library was generated by cloning the desired light chains and a temporary stuffer VH sequence into a vector. The VK library was displayed and subjected to thermal stress at temperatures exceeding 65 °C. In-frame material was selected using protein A/L. The stuffer VH sequences in the library resulting from protein A/L selection were then replaced with the target VH sequences (FIG. 17).

Example 5: Generation of SuperHuman Library (SHL) 3.0

The following procedure was used to generate the SuperHuman library:

1. Six antibody frameworks (IGHV1-46, IGHV3-23, IGKV1-39, IGKV2-28, IGKV3-15 and IGKV4-1) were selected based on a combination of the following: 1) safety previously demonstrated in human mAbs; 2) thermostability; 3) low propensity to aggregate; 4) a single dominant allele in the framework at the amino acid level in all human populations (i.e., not a population-specific drug); 5) different canonical topologies of CDRs; 6) good expression in bacteria and good display on phage.

2. Blood was obtained from 50-100 subjects.

3. Naive (CD27-/IgM+ or CD27-/IgD+) cells and memory cells were sorted from the blood.

4. Pools were checked for quality using Next Generation Sequencing (NGS), and pools with problematic diversity or biochemical liabilities were discarded.

5. The VH-CDR3 sequences from the naive cells were PCR amplified using universal primers.

6. Advantageous VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3 sequences were selected and produced by DNA synthesis without bias, based on: (1) observed presence in natural human antibodies, (2) no suboptimal performance observed in SuperHuman 2.0 selections against multiple antigens, (3) no biochemical liabilities (C, exposed M, deamidation sites, acid hydrolysis sites, N-linked glycosylation sites, amber stop codons, opal stop codons, high positive charge), (4) mutations not exceeding a threshold (e.g., no more than 3 amino acid mutations per CDR). In other words, VH-CDR1, VH-CDR2, VL-CDR1, VL-CDR2 and VL-CDR3 sequences were synthesized if they met the following criteria:

a) had no more than 4 amino acid mutations compared to the respective germline CDR for each framework used; and

b) were identified in at least 2 subjects by NGS and were enriched without a fitness disadvantage when evaluated across pools of 55,000 samples against 11 antigens from SuperHuman 2.0 (Example 1), or were not observed in humans but were sufficiently enriched when panning the same SuperHuman 2.0 pool; and

c) did not contain any biochemical liability (N-linked glycosylation, deamidation, acid hydrolysis, positively charged intein cleavage, free cysteine, free methionine, alternative stop codon, cryptic splice site, TEV cleavage site, or an overly positively charged CDR).

7. The frameworks were defined as synthetically produced, 100% germline segments without mutations.

8. The nucleic acid library was assembled using overlap-extension PCR (PCR-OE) or another DNA assembly method.

9. NGS sequencing was used to check the quality of the assembly from step 8.

10. The light chains were cloned into a vector containing a stuffer VH.

11. In-frame material was selected using protein A or protein L after thermal stress.

12. Heavy chains were cloned into the vector to replace the stuffer VH.

13. The vector generated at the end of step 12 was used to transform the microorganism by electroporation.
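The CDR selection criteria in step 6 above lend themselves to a simple computational filter. The sketch below is a minimal illustration under stated assumptions, not the filter actually used: the motif definitions (N-x-S/T with x not proline for N-linked glycosylation, "NG" for deamidation, "DP" for acid hydrolysis), the net-charge cutoff, the function names, and the toy peptide strings are all assumptions introduced here. It covers only the first branch of criterion b) (observed in at least 2 subjects) and omits the nucleotide-level checks (stop codons, splice sites).

```python
import re
from typing import List

# Assumed motif: N, any residue except P, then S or T.
N_GLYCOSYLATION = re.compile(r"N[^P][ST]")


def liabilities(cdr: str, max_net_charge: int = 2) -> List[str]:
    """List the assumed sequence liabilities found in a CDR amino acid string."""
    found: List[str] = []
    if "C" in cdr:
        found.append("free cysteine")
    if "M" in cdr:
        found.append("methionine")
    if N_GLYCOSYLATION.search(cdr):
        found.append("N-linked glycosylation motif")
    if "NG" in cdr:  # assumed deamidation hotspot
        found.append("deamidation motif")
    if "DP" in cdr:  # assumed acid-labile Asp-Pro bond
        found.append("acid hydrolysis motif")
    net_charge = sum(cdr.count(a) for a in "KR") - sum(cdr.count(a) for a in "DE")
    if net_charge > max_net_charge:  # cutoff is an illustrative assumption
        found.append("high positive charge")
    return found


def mutations_from_germline(cdr: str, germline: str) -> int:
    """Count positional differences between equal-length CDR and germline CDR."""
    if len(cdr) != len(germline):
        raise ValueError("CDR and germline CDR must have the same length")
    return sum(a != b for a, b in zip(cdr, germline))


def passes_filter(
    cdr: str,
    germline: str,
    n_subjects: int,
    max_mutations: int = 4,
    min_subjects: int = 2,
) -> bool:
    """Apply criteria a), the first branch of b), and c) to one CDR."""
    return (
        mutations_from_germline(cdr, germline) <= max_mutations
        and n_subjects >= min_subjects
        and not liabilities(cdr)
    )


if __name__ == "__main__":
    # Toy peptide strings for illustration only.
    print(passes_filter("GYTFTSYY", "GYSFTNYY", n_subjects=3))
    print(liabilities("GYNGTSCY"))
```

The default `max_mutations=4` follows criterion a); the "no more than 3" figure in step 6 is given there only as an example threshold, so it is left as a parameter.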

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
