Variant nucleic acid libraries for antibody optimization

Document No.: 1957764  Publication date: 2021-12-10

Reading note: This technology, Variant nucleic acid libraries for antibody optimization, was designed by Aaron Sato (亚伦·萨托) on 2020-02-26. Summary: Provided herein are methods and compositions related to libraries for optimizing antibodies, the libraries having nucleic acids encoding antibodies comprising modified sequences. The libraries described herein include variegated libraries comprising nucleic acids that each encode a predetermined variant of at least one predetermined reference nucleic acid sequence. Further described herein are protein libraries generated when the nucleic acid libraries are translated. Further described herein are cell libraries expressing the variegated nucleic acid libraries described herein.

1. A nucleic acid library, comprising:

a plurality of sequences comprising nucleic acids that when translated encode an antibody or antibody fragment, wherein each sequence comprises a predetermined number of mutations within a CDR relative to an input sequence of the antibody;

wherein the library comprises at least 50,000 variant sequences, each variant sequence being present in an amount within 1.5 times the average frequency;

and wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 2.5 times the binding affinity of the input sequence.

2. The nucleic acid library of claim 1, wherein the library comprises at least 100,000 variant sequences.

3. The nucleic acid library of claim 1, wherein at least some of the sequences encode antibody light chains.

4. The nucleic acid library of claim 1, wherein at least some of the sequences encode an antibody heavy chain.

5. The nucleic acid library of claim 1, wherein each sequence in the plurality of sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to the input sequence.

6. The nucleic acid library of claim 1, wherein each sequence of the plurality of sequences comprises at least two mutations in a CDR of a heavy chain or a light chain relative to the input sequence.

7. The nucleic acid library of claim 1, wherein at least one of said mutations is present in at least two individuals.

8. The nucleic acid library of claim 1, wherein at least one of said mutations is present in at least three individuals.

9. The nucleic acid library of claim 1, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 5-fold greater than the binding affinity of the input sequence.

10. The nucleic acid library of claim 1, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 25-fold greater than the binding affinity of the input sequence.

11. The nucleic acid library of claim 1, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 50-fold greater than the binding affinity of the input sequence.

12. The nucleic acid library of claim 1, wherein each sequence of the plurality of sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to a germline sequence of the input sequence.

13. The nucleic acid library of any one of claims 1-12, wherein the CDRs are CDR1, CDR2 and CDR3 on a heavy chain.

14. The nucleic acid library of any one of claims 1-13, wherein the CDRs are CDR1, CDR2 and CDR3 on a light chain.

15. The nucleic acid library of claim 1, wherein the at least one sequence, when translated, encodes an antibody or antibody fragment that binds with at least 70-fold greater affinity than the input sequence.

16. The nucleic acid library of claim 1, wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 50 nM.

17. The nucleic acid library of claim 1, wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 25 nM.

18. The nucleic acid library of claim 1, wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 10 nM.

19. The nucleic acid library of claim 1, wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 5 nM.

20. The nucleic acid library of claim 1, wherein the library comprises CDR sequences of any one of SEQ ID NOs 1-6 or 9-70.

21. The nucleic acid library of claim 1, wherein the library comprises a CDRH1, CDRH2 or CDRH3 sequence of any one of SEQ ID NOs 1-6 or 9-70.

22. The nucleic acid library of claim 1, wherein the library comprises at least one sequence encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 10 nM.

23. The nucleic acid library of claim 1, wherein the library comprises at least one sequence encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 5 nM.

24. The nucleic acid library of claim 1, wherein the library comprises at least five sequences each encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 10 nM.

25. The nucleic acid library of claim 1, wherein the library comprises at least 100,000 variant sequences.

26. An antibody, wherein the antibody comprises the sequence of any one of SEQ ID NOs 1-6 or 9-70.

27. An antibody, wherein the antibody comprises the sequence of any one of SEQ ID NOs 1-6 or 9-34; and wherein the antibody is a monoclonal antibody, a polyclonal antibody, a bispecific antibody, a multispecific antibody, a grafted antibody, a human antibody, a humanized antibody, a synthetic antibody, a chimeric antibody, a camelized antibody, a single chain Fv (scFv), a single chain antibody, a Fab fragment, a F(ab')2 fragment, a Fd fragment, a Fv fragment, a single domain antibody, an isolated Complementarity Determining Region (CDR), a diabody, a fragment consisting of only a single monomeric variable domain, a disulfide-linked Fv (sdFv), an intrabody, an anti-idiotypic (anti-Id) antibody, or an antigen-binding fragment thereof.

28. A method of inhibiting PD-1 activity comprising administering the antibody of claim 26 or 27.

29. A method for treating a proliferative disorder comprising administering to a subject in need thereof the antibody of claim 26 or 27.

30. The method of claim 29, wherein the proliferative disorder is cancer.

31. The method of claim 29, wherein the cancer is lung cancer, head and neck squamous cell carcinoma, colorectal cancer, melanoma, liver cancer, classical Hodgkin's lymphoma, kidney cancer, stomach cancer, cervical cancer, Merkel cell carcinoma, B-cell lymphoma, or bladder cancer.

32. A computerized system for antibody optimization, comprising:

(a) a general-purpose computer; and

(b) a computer-readable medium containing functional modules comprising instructions for the general purpose computer, wherein the computerized system is configured to operate in a method comprising:

(i) receiving an operational instruction, wherein the operational instruction comprises a plurality of sequences encoding an antibody or antibody fragment;

(ii) generating a nucleic acid library comprising the plurality of sequences comprising nucleic acids that, when translated, encode an antibody or antibody fragment, wherein each sequence comprises a predetermined number of mutations within CDRs relative to an input sequence of the antibody; wherein the library comprises at least 50,000 variant sequences, each variant sequence being present in an amount within 1.5 times the average frequency; and wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 2.5 times the binding affinity of the input sequence; and

(iii) synthesizing the at least 50,000 variant sequences.

33. The system of claim 32, wherein the nucleic acid library comprises at least 100,000 sequences.

34. The system of claim 32, wherein the system further comprises enriching a subset of the variant sequences.

35. The system of claim 32, wherein the system further comprises expressing an antibody or antibody fragment corresponding to the variant sequence.

36. The system of claim 32, wherein the polynucleotide sequence is a murine, human, or chimeric antibody sequence.

37. The system of claim 32, wherein each sequence of the plurality of variant sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to the input sequence.

38. The system of claim 32, wherein each sequence of the plurality of variant sequences comprises at least two mutations in a CDR of a heavy chain or a light chain relative to the input sequence.

39. The system of claim 32, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 5-fold greater than the binding affinity of the input sequence.

40. The system of claim 32, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 25-fold greater than the binding affinity of the input sequence.

41. The system of claim 32, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 50-fold greater than the binding affinity of the input sequence.

42. The system of claim 32, wherein each sequence of the plurality of variant sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to a germline sequence of the input sequence.

43. The system of any one of claims 32-42, wherein the CDRs are CDR1, CDR2, and CDR3 on a heavy chain.

44. The system of any one of claims 32-43, wherein the CDRs are CDR1, CDR2, and CDR3 on a light chain.

45. The system of claim 32, wherein the antibody library has a theoretical diversity of at least 10^12 sequences.

46. The system of claim 32, wherein the antibody library has a theoretical diversity of at least 10^13 sequences.

47. A method of optimizing an antibody, comprising:

(a) providing a plurality of polynucleotide sequences encoding an antibody or antibody fragment;

(b) generating a nucleic acid library comprising the plurality of sequences comprising nucleic acids that, when translated, encode an antibody or antibody fragment, wherein each sequence comprises a predetermined number of mutations within CDRs relative to an input sequence of the antibody; wherein the library comprises at least 50,000 variant sequences, each variant sequence being present in an amount within 1.5 times the average frequency; and wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 2.5 times the binding affinity of the input sequence; and

(c) synthesizing the at least 50,000 variant sequences.

48. The method of claim 47, wherein the antibody library comprises at least 100,000 sequences.

49. The method of claim 47, wherein the method further comprises enriching a subset of the variant sequences.

50. The method of claim 47, wherein the method further comprises expressing an antibody or antibody fragment corresponding to the variant sequence.

51. The method of claim 47, wherein the polynucleotide sequence is a murine, human, or chimeric antibody sequence.

52. The method of claim 47, wherein each sequence of the plurality of variant sequences comprises at least one mutation in each CDR of a heavy chain or a light chain relative to the input sequence.

53. The method of claim 47, wherein each sequence of the plurality of variant sequences comprises at least two mutations in each CDR of a heavy chain or a light chain relative to the input sequence.

54. The method of claim 47, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 5-fold greater than the binding affinity of the input sequence.

55. The method of claim 47, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 25-fold greater than the binding affinity of the input sequence.

56. The method of claim 47, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 50-fold greater than the binding affinity of the input sequence.

57. The method of claim 47, wherein each sequence comprises at least one mutation in each CDR of the heavy or light chain relative to the germline sequence of the input sequence.

58. The method of claim 47, wherein the nucleic acid library has a theoretical diversity of at least 10^12 sequences.

59. The method of claim 47, wherein the nucleic acid library has a theoretical diversity of at least 10^13 sequences.

Background

Antibodies have the ability to bind biological targets with high specificity and affinity. However, the design of therapeutic antibodies is challenging due to the balance of immune effects and efficacy. Therefore, there is a need to develop compositions and methods for optimizing antibody properties.

Incorporation by reference

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

Disclosure of Invention

Provided herein are methods, compositions, and systems for optimizing antibodies.

Provided herein are nucleic acid libraries comprising a plurality of sequences encoding antibodies or antibody fragments, wherein each sequence of the plurality of sequences comprises a predetermined number of mutations relative to an input sequence; the library comprises at least 5,000 variant sequences, wherein each of the at least 5,000 variant sequences is present in an amount that is no more than 50% of the amount of any other variant sequence in the library; and at least one sequence encodes an antibody or antibody fragment having a higher binding affinity than the input sequence. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 50,000 variant sequences. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 100,000 variant sequences. Further provided herein are nucleic acid libraries wherein at least some of the sequences encode an antibody light chain. Further provided herein are nucleic acid libraries, wherein at least some of the sequences encode an antibody heavy chain. Further provided herein are nucleic acid libraries, wherein each sequence of the plurality of sequences comprises at least one mutation in each CDR of the heavy or light chain relative to the input sequence. Further provided herein are nucleic acid libraries, wherein each sequence of the plurality of sequences comprises at least two mutations in each CDR of the heavy or light chain relative to the input sequence. Further provided herein are nucleic acid libraries, wherein at least one of the mutations is present in at least two individuals. Further provided herein are nucleic acid libraries, wherein at least one of the mutations is present in at least three individuals. Further provided herein are nucleic acid libraries, wherein each sequence of the plurality of sequences comprises at least one mutation in each CDR of the heavy or light chain relative to the germline sequence of the input sequence.

Provided herein are antibodies, wherein the antibodies comprise CDR-H3, which CDR-H3 comprises the sequence of any one of SEQ ID NOs 1-35. Provided herein are antibodies, wherein the antibodies comprise CDR-H3, which CDR-H3 comprises the sequence of any one of SEQ ID NOs 1-35; and wherein the antibody is a monoclonal antibody, a polyclonal antibody, a bispecific antibody, a multispecific antibody, a grafted antibody, a human antibody, a humanized antibody, a synthetic antibody, a chimeric antibody, a camelized antibody, a single chain Fv (scFv), a single chain antibody, a Fab fragment, a F(ab')2 fragment, a Fd fragment, a Fv fragment, a single domain antibody, an isolated Complementarity Determining Region (CDR), a diabody, a fragment consisting of only a single monomeric variable domain, a disulfide-linked Fv (sdFv), an intrabody, an anti-idiotypic (anti-Id) antibody, or an antigen-binding fragment thereof.

Provided herein are methods of inhibiting PD-1 activity comprising administering an antibody as described herein. Provided herein are methods of treating a proliferative disorder comprising administering to a subject in need thereof an antibody as described herein. Further provided herein are methods of treating a proliferative disorder, wherein the proliferative disorder is cancer. Further provided herein are methods of treating a proliferative disorder, wherein the cancer is lung cancer, head and neck squamous cell carcinoma, colorectal cancer, melanoma, liver cancer, classical Hodgkin's lymphoma, kidney cancer, stomach cancer, cervical cancer, Merkel cell carcinoma, B-cell lymphoma, or bladder cancer.

Provided herein is a nucleic acid library comprising a plurality of nucleic acids, wherein each nucleic acid in the plurality encodes a sequence that, when translated, encodes an antibody, wherein the antibody comprises a CDR-H3 loop comprising a PD-1 binding domain, and wherein each nucleic acid in the plurality comprises a sequence encoding a sequence variant of the PD-1 binding domain. Further provided herein are nucleic acid libraries, wherein the CDR-H3 loop, when translated, is about 20 to about 80 amino acids in length. Further provided herein are nucleic acid libraries, wherein the CDR-H3 loop is about 80 to about 230 base pairs in length. Further provided herein are nucleic acid libraries, wherein the antibodies further comprise one or more domains selected from the group consisting of a light chain variable domain (VL), a heavy chain variable domain (VH), a light chain constant domain (CL), and a heavy chain constant domain (CH). Further provided herein are nucleic acid libraries, wherein the VH domain is about 90 to about 100 amino acids in length. Further provided herein are nucleic acid libraries, wherein the VL domain is about 90 to about 120 amino acids in length. Further provided herein are nucleic acid libraries, wherein the VH domain is about 280 to about 300 base pairs in length. Further provided herein are nucleic acid libraries, wherein the VL domain is about 300 to about 350 base pairs in length. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 10^10 non-identical nucleic acids. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 10^12 non-identical nucleic acids. Further provided herein are nucleic acid libraries, wherein the antibodies comprise a single immunoglobulin domain. Further provided herein are nucleic acid libraries, wherein the antibodies comprise peptides of up to 100 amino acids. Further provided herein are nucleic acid libraries, wherein the PD-1 binding domain comprises a peptidomimetic or a small molecule mimetic.

Provided herein is a protein library comprising a plurality of proteins, wherein each protein in the plurality of proteins comprises an antibody, wherein the antibody comprises a CDR-H3 loop comprising a sequence variant of a PD-1 binding domain. Further provided herein are protein libraries comprising a plurality of proteins, wherein the CDR-H3 loop is about 20 to about 80 amino acids in length. Further provided herein is a protein library comprising a plurality of proteins, wherein the antibody further comprises one or more domains selected from the group consisting of a light chain variable domain (VL), a heavy chain variable domain (VH), a light chain constant domain (CL), and a heavy chain constant domain (CH). Further provided herein are protein libraries comprising a plurality of proteins, wherein the VH domain is about 90 to about 100 amino acids in length. Further provided herein are protein libraries comprising a plurality of proteins, wherein the VL domain is about 90 to about 120 amino acids in length. Further provided herein are protein libraries comprising a plurality of proteins, wherein the plurality of proteins are used to generate a peptidomimetic library. Further provided herein is a protein library comprising a plurality of proteins, wherein the protein library comprises antibodies.

Provided herein is a protein library comprising a plurality of proteins, wherein each protein in the plurality of proteins comprises a sequence encoding a different PD-1 binding domain, and wherein each PD-1 binding domain is about 20 to about 80 amino acids in length. Further provided herein is a protein library comprising a plurality of proteins, wherein the protein library comprises peptides. Further provided herein is a protein library comprising a plurality of proteins, wherein the protein library comprises immunoglobulins. Further provided herein is a protein library comprising a plurality of proteins, wherein the protein library comprises antibodies. Further provided herein are protein libraries comprising a plurality of proteins, wherein the plurality of proteins are used to generate a peptidomimetic library.

Provided herein are vector libraries comprising a nucleic acid library as described herein. Provided herein are cell libraries comprising a nucleic acid library as described herein. Provided herein are cell libraries comprising a protein library as described herein.

Provided herein are nucleic acid libraries comprising a plurality of sequences encoding antibodies or antibody fragments, wherein each sequence of the plurality of sequences comprises a predetermined number of mutations relative to an input sequence; the library comprises at least 30,000 variant sequences; and at least some of the antibodies or antibody fragments bind to PD-1 with a KD of less than 50 nM. Further provided herein is a nucleic acid library, wherein the library comprises the CDR sequences of any one of SEQ ID NOs 1-35. Further provided herein is a nucleic acid library, wherein the library comprises the CDRH1, CDRH2, or CDRH3 sequences of any one of SEQ ID NOs 1-35. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least one sequence encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 10 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least one sequence encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 5 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least five sequences each encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 10 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 50,000 variant sequences. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 100,000 variant sequences.
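The KD thresholds recited above (less than 50 nM, 10 nM, and 5 nM) can be illustrated with a minimal Python sketch that filters measured variants by dissociation constant. The `Variant` class, the `binders_below` helper, and all KD values are hypothetical and exist only for illustration; they are not data or code from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    kd_nm: float  # equilibrium dissociation constant in nM (hypothetical value)


def binders_below(variants, kd_threshold_nm):
    """Return the variants whose KD is strictly below the threshold.

    A lower KD means tighter binding, so "KD of less than 10 nM"
    selects the stronger binders.
    """
    return [v for v in variants if v.kd_nm < kd_threshold_nm]


# Made-up measurements, for illustration only.
measured = [
    Variant("v1", 3.2),
    Variant("v2", 8.9),
    Variant("v3", 42.0),
    Variant("v4", 120.0),
]

sub_50 = binders_below(measured, 50.0)
sub_10 = binders_below(measured, 10.0)
sub_5 = binders_below(measured, 5.0)
```

With these toy values, three variants fall below 50 nM, two below 10 nM, and one below 5 nM.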

Provided herein is a computerized system for antibody optimization, comprising: (a) a general-purpose computer; and (b) a computer-readable medium containing functional modules comprising instructions for the general-purpose computer, wherein the computerized system is configured to operate in a method comprising: (i) receiving an operational instruction, wherein the operational instruction comprises a polynucleotide sequence encoding an antibody or antibody fragment; (ii) generating an antibody library, wherein the antibody library comprises a plurality of variant sequences of the polynucleotide sequence; and (iii) synthesizing the plurality of variant sequences. Further provided herein is a computerized system for antibody optimization, wherein the antibody library comprises at least 30,000 sequences. Further provided herein is a computerized system for antibody optimization, wherein the antibody library comprises at least 50,000 sequences. Further provided herein is a computerized system for antibody optimization, wherein the antibody library comprises at least 100,000 sequences. Further provided herein is a computerized system for antibody optimization, wherein the system further comprises enriching a subset of the variant sequences. Further provided herein are computerized systems for antibody optimization, wherein the systems further comprise expressing antibodies or antibody fragments corresponding to the variant sequences. Further provided herein are computerized systems for antibody optimization, wherein the polynucleotide sequences are murine, human, or chimeric antibody sequences. Further provided herein is a computerized system for antibody optimization, wherein the antibody library comprises variant sequences that are each present in an amount that is no more than 50% of the amount of any other variant sequence in the antibody library. Further provided herein is a computerized system for antibody optimization, wherein each sequence of the plurality of variant sequences comprises at least one mutation in each CDR of the heavy chain or the light chain relative to the input sequence. Further provided herein are computerized systems for antibody optimization, wherein each sequence of the plurality of variant sequences comprises at least two mutations in each CDR of the heavy or light chain relative to the input sequence. Further provided herein is a computerized system for antibody optimization, wherein each sequence of the plurality of variant sequences comprises at least one mutation in each CDR of the heavy or light chain relative to the germline sequence of the input sequence. Further provided herein is a computerized system for antibody optimization, wherein the antibody library has a theoretical diversity of at least 10^12 sequences. Further provided herein is a computerized system for antibody optimization, wherein the antibody library has a theoretical diversity of at least 10^13 sequences.
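To give a sense of scale for a theoretical diversity of 10^12 or more, the sketch below counts amino-acid variants under one simple combinatorial model. The model is an assumption, not a definition taken from this document: each CDR carries exactly a fixed number of substitutions, any of the 19 other standard amino acids is allowed at a mutated position, and the CDRs vary independently. The function names and the CDR lengths are hypothetical.

```python
from math import comb, prod

AMINO_ACID_ALPHABET_SIZE = 20  # standard amino acids


def variants_per_cdr(cdr_length, n_mutations):
    """Count sequences that differ from the input at exactly n_mutations positions.

    Choose which positions mutate, then pick one of the 19 non-input
    residues at each chosen position.
    """
    if not 0 <= n_mutations <= cdr_length:
        raise ValueError("n_mutations must be between 0 and cdr_length")
    return comb(cdr_length, n_mutations) * (AMINO_ACID_ALPHABET_SIZE - 1) ** n_mutations


def theoretical_diversity(cdr_lengths, n_mutations):
    """Multiply per-CDR counts, assuming the CDRs are varied independently."""
    return prod(variants_per_cdr(length, n_mutations) for length in cdr_lengths)


# Hypothetical CDR lengths (in residues) for CDR1, CDR2, and CDR3 of one chain.
example_lengths = (8, 8, 12)
example_diversity = theoretical_diversity(example_lengths, n_mutations=2)
```

Under these assumptions, two substitutions in each of three CDRs of lengths 8, 8, and 12 already give a count above 10^12, far more than the 50,000 to 100,000 variants that would actually be synthesized.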

Provided herein are methods of optimizing an antibody comprising: (a) providing a polynucleotide sequence encoding an antibody or antibody fragment; (b) generating an antibody library, wherein the antibody library comprises a plurality of variant sequences of the polynucleotide sequence; and (c) synthesizing the plurality of variant sequences. Further provided herein are methods of optimizing antibodies, wherein the antibody library comprises at least 30,000 sequences. Further provided herein are methods of optimizing antibodies, wherein the antibody library comprises at least 50,000 sequences. Further provided herein are methods of optimizing antibodies, wherein the antibody library comprises at least 100,000 sequences. Further provided herein are methods of optimizing antibodies, wherein the methods further comprise enriching a subset of the variant sequences. Further provided herein are methods of optimizing an antibody, wherein the methods further comprise expressing an antibody or antibody fragment corresponding to the variant sequence. Further provided herein are methods of optimizing an antibody, wherein the polynucleotide sequence is a murine, human, or chimeric antibody sequence. Further provided herein are methods of optimizing antibodies, wherein the antibody library comprises variant sequences that are each present in an amount that is no more than 50% of the amount of any other variant sequence in the antibody library. Further provided herein are methods of optimizing an antibody, wherein each sequence of the plurality of variant sequences comprises at least one mutation in each CDR of the heavy or light chain relative to the input sequence. Further provided herein are methods of optimizing an antibody, wherein each sequence of the plurality of variant sequences comprises at least two mutations in each CDR of the heavy or light chain relative to the input sequence. Further provided herein are methods of optimizing an antibody, wherein each sequence comprises at least one mutation in each CDR of the heavy or light chain relative to the germline sequence of the input sequence. Further provided herein are methods of optimizing antibodies, wherein the antibody library has a theoretical diversity of at least 10^12 sequences. Further provided herein are methods of optimizing antibodies, wherein the antibody library has a theoretical diversity of at least 10^13 sequences.
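Step (b), generating variant sequences that carry a predetermined number of mutations within a CDR, could be sketched as follows. This is a simplified illustration rather than the method described here: it works on amino-acid strings instead of nucleic acids, enumerates every possible substitution exhaustively, and uses a made-up toy sequence with made-up CDR boundaries. The `cdr_variants` function and its parameters are hypothetical names.

```python
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def cdr_variants(sequence, cdr_start, cdr_end, n_mutations):
    """Yield every sequence with exactly n_mutations substitutions in the CDR.

    The CDR is the half-open index range [cdr_start, cdr_end); residues
    outside that range are left unchanged.
    """
    for positions in combinations(range(cdr_start, cdr_end), n_mutations):
        # At each chosen position, allow every residue except the input one.
        alternatives = [
            [aa for aa in AMINO_ACIDS if aa != sequence[p]] for p in positions
        ]
        for replacement in product(*alternatives):
            residues = list(sequence)
            for p, aa in zip(positions, replacement):
                residues[p] = aa
            yield "".join(residues)


# Toy input: the three residues at indices 2..4 ("GYT") play the role of a CDR.
TOY_INPUT = "AAGYTFWW"
single_mutants = list(cdr_variants(TOY_INPUT, 2, 5, 1))
double_mutants = list(cdr_variants(TOY_INPUT, 2, 5, 2))
```

Exhaustive enumeration is only practical for short regions and small mutation counts. A real design would select a defined subset of the possible variants for synthesis.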

Provided herein are nucleic acid libraries comprising: a plurality of sequences comprising nucleic acids that when translated encode an antibody or antibody fragment, wherein each sequence comprises a predetermined number of mutations within a CDR relative to an input sequence of the antibody; wherein the library comprises at least 50,000 variant sequences, each variant sequence being present in an amount within 1.5 times the average frequency; and wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 2.5 times the binding affinity of the input sequence. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 100,000 variant sequences. Further provided herein are nucleic acid libraries wherein at least some of the sequences encode an antibody light chain. Further provided herein are nucleic acid libraries wherein at least some of the sequences encode an antibody heavy chain. Further provided herein are nucleic acid libraries, wherein each sequence in the plurality of sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to the input sequence. Further provided herein are nucleic acid libraries, wherein each sequence in the plurality of sequences comprises at least two mutations in the CDRs of the heavy or light chain relative to the input sequence. Further provided herein are nucleic acid libraries, wherein at least one of the mutations is present in at least two individuals. Further provided herein are nucleic acid libraries, wherein at least one of the mutations is present in at least three individuals. Further provided herein are nucleic acid libraries wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 5-fold greater than the binding affinity of the input sequence. Further provided herein are nucleic acid libraries wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 25-fold greater than the binding affinity of the input sequence. Further provided herein are nucleic acid libraries wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 50-fold greater than the binding affinity of the input sequence. Further provided herein are nucleic acid libraries, wherein each sequence of the plurality of sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to the germline sequence of the input sequence. Further provided herein are nucleic acid libraries, wherein the CDRs are CDR1, CDR2, and CDR3 on a heavy chain. Further provided herein are nucleic acid libraries, wherein the CDRs are CDR1, CDR2, and CDR3 on a light chain. Further provided herein are nucleic acid libraries wherein the at least one sequence, when translated, encodes an antibody or antibody fragment that binds with at least 70-fold greater affinity than the input sequence. Further provided herein are nucleic acid libraries wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 50 nM. Further provided herein are nucleic acid libraries wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 25 nM. Further provided herein are nucleic acid libraries wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 10 nM. Further provided herein are nucleic acid libraries wherein the at least one sequence, when translated, encodes an antibody or antibody fragment with a KD of less than 5 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise CDR sequences of any one of SEQ ID NOs 1-6 or 9-70. Further provided herein are nucleic acid libraries comprising the CDRH1, CDRH2 or CDRH3 sequences of any one of SEQ ID NOs 1-6 or 9-70. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least one sequence encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 10 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least one sequence encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 5 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least five sequences each encoding an antibody or antibody fragment that binds to PD-1 with a KD of less than 10 nM. Further provided herein are nucleic acid libraries, wherein the libraries comprise at least 100,000 variant sequences.

Provided herein are antibodies, wherein the antibodies comprise the sequence of any one of SEQ ID NOs 1-6 or 9-70. Provided herein are antibodies, wherein the antibodies comprise a sequence of any one of SEQ ID NOs 1-6 or 9-34; and wherein the antibody is a monoclonal antibody, a polyclonal antibody, a bispecific antibody, a multispecific antibody, a grafted antibody, a human antibody, a humanized antibody, a synthetic antibody, a chimeric antibody, a camelized antibody, a single chain Fv (scFv), a single chain antibody, a Fab fragment, a F(ab')2 fragment, a Fd fragment, a Fv fragment, a single domain antibody, an isolated Complementarity Determining Region (CDR), a diabody, a fragment consisting of only a single monomeric variable domain, a disulfide-linked Fv (sdFv), an intrabody, an anti-idiotypic (anti-Id) antibody, or an antigen-binding fragment thereof.

Provided herein are methods of inhibiting PD-1 activity comprising administering an antibody described herein. Provided herein are methods of treating a proliferative disorder comprising administering to a subject in need thereof an antibody described herein. Further provided herein are methods, wherein the proliferative disorder is cancer. Further provided herein are methods, wherein the cancer is lung cancer, head and neck squamous cell carcinoma, colorectal cancer, melanoma, liver cancer, classical Hodgkin's lymphoma, kidney cancer, stomach cancer, cervical cancer, Merkel cell carcinoma, B-cell lymphoma, or bladder cancer.

Provided herein is a computerized system for antibody optimization, comprising: (a) a general-purpose computer; and (b) a computer readable medium containing functional modules comprising instructions for the general purpose computer, wherein the computerized system is configured to operate in a method comprising: (i) receiving an operational instruction, wherein the operational instruction comprises a plurality of sequences encoding an antibody or antibody fragment; (ii) generating a nucleic acid library comprising the plurality of sequences comprising nucleic acids that, when translated, encode an antibody or antibody fragment, wherein each sequence comprises a predetermined number of mutations within CDRs relative to an input sequence of the antibody; wherein the library comprises at least 50,000 variant sequences, each variant sequence being present in an amount within 1.5 times the average frequency; and wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 2.5 times the binding affinity of the input sequence; and (iii) synthesizing the at least 50,000 variant sequences. Further provided herein is a computerized system for antibody optimization, wherein the nucleic acid library comprises at least 100,000 sequences. Further provided herein are computerized systems for antibody optimization, wherein the systems further comprise enriching a subset of the variant sequences. Further provided herein are computerized systems for antibody optimization, wherein the systems further comprise expressing antibodies or antibody fragments corresponding to the variant sequences. Further provided herein are computerized systems for antibody optimization, wherein the polynucleotide sequences are murine, human, or chimeric antibody sequences. 
Further provided herein is a computerized system for antibody optimization, wherein each sequence of the plurality of variant sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to the input sequence. Further provided herein is a computerized system for antibody optimization, wherein each sequence of the plurality of variant sequences comprises at least two mutations in the CDRs of the heavy or light chain relative to the input sequence. Further provided herein is a computerized system for antibody optimization, wherein at least one sequence encodes upon translation an antibody or antibody fragment having a binding affinity at least 5-fold that of the input sequence. Further provided herein is a computerized system for antibody optimization, wherein at least one sequence encodes upon translation an antibody or antibody fragment having a binding affinity at least 25-fold that of the input sequence. Further provided herein is a computerized system for antibody optimization, wherein at least one sequence encodes upon translation an antibody or antibody fragment having a binding affinity at least 50-fold that of the input sequence. Further provided herein is a computerized system for antibody optimization, wherein each sequence of the plurality of variant sequences comprises at least one mutation in a CDR of a heavy chain or a light chain relative to a germline sequence of the input sequence. Further provided herein are computerized systems for antibody optimization, wherein the CDRs are CDR1, CDR2, and CDR3 on the heavy chain. Further provided herein are computerized systems for antibody optimization, wherein the CDRs are CDR1, CDR2, and CDR3 on the light chain. Further provided herein is a computerized system for antibody optimization, wherein the antibody library has a theoretical diversity of at least 10^12 sequences. 
Further provided herein is a computerized system for antibody optimization, wherein the antibody library has a theoretical diversity of at least 10^13 sequences.

Provided herein are methods of optimizing an antibody comprising: (a) providing a plurality of polynucleotide sequences encoding an antibody or antibody fragment; (b) generating a nucleic acid library comprising the plurality of sequences comprising nucleic acids that, when translated, encode an antibody or antibody fragment, wherein each sequence comprises a predetermined number of mutations within CDRs relative to an input sequence of the antibody; wherein the library comprises at least 50,000 variant sequences, each variant sequence being present in an amount within 1.5 times the average frequency; and wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 2.5 times the binding affinity of the input sequence; and (c) synthesizing the at least 50,000 variant sequences. Further provided herein are methods of optimizing antibodies, wherein the antibody library comprises at least 100,000 sequences. Further provided herein are methods of optimizing an antibody, wherein the methods further comprise enriching a subset of the variant sequences. Further provided herein are methods of optimizing an antibody, wherein the methods further comprise expressing an antibody or antibody fragment corresponding to the variant sequence. Further provided herein are methods of optimizing an antibody, wherein the polynucleotide sequence is a murine, human, or chimeric antibody sequence. Further provided herein are methods of optimizing an antibody, wherein each sequence of the plurality of variant sequences comprises at least one mutation in each CDR of the heavy or light chain relative to the input sequence. Further provided herein are methods of optimizing an antibody, wherein each sequence of the plurality of variant sequences comprises at least two mutations in each CDR of the heavy or light chain relative to the input sequence. 
Further provided herein are methods of optimizing an antibody, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 5-fold greater than the binding affinity of the input sequence. Further provided herein are methods of optimizing an antibody, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 25-fold greater than the binding affinity of the input sequence. Further provided herein are methods of optimizing an antibody, wherein at least one sequence, when translated, encodes an antibody or antibody fragment that has a binding affinity that is at least 50-fold greater than the binding affinity of the input sequence. Further provided herein are methods of optimizing an antibody, wherein each sequence comprises at least one mutation in each CDR of the heavy or light chain relative to the germline sequence of the input sequence. Further provided herein are methods of optimizing antibodies, wherein the nucleic acid library has a theoretical diversity of at least 10^12 sequences. Further provided herein are methods of optimizing antibodies, wherein the nucleic acid library has a theoretical diversity of at least 10^13 sequences.

Drawings

Figure 1 depicts the workflow of antibody optimization.

Figure 2A depicts a first schematic of an immunoglobulin scaffold.

Figure 2B depicts a second schematic of an immunoglobulin scaffold.

Fig. 3A depicts an exemplary sequence of optimized antibody input sequences in a library, showing the number of mutations in the sequence regions.

Figure 3B shows the workflow of antibody optimization.

Figure 4A depicts the read length of the variable heavy chain after 1-5 rounds of panning.

Figure 4B depicts the clonal frequency of the variable heavy chain after 1-5 rounds of panning.

Figure 4C depicts clonal accumulation of variable heavy chains after 1-5 rounds of panning.

Figure 4D is a graphical representation of mutation numbers versus different panning conditions.

FIG. 4E is a graphical representation of enriched clones for sequence analysis in connection with binding to PD-1.

Figure 5A is a graphical representation of anti-scFv ELISA versus enrichment at the fifth round of panning.

FIG. 5B is a schematic representation of anti-scFv ELISA binding to PD-1.

Figure 6A depicts the sequence of optimized IgG binding to comparator PD-1.

Figure 6B depicts the increased affinity of the optimized antibody (4.5nM) compared to the parent antibody (330 nM).

FIGS. 6C-6D depict the sequence alignment of the CDRs.

FIG. 6E depicts a dose-dependent phage PD-1 ELISA.

Figure 6F depicts an isoaffinity diagram.

FIG. 7A is a graphical representation of PD-1/PD-L1 blocking analysis for optimized IgG.

Figure 7B shows the binding affinity and potency of optimized IgG.

FIG. 7C is a graphical representation of PD-1/PD-L1 blocking IC50 (nM, y-axis) versus SPR monovalent binding affinity (KD (nM), x-axis) for optimized IgG.

Fig. 7D is a graphical representation of BVP scores (y-axis) for several IgGs (x-axis).

FIG. 8 presents a step diagram illustrating an exemplary process workflow for gene synthesis as disclosed herein.

Fig. 9 shows an example of a computer system.

Fig. 10 is a block diagram illustrating an architecture of a computer system.

FIG. 11 is a diagram illustrating a network configured to incorporate multiple computer systems, multiple cellular telephones and personal data assistants, and Network Attached Storage (NAS).

FIG. 12 is a block diagram of a multiprocessor computer system using a shared virtual address memory space.

Detailed Description

Unless otherwise indicated, the present disclosure employs conventional molecular biology techniques within the skill of the art. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.

Definitions

Throughout this disclosure, various embodiments are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of any embodiment. Thus, unless the context clearly dictates otherwise, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range which are to the nearest tenth of the unit of the lower limit. For example, description of a range such as from 1 to 6 should be considered to have explicitly disclosed sub-ranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6, etc., as well as individual values within that range, e.g., 1.1, 2, 2.3, 5, and 5.9. This applies regardless of the breadth of the range. The upper and lower limits of these intermediate ranges may independently be included in the smaller ranges, and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure, unless the context clearly dictates otherwise.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of any embodiment. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

As used herein, the term "about" with respect to a number or range of numbers should be understood to mean the number and +/-10% of the number or value for the range, or 10% below the lower limit to 10% above the upper limit, as specified, unless otherwise indicated or apparent from the context.
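
As a purely illustrative aid, the definition of "about" above can be expressed as a small helper that returns the interval covered by "about X" or "about X to Y". The function name `about` and the restriction to positive quantities are assumptions made for this sketch; they are not part of the disclosure.

```python
def about(value, upper=None, tolerance=0.10):
    """Return the (low, high) interval covered by "about value" or, when
    `upper` is given, by "about value to upper".

    Assumption: quantities are positive, so the lower bound is
    value * (1 - tolerance) and the upper bound is scaled by (1 + tolerance).
    """
    high_anchor = value if upper is None else upper
    return (value * (1 - tolerance), high_anchor * (1 + tolerance))


# "about 50 nM" covers 10% on either side of 50.
low_single, high_single = about(50)

# "about 1 to 6" covers 10% below the lower limit to 10% above the upper limit.
low_range, high_range = about(1, 6)
```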

Unless specifically stated otherwise, the term "nucleic acid" as used herein encompasses double- or triple-stranded nucleic acids as well as single-stranded molecules. In double-stranded or triple-stranded nucleic acids, the nucleic acid strands need not be coextensive (i.e., the double-stranded nucleic acid need not be double-stranded along the entire length of both strands). When provided, nucleic acid sequences are listed in a 5' to 3' orientation unless otherwise indicated. The methods described herein provide for the generation of isolated nucleic acids. The methods described herein additionally provide for the generation of isolated and purified nucleic acids. Reference herein to a "nucleic acid" can include at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000 or more bases in length. Also provided herein are methods of synthesizing any number of nucleotide sequences encoding polypeptide segments, including sequences encoding non-ribosomal peptides (NRPs), sequences encoding non-ribosomal peptide synthetase (NRPS) modules and synthetic variants, polypeptide segments of other modular proteins such as antibodies, and polypeptide segments from other protein families, as well as non-coding DNA or RNA, such as regulatory sequences, e.g., promoters, transcription factors, enhancers, siRNA, shRNA, RNAi, miRNA, small nucleolar RNA derived from microRNA, or any functional or structural DNA or RNA unit of interest. 
The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, intergenic DNA, loci (locus) defined by linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short hairpin RNA (shRNA), micro-RNA (miRNA), small nucleolar RNA, ribozymes, complementary DNA (cDNA), which is a DNA representation of mRNA, typically obtained by reverse transcription of messenger RNA (mRNA) or by amplification; DNA molecules produced synthetically or by amplification, genomic DNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. The cDNA encoding the genes or gene fragments referred to herein may comprise at least one region encoding an exon sequence without intervening intron sequences in the genomic equivalent sequence.

Antibody optimization

Provided herein are methods, compositions, and systems for optimizing antibodies. In some cases, antibodies are optimized by designing in silico libraries comprising variant sequences of the input antibody sequence (FIG. 1). In some cases, the input sequence 100 is modified 102 in silico with one or more mutations to generate a library 103 of optimized sequences. In some cases, such libraries are synthesized, cloned into expression vectors, and the activity of the translation products (antibodies) is evaluated. In some cases, fragments of the sequence are synthesized and subsequently assembled. In some cases, expression vectors, such as those used for phage display, are used to display and enrich desired antibodies. In some cases, the selection pressure used during the enrichment process includes binding affinity, toxicity, immune tolerance, stability, or other factors. Such expression vectors allow for the selection ("panning") of antibodies with specific properties, and the subsequent propagation or amplification of such sequences enriches the library with these sequences. The panning rounds may be repeated any number of times, such as 1, 2, 3, 4, 5, 6, 7 rounds or more than 7 rounds. In some cases, one or more rounds of sequencing are used to identify which sequences 105 are enriched in the library.
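
By way of illustration only, the identification of enriched sequences by sequencing across panning rounds can be sketched as a comparison of per-sequence read frequencies before and after a round. The Python sketch below is an assumption-laden example, not the disclosed method: the function name `fold_enrichment`, the pseudocount smoothing, and the toy CDR3-like strings are hypothetical.

```python
from collections import Counter


def fold_enrichment(reads_before, reads_after, pseudocount=1.0):
    """Return {sequence: frequency_after / frequency_before}.

    A pseudocount is added to every count (an assumption of this sketch)
    so that sequences absent from one round do not cause division by zero.
    """
    before, after = Counter(reads_before), Counter(reads_after)
    sequences = set(before) | set(after)
    total_before = sum(before.values()) + pseudocount * len(sequences)
    total_after = sum(after.values()) + pseudocount * len(sequences)
    return {
        seq: ((after[seq] + pseudocount) / total_after)
        / ((before[seq] + pseudocount) / total_before)
        for seq in sequences
    }


# Toy reads from two consecutive panning rounds (hypothetical sequences).
round_1 = ["ARDYW"] * 5 + ["ARGYW"] * 5
round_2 = ["ARDYW"] * 9 + ["ARGYW"] * 1

scores = fold_enrichment(round_1, round_2)
# Sequences ranked from most to least enriched.
enriched = sorted(scores, key=scores.get, reverse=True)
```

A value above 1 indicates that a sequence gained share of the library during the round; a value below 1 indicates depletion.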

Described herein are methods and systems for in silico library design. For example, an antibody or antibody fragment sequence is used as input. In some cases, any antibody sequence is used as input to the methods and systems described herein. A database 102 comprising known mutations from an organism is queried 101, and a sequence library 103 comprising combinations of these mutations is generated. In some cases, an antibody described herein comprises a CDR region. In some cases, known mutations from the CDRs are used to construct a sequence library. In some cases, filters 104 or exclusion criteria are used to select particular types of variants for members of the sequence library. For example, a sequence having a mutation is added if a minimum number of organisms in the database have that mutation. In some cases, additional CDRs are designated for inclusion in the database. In some cases, a particular mutation or combination of mutations is excluded from the library (e.g., known immunogenic sites, structural sites, etc.). In some cases, a specific site in the input sequence is systematically replaced with histidine, aspartic acid, glutamic acid, or a combination thereof. In some cases, the maximum or minimum number of mutations allowed for each region of the antibody is specified. In some cases, the mutations are described relative to the input sequence or the corresponding germline sequence of the input sequence. For example, the sequence generated by optimization contains at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or more than 16 mutations from the input sequence. In some cases, the sequence generated by the optimization comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or no more than 18 mutations from the input sequence. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or about 18 mutations relative to the input sequence. 
In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the first CDR region. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the second CDR region. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the third CDR region. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the first CDR region of the heavy chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the second CDR region of the heavy chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the third CDR region of the heavy chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the first CDR region of the light chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the second CDR region of the light chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the input sequence in the third CDR region of the light chain. In some cases, the first CDR region is CDR1. In some cases, the second CDR region is CDR2. In some cases, the third CDR region is CDR3. In some cases, the in silico antibody library is synthesized, assembled, and enriched for desired sequences.
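
The in silico design steps described above (querying a database of known mutations, filtering by the number of individuals carrying each mutation, and capping the number of mutations per CDR) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function `design_library`, the record layout `(region, position, residue, n_individuals)`, and the example CDR sequences and mutations are all hypothetical.

```python
from itertools import combinations, product


def design_library(input_cdrs, known_mutations, min_individuals=2, max_per_cdr=2):
    """Enumerate variants of `input_cdrs` built from known mutations.

    input_cdrs: dict mapping a CDR name to its amino acid sequence.
    known_mutations: iterable of (region, position, residue, n_individuals),
        with 0-based positions (hypothetical record layout).
    """
    # Filter step: keep mutations seen in enough individuals that also
    # differ from the input residue at that position.
    allowed = {region: [] for region in input_cdrs}
    for region, pos, residue, n_individuals in known_mutations:
        if n_individuals >= min_individuals and input_cdrs[region][pos] != residue:
            allowed[region].append((pos, residue))

    # Per-CDR variants carrying 0..max_per_cdr substitutions.
    per_region = {}
    for region, seq in input_cdrs.items():
        variants = {seq}
        for k in range(1, max_per_cdr + 1):
            for combo in combinations(allowed[region], k):
                if len({pos for pos, _ in combo}) < k:
                    continue  # skip two substitutions at the same position
                chars = list(seq)
                for pos, residue in combo:
                    chars[pos] = residue
                variants.add("".join(chars))
        per_region[region] = sorted(variants)

    # Combine the per-CDR variants and drop the unmodified input sequence.
    regions = list(input_cdrs)
    library = [
        dict(zip(regions, combo))
        for combo in product(*(per_region[r] for r in regions))
    ]
    return [variant for variant in library if variant != input_cdrs]


# Hypothetical input CDRs and mutation records (not from SEQ ID NOs).
input_cdrs = {"CDRH1": "GFTFS", "CDRH2": "ISGSG"}
known_mutations = [
    ("CDRH1", 1, "Y", 3),
    ("CDRH1", 4, "T", 2),
    ("CDRH2", 0, "V", 1),  # seen in one individual only, so filtered out
    ("CDRH2", 2, "A", 5),
]
library = design_library(input_cdrs, known_mutations)
```

Exhaustive enumeration of this kind is only practical for small examples; for libraries with a theoretical diversity on the order of 10^12 sequences, the combinations would be counted or sampled rather than listed.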

Germline sequences corresponding to the input sequences can also be modified to generate sequences in the library. For example, the sequences generated by the optimization methods described herein comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or more than 16 mutations from germline sequences. In some cases, the sequence generated by the optimization comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or no more than 18 mutations from the germline sequence. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or about 18 mutations relative to the germline sequence.

Provided herein are methods, systems, and compositions for antibody optimization, wherein an input sequence comprises a mutation in an antibody region. Exemplary regions of an antibody include, but are not limited to, Complementarity Determining Regions (CDRs), variable domains, or constant domains. In some cases, the CDR is CDR1, CDR2, or CDR3. In some cases, the CDR is a heavy chain domain, including but not limited to CDR-H1, CDR-H2, and CDR-H3. In some cases, the CDR is a light chain domain, including but not limited to CDR-L1, CDR-L2 and CDR-L3. In some cases, the variable domain is a light chain variable domain (VL) or a heavy chain variable domain (VH). In some cases, the VL domain comprises a kappa or lambda chain. In some cases, the constant domain is a light chain constant domain (CL) or a heavy chain constant domain (CH). In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the first CDR region. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the second CDR region. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the third CDR region. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the first CDR region of the heavy chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the second CDR region of the heavy chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the third CDR region of the heavy chain. 
In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the first CDR region of the light chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the second CDR region of the light chain. In some cases, the sequence generated by the optimization comprises about 1, 2, 3, 4, 5, 6, or 7 mutations from the germline sequence in the third CDR region of the light chain. In some cases, the first CDR region is CDR1. In some cases, the second CDR region is CDR2. In some cases, the third CDR region is CDR3.

Antibody libraries

Provided herein are libraries generated from the antibody optimization methods described herein. The antibodies described herein result in improved functional activity, structural stability, expression, specificity, or a combination thereof.

As used herein, the term antibody will be understood to include proteins having the characteristic two-armed, Y-shaped structure of a typical antibody molecule, as well as one or more fragments of an antibody that retain the ability to specifically bind to an antigen. Exemplary antibodies include, but are not limited to, monoclonal antibodies, polyclonal antibodies, bispecific antibodies, multispecific antibodies, grafted antibodies, human antibodies, humanized antibodies, synthetic antibodies, chimeric antibodies, camelized antibodies, single chain Fvs (scFv) (including fragments wherein VL and VH are linked using recombinant methods through a synthetic or natural linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules, including single chain Fab and scFab), single chain antibodies, Fab fragments (including monovalent fragments comprising VL, VH, CL and CH1 domains), F(ab')2 fragments (including bivalent fragments comprising two Fab fragments linked by a disulfide bond at the hinge region), Fd fragments (including fragments comprising VH and CH1 domains), Fv fragments (including fragments comprising VL and VH domains of a single arm of an antibody), single domain antibodies (dAbs or sdAbs) (including fragments comprising a VH domain), isolated Complementarity Determining Regions (CDRs), diabodies (including fragments comprising bivalent dimers, e.g., two VL and VH domains that bind to each other and recognize two different antigens), fragments consisting of only a single monomeric variable domain, disulfide-linked Fvs (sdFv), intrabodies, anti-idiotypic (anti-Id) antibodies, or antigen-binding fragments thereof. In some cases, the libraries disclosed herein comprise nucleic acids encoding antibodies, wherein the antibodies are Fv antibodies, including Fv antibodies consisting of a minimal antibody fragment comprising an intact antigen recognition and antigen binding site. 
In some embodiments, the Fv antibody consists of a dimer of one heavy chain variable domain and one light chain variable domain in tight, non-covalent association, and the three hypervariable regions of each variable domain interact to define an antigen-binding site on the surface of the VH-VL dimer. In some embodiments, the six hypervariable regions confer antigen-binding specificity to the antibody. In some embodiments, a single variable domain (or half of an Fv comprising only three hypervariable regions specific for an antigen, including single domain antibodies isolated from camelids comprising one heavy chain variable domain, such as VHH antibodies or nanobodies) has the ability to recognize and bind an antigen. In some cases, the libraries disclosed herein comprise nucleic acids encoding an antibody, wherein the antibody is a single chain Fv or scFv, including antibody fragments comprising a VH, a VL, or both VH and VL domains, wherein both domains are present in a single polypeptide chain. In some embodiments, the Fv polypeptide further comprises a polypeptide linker between the VH and VL domains, thereby allowing the scFv to form the desired structure for antigen binding. In some cases, the scFv is linked to an Fc fragment, or the VHH is linked to an Fc fragment (including a minibody). In some cases, the antibodies comprise immunoglobulin molecules and immunologically active fragments of immunoglobulin molecules, e.g., molecules that contain an antigen binding site. Immunoglobulin molecules are of any type (e.g., IgG, IgE, IgM, IgD, IgA, and IgY), class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2), or subclass.

In some embodiments, the library comprises immunoglobulins of a species suitable for the intended therapeutic target. Generally, these methods include "mammalianization," and include methods of transferring donor antigen binding information to less immunogenic mammalian antibody recipients to produce useful therapeutic treatments. In some cases, the mammal is a mouse, rat, horse, sheep, cow, primate (e.g., chimpanzee, baboon, gorilla, orangutan, monkey), dog, cat, pig, donkey, rabbit, or human. In some cases, provided herein are libraries and methods for felinization and caninization of antibodies.

A "humanized" form of a non-human antibody can be a chimeric antibody containing minimal sequences derived from the non-human antibody. Humanized antibodies are typically human antibodies (recipient antibodies) in which residues from one or more CDRs are replaced with residues from one or more CDRs of a non-human antibody (donor antibody). The donor antibody can be any suitable non-human antibody, such as a mouse, rat, rabbit, chicken or non-human primate antibody having the desired specificity, affinity, or biological effect. In some cases, selected framework region residues of the recipient antibody are replaced with corresponding framework region residues from the donor antibody. Humanized antibodies may also comprise residues not found in either the recipient or donor antibody. In some cases, these modifications are made to further improve antibody performance.

"caninization" can include methods of transferring non-canine antigen binding information from a donor antibody to a less immunogenic canine antibody recipient to generate a treatment that can be used as a therapeutic agent in dogs. In some cases, the caninized form of the non-canine antibodies provided herein are chimeric antibodies that contain minimal sequences derived from the non-canine antibodies. In some cases, a caninized antibody is a canine antibody sequence ("recipient" or "acceptor" antibody) in which hypervariable region residues of the recipient are replaced with hypervariable region residues from a non-canine species ("donor" antibody) such as mouse, rat, rabbit, cat, dog, goat, chicken, cow, horse, llama, camel, dromedary, shark, non-human primate, human, humanized, recombinant sequences, or engineered sequences having desired properties. In some cases, Framework Region (FR) residues of a canine antibody are replaced with corresponding non-canine FR residues. In some cases, the caninized antibody includes residues not found in the recipient antibody or the donor antibody. In some cases, these modifications are made to further improve antibody performance. The caninized antibodies can also comprise at least a portion of the immunoglobulin constant region (Fc) of the canine antibody.

"felinization" can include methods of transferring non-feline antigen binding information from a donor antibody to a less immunogenic feline antibody recipient to produce a treatment that can be used as a therapeutic in a cat. In some cases, the felinized form of a non-feline antibody provided herein is a chimeric antibody that contains minimal sequences derived from the non-feline antibody. In some cases, the felinized antibody is a feline antibody sequence (an "acceptor" or "recipient" antibody), wherein hypervariable region residues of the acceptor are replaced with hypervariable region residues from a non-feline species (a "donor" antibody) such as mouse, rat, rabbit, cat, dog, goat, chicken, cow, horse, llama, camel, dromedary, shark, non-human primate, human, humanized, recombinant sequence, or engineered sequence with desired properties. In some cases, Framework Region (FR) residues of the feline antibody are replaced with corresponding non-feline FR residues. In some cases, the felinized antibody includes residues not found in the recipient antibody or the donor antibody. In some cases, these modifications are made to further improve antibody performance. The felinized antibody may further comprise at least a portion of an immunoglobulin constant region (Fc) of the feline antibody.

The methods described herein can be used to optimize libraries encoding non-immunoglobulins. In some cases, the library comprises antibody mimetics. Exemplary antibody mimetics include, but are not limited to, anticalins, affilins, affibody molecules, affimers, affitins, alphabodies, avimers, atrimers, DARPins, fynomers, Kunitz domain-based proteins, monobodies, knottins, armadillo repeat protein-based proteins, and bicyclic peptides.

A library described herein comprising nucleic acids encoding an antibody comprises variations in at least one region of the antibody. Exemplary regions for the variant antibody include, but are not limited to, Complementarity Determining Regions (CDRs), variable domains, or constant domains. In some cases, the CDR is CDR1, CDR2, or CDR3. In some cases, the CDR is a heavy chain domain, including but not limited to CDR-H1, CDR-H2, and CDR-H3. In some cases, the CDR is a light chain domain, including but not limited to CDR-L1, CDR-L2, and CDR-L3. In some cases, the variable domain is a light chain variable domain (VL) or a heavy chain variable domain (VH). In some cases, the VL domain comprises a kappa or lambda chain. In some cases, the constant domain is a light chain constant domain (CL) or a heavy chain constant domain (CH).

The methods described herein provide for synthesizing a library comprising antibody-encoding nucleic acids, wherein each nucleic acid encodes a predetermined variant of at least one predetermined reference nucleic acid sequence. In some cases, the predetermined reference sequence is a nucleic acid sequence encoding a protein, and the library of variants comprises sequences encoding variations of at least a single codon, such that a plurality of different variants of a single residue in a subsequent protein encoded by the synthetic nucleic acid are generated by standard translation processes. In some cases, the antibody library comprises variant nucleic acids that collectively encode variations at multiple positions. In some cases, the library of variants comprises sequences encoding variations of at least a single codon of a CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2, CDR-L3, VL, or VH domain. In some cases, the library of variants comprises sequences encoding variations of multiple codons of the CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2, CDR-L3, VL, or VH domains. In some cases, the library of variants comprises sequences encoding variations of a plurality of codons of framework element 1 (FW1), framework element 2 (FW2), framework element 3 (FW3), or framework element 4 (FW4). Exemplary numbers of codons for variation include, but are not limited to, at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons.
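As an illustration of a library in which each member is a predetermined single-codon variant of a reference sequence, the following is a minimal Python sketch. It is not the synthesis method of this disclosure. The function names, the three-codon reference sequence, and the rule of using one arbitrary codon per amino acid are assumptions made for the example; an actual design would likely choose codons according to the expression host.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: one arbitrary codon per amino acid (the first in table order).
ONE_CODON = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        ONE_CODON.setdefault(aa, codon)


def translate(dna):
    """Translate an in-frame DNA string into a one-letter protein string."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def single_codon_variants(reference, codon_positions):
    """Enumerate every variant that replaces exactly one codon of the reference.

    codon_positions are zero-based codon indices (hypothetical convention).
    Each listed position is changed to every amino acid other than the one
    the reference already encodes there.
    """
    variants = []
    for pos in codon_positions:
        start = 3 * pos
        ref_aa = CODON_TABLE[reference[start:start + 3]]
        for aa, codon in sorted(ONE_CODON.items()):
            if aa == ref_aa:
                continue
            variants.append(reference[:start] + codon + reference[start + 3:])
    return variants


# Hypothetical three-codon reference encoding Gly-Phe-Thr; vary the middle codon.
reference = "GGTTTTACC"
library = single_codon_variants(reference, [1])
proteins = [translate(v) for v in library]
```

Because every member is enumerated explicitly, the content of such a library is known before synthesis, which is the sense of "predetermined" used above.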

In some cases, at least one region of the antibody used for the variation is from the heavy chain V gene family, the heavy chain D gene family, the heavy chain J gene family, the light chain V gene family, or the light chain J gene family. In some cases, the light chain V gene family comprises immunoglobulin kappa (IGK) genes or immunoglobulin lambda (IGL) genes.

Provided herein are libraries comprising nucleic acids encoding antibodies, wherein the libraries are synthesized with various numbers of fragments. In some cases, the fragment comprises a CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2, CDR-L3, VL, or VH domain. In some cases, the fragment comprises framework element 1 (FW1), framework element 2 (FW2), framework element 3 (FW3), or framework element 4 (FW4). In some cases, the antibody library is synthesized with at least or about 2 fragments, 3 fragments, 4 fragments, 5 fragments, or more than 5 fragments. The length of each nucleic acid fragment or the average length of the synthesized nucleic acids can be at least or about 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600, or more than 600 base pairs. In some cases, the length is about 50 to 600, 75 to 575, 100 to 550, 125 to 525, 150 to 500, 175 to 475, 200 to 450, 225 to 425, 250 to 400, 275 to 375, or 300 to 350 base pairs.

When translated, a library comprising nucleic acids encoding antibodies as described herein comprises amino acids of various lengths. In some cases, the length of each amino acid fragment or the average length of the synthetic amino acids can be at least or about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, or more than 150 amino acids. In some cases, the amino acid is about 15 to 150, 20 to 145, 25 to 140, 30 to 135, 35 to 130, 40 to 125, 45 to 120, 50 to 115, 55 to 110, 60 to 110, 65 to 105, 70 to 100, or 75 to 95 amino acids in length. In some cases, the amino acid is about 22 amino acids to about 75 amino acids in length. In some cases, the antibody comprises at least or about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, or more than 5000 amino acids.

A number of variant sequences are synthesized de novo for at least one region of the antibody used for the variation using the methods described herein. In some cases, a number of variant sequences are synthesized de novo for CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2, CDR-L3, VL, VH, or a combination thereof. In some cases, a number of variant sequences are synthesized de novo for framework element 1 (FW1), framework element 2 (FW2), framework element 3 (FW3), or framework element 4 (FW4). See FIG. 2A. The number of variant sequences can be at least or about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, or more than 500 sequences. In some cases, the number of variant sequences is at least or about 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, or more than 8000 sequences. In some cases, the number of variant sequences is about 10 to 500, 25 to 475, 50 to 450, 75 to 425, 100 to 400, 125 to 375, 150 to 350, 175 to 325, 200 to 300, 225 to 375, 250 to 350, or 275 to 325 sequences.

In some cases, the variant sequences for at least one region of the antibody differ in length or sequence. In some cases, the at least one region synthesized de novo is for CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2, CDR-L3, VL, VH, or a combination thereof. In some cases, the at least one region that is synthesized de novo is for framework element 1 (FW1), framework element 2 (FW2), framework element 3 (FW3), or framework element 4 (FW4). In some cases, the variant sequence comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more than 50 variant nucleotides or amino acids as compared to the wild type. In some cases, the variant sequence comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 additional nucleotides or amino acids as compared to the wild type. In some cases, the variant sequence comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 fewer nucleotides or amino acids than the wild type. In some cases, the library comprises at least or about 10^1, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, or more than 10^10 variants.

After synthesis of the antibody library, the antibody library can be used for screening and analysis. For example, antibody libraries can be analyzed for displayability and by panning. In some cases, displayability is analyzed using a selectable tag. Exemplary tags include, but are not limited to, radioactive labels, fluorescent labels, enzymes, chemiluminescent labels, colorimetric labels, affinity labels, or other labels or tags known in the art. In some cases, the tag is histidine, polyhistidine, myc, hemagglutinin (HA), or FLAG. In some cases, antibody libraries are analyzed by sequencing using various methods, including but not limited to single molecule real-time (SMRT) sequencing, polony sequencing, sequencing by ligation, reversible terminator sequencing, proton detection sequencing, ion semiconductor sequencing, nanopore sequencing, electronic sequencing, pyrosequencing, Maxam-Gilbert sequencing, chain termination (e.g., Sanger) sequencing, +S sequencing, or sequencing-by-synthesis. In some cases, the antibody library is displayed on the surface of a cell or phage. In some cases, phage display is used to enrich the antibody library for sequences with the desired activity.

In some cases, antibody libraries are analyzed for functional activity, structural stability (e.g., thermostable or pH stable), expression, specificity, or a combination thereof. In some cases, antibody libraries are analyzed for antibodies that are capable of folding. In some cases, antibody regions are assayed for functional activity, structural stability, expression, specificity, folding, or a combination thereof. For example, the VH or VL region is assayed for functional activity, structural stability, expression, specificity, folding, or a combination thereof.

The antibodies or IgG produced by the methods as described herein have improved binding affinity. In some cases, the antibody has a binding affinity (e.g., KD) of less than 1 nM, less than 1.2 nM, less than 2 nM, less than 5 nM, less than 10 nM, less than 11 nM, less than 13.5 nM, less than 15 nM, less than 20 nM, less than 25 nM, or less than 30 nM. In some cases, the antibody has a KD of less than 1 nM. In some cases, the antibody has a KD of less than 1.2 nM. In some cases, the antibody has a KD of less than 2 nM. In some cases, the antibody has a KD of less than 5 nM. In some cases, the antibody has a KD of less than 10 nM. In some cases, the antibody has a KD of less than 13.5 nM. In some cases, the antibody has a KD of less than 15 nM. In some cases, the antibody has a KD of less than 20 nM. In some cases, the antibody has a KD of less than 25 nM. In some cases, the antibody has a KD of less than 30 nM.

In some cases, an antibody or IgG produced by a method as described herein has an improvement in binding affinity of at least or about 1.5x, 2.0x, 5x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, or more than 200x as compared to a comparator antibody. In some cases, an antibody or IgG produced by a method as described herein has an improvement in function of at least or about 1.5x, 2.0x, 5x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, or more than 200x as compared to a comparator antibody. In some cases, the comparator antibody is an antibody with a similar structure, sequence, or antigen target.
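The text does not state how the fold improvement is calculated. One common convention, assumed here, is to take the ratio of the comparator's dissociation constant to that of the variant, since a lower dissociation constant means tighter binding. The Python sketch below uses that assumption; the function name and the example values (13.5 nM and 1.2 nM) are hypothetical and are not measurements from this disclosure.

```python
def fold_improvement(kd_comparator_nm, kd_variant_nm):
    """Return the fold improvement in affinity as KD(comparator) / KD(variant).

    Assumes both dissociation constants are positive and in the same unit
    (nM here). A result above 1 means the variant binds more tightly.
    """
    if kd_comparator_nm <= 0 or kd_variant_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_comparator_nm / kd_variant_nm


# Hypothetical values: comparator at 13.5 nM, optimized variant at 1.2 nM.
example_fold = fold_improvement(13.5, 1.2)

# Check the example against the thresholds listed in the text.
thresholds = [1.5, 2.0, 5, 10, 20]
thresholds_met = [t for t in thresholds if example_fold >= t]
```

Under this convention, a variant meets the "at least 10x" threshold only if its dissociation constant is no more than one tenth that of the comparator.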

Expression system

Provided herein are libraries comprising nucleic acids encoding antibodies comprising binding domains, wherein the libraries have improved specificity, stability, expression, folding, or downstream activity. In some cases, the libraries described herein are used for screening and analysis.

Provided herein are libraries comprising nucleic acids encoding antibodies comprising binding domains, wherein the nucleic acid libraries are used for screening and analysis. In some cases, screening and analysis includes in vitro, in vivo, or ex vivo assays. Cells for screening include primary cells taken from a living subject or cell lines. The cells may be from prokaryotes (e.g., bacteria) or eukaryotes (e.g., fungi, animals, and plants). Exemplary animal cells include, but are not limited to, animal cells from mice, rabbits, primates, and insects. In some cases, the cells used for screening include cell lines including, but not limited to, a Chinese Hamster Ovary (CHO) cell line, a Human Embryonic Kidney (HEK) cell line, or a Baby Hamster Kidney (BHK) cell line. In some cases, the nucleic acid libraries described herein can also be delivered to a multicellular organism. Exemplary multicellular organisms include, but are not limited to, plants, mice, rabbits, primates, and insects.

The nucleic acid libraries described herein can be screened for various pharmacological or pharmacokinetic properties. In some cases, the library is screened using an in vitro assay, an in vivo assay, or an ex vivo assay. For example, the in vitro pharmacological or pharmacokinetic properties screened include, but are not limited to, binding affinity, binding specificity, and binding avidity. Exemplary in vivo pharmacological or pharmacokinetic properties of the libraries described herein that are screened include, but are not limited to, therapeutic efficacy, activity, preclinical toxicity properties, clinical efficacy properties, clinical toxicity properties, immunogenicity, efficacy, and clinical safety properties.

Provided herein are nucleic acid libraries, wherein the nucleic acid libraries can be expressed in vectors. Expression vectors into which the nucleic acid libraries disclosed herein are inserted can include eukaryotic or prokaryotic expression vectors. Exemplary expression vectors include, but are not limited to, mammalian expression vectors: pSF-CMV-NEO-NH2-PPT-3XFLAG, pSF-CMV-NEO-COOH-3XFLAG, pSF-CMV-PURO-NH2-GST-TEV, pSF-OXB20-COOH-TEV-FLAG(R)-6His, pCEP4, pDEST27, pSF-CMV-Ub-KrYFP, pSF-CMV-FMDV-daGFP, pEF1a-mCherry-N1 vector, pEF1a-tdTomato vector, pSF-CMV-FMDV-Hygro, pSF-CMV-PGK-PURO, pMCP-tag(m), and pSF-CMV-PURO-NH2-CMYC; bacterial expression vectors: pSF-OXB20-BetaGal, pSF-OXB20-Fluc, pSF-OXB20, and pSF-Tac; plant expression vectors: pRI 101-AN DNA and pCambia2301; yeast expression vectors: pTYB21 and pKLAC2; and insect vectors: pAc5.1/V5-His A and pDEST8. In some cases, the vector is pcDNA3 or pcDNA3.1.

Described herein are nucleic acid libraries expressed in vectors to generate constructs comprising antibodies. In some cases, the constructs vary in size. In some cases, the construct comprises at least or about 500, 600, 700, 800, 900, 1000, 1100, 1300, 1400, 1500, 1600, 1700, 1800, 2000, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, 4000, 4200, 4400, 4600, 4800, 5000, 6000, 7000, 8000, 9000, 10000, or more than 10000 bases. In some cases, the construct comprises in the range of about 300 to 1,000, 300 to 2,000, 300 to 3,000, 300 to 4,000, 300 to 5,000, 300 to 6,000, 300 to 7,000, 300 to 8,000, 300 to 9,000, 300 to 10,000, 1,000 to 2,000, 1,000 to 3,000, 1,000 to 4,000, 1,000 to 5,000, 1,000 to 6,000, 1,000 to 7,000, 1,000 to 8,000, 1,000 to 9,000, 1,000 to 10,000, 2,000 to 3,000, 2,000 to 4,000, 2,000 to 5,000, 2,000 to 6,000, 2,000 to 7,000, 2,000 to 8,000, 2,000 to 9,000, 2,000 to 10,000, 3,000 to 4,000, 3,000 to 5,000, 3,000 to 6,000, 6,000 to 7,000, 6,000 to 8,000, 6,000 to 10,000, or 8,000 to 10,000 bases.

Provided herein are libraries comprising nucleic acids encoding antibodies, wherein the nucleic acid libraries are expressed in cells. In some cases, the library is synthesized to express a reporter gene. Exemplary reporter genes include, but are not limited to, acetohydroxyacid synthase (AHAS), alkaline phosphatase (AP), β-galactosidase (LacZ), β-glucuronidase (GUS), chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP), red fluorescent protein (RFP), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), cerulean fluorescent protein, citrine fluorescent protein, orange fluorescent protein, cherry fluorescent protein, turquoise fluorescent protein, blue fluorescent protein, horseradish peroxidase (HRP), luciferase (Luc), nopaline synthase (NOS), octopine synthase (OCS), and derivatives thereof. Methods of determining the modulation of a reporter are well known in the art and include, but are not limited to, fluorometry (e.g., fluorescence spectroscopy, fluorescence activated cell sorting (FACS), fluorescence microscopy) and antibiotic resistance determination.

PD-1 library

Provided herein are methods and compositions related to programmed cell death protein 1 (PD-1) binding libraries comprising nucleic acids encoding PD-1 antibodies. In some cases, such methods and compositions are generated by the antibody optimization methods and systems described herein. The antibodies as described herein can stably support a PD-1 binding domain. PD-1 binding domains can be designed based on the surface interaction of PD-1 ligands with PD-1. The libraries as described herein can be further variegated to provide a variant library comprising nucleic acids each encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. Further described herein are protein libraries that can be generated upon translation of the nucleic acid library. In some cases, a nucleic acid library as described herein is transferred into a cell to generate a cell library. Also provided herein are downstream applications of the libraries synthesized using the methods described herein. Downstream applications include the identification of variant nucleic acid or protein sequences with enhanced biologically relevant functions (e.g., improved stability, affinity, binding, functional activity) and the treatment or prevention of disease states associated with PD-1 signaling. In some cases, an antibody described herein comprises the CDRH1 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 80% identical to the CDRH1 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 85% identical to the CDRH1 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 90% identical to the CDRH1 sequence of any one of SEQ ID NOs 1-35.
In some cases, an antibody described herein comprises a sequence that is at least 95% identical to the CDRH1 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises the CDRH2 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 80% identical to the CDRH2 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 85% identical to the CDRH2 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 90% identical to the CDRH2 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 95% identical to the CDRH2 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises the CDRH3 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 80% identical to the CDRH3 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 85% identical to the CDRH3 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 90% identical to the CDRH3 sequence of any one of SEQ ID NOs 1-35. In some cases, an antibody described herein comprises a sequence that is at least 95% identical to the CDRH3 sequence of any one of SEQ ID NOs 1-35.

The term "sequence identity" means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over a comparison window. The term "percent sequence identity" is calculated as follows: the two optimally aligned sequences are compared over a comparison window, the number of positions at which the same nucleobase (e.g., A, T, C, G, U or I) occurs in both sequences is determined to yield the number of matched positions, the number of matched positions is divided by the total number of positions within the comparison window (i.e., the window size), and the result is multiplied by 100 to yield the percentage of sequence identity.
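The percent sequence identity calculation defined above can be sketched in Python as follows. The sketch assumes the two sequences have already been optimally aligned and trimmed to the same comparison window, so they have equal length; the alignment step itself is not shown. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent sequence identity over a comparison window.

    Both inputs are assumed to be pre-aligned strings of equal length
    (the comparison window). A position counts as matched when the same
    nucleobase appears in both sequences; a gap character '-' is never
    counted as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # Matched positions divided by the window size, multiplied by 100.
    return 100.0 * matched / len(aligned_a)


# Hypothetical 10-nucleotide window with a single mismatch at position 5.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
```

The same function can be applied to amino acid sequences, such as CDR sequences, when checking an identity threshold like those recited above, provided the sequences are aligned first.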

The term "homology" or "similarity" between two proteins is determined by comparing the amino acid sequence of one protein, together with its conservative amino acid substitutions, to the sequence of the second protein. Similarity can be determined by procedures well known in the art, such as the BLAST program (Basic Local Alignment Search Tool of the National Center for Biotechnology Information).
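To make the idea of counting conservative substitutions concrete, the following Python sketch scores two pre-aligned, ungapped protein sequences of equal length. The grouping of amino acids into conservative classes is an assumption chosen for illustration and is not defined in this disclosure. BLAST itself scores alignments with substitution matrices such as BLOSUM62 rather than with fixed groups, so this is a simplified stand-in, not a reimplementation.

```python
# Assumed conservative-substitution classes (illustrative only).
CONSERVATIVE_GROUPS = [
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("ST"),     # small hydroxyl
    set("NQ"),     # amide
    set("DE"),     # acidic
    set("KRH"),    # basic
]


def is_similar(x, y):
    """True if residues are identical or fall in the same assumed class."""
    if x == y:
        return True
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def percent_similarity(protein_a, protein_b):
    """Percent of positions that are identical or conservatively substituted.

    Assumes ungapped, pre-aligned sequences of equal length.
    """
    if len(protein_a) != len(protein_b):
        raise ValueError("sequences must have the same length")
    if not protein_a:
        raise ValueError("sequences must not be empty")
    similar = sum(
        1 for x, y in zip(protein_a.upper(), protein_b.upper()) if is_similar(x, y)
    )
    return 100.0 * similar / len(protein_a)


# Hypothetical peptides: K/R and V/I are conservative, E/Q is not in this scheme.
similarity = percent_similarity("KVLDE", "RILDQ")
```

With these assumed classes, the example peptides are similar at four of five positions, although they are identical at only two.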

Provided herein are libraries comprising nucleic acids encoding PD-1 antibodies. The antibodies described herein allow for increased stability of a range of PD-1 binding domain encoding sequences. In some cases, the PD-1 binding domain coding sequence is determined by the interaction between a PD-1 ligand and PD-1.

Various methods are used to analyze the sequence of the PD-1 binding domain based on surface interactions between the PD-1 ligand and PD-1. For example, a multi-species computational analysis is performed. In some cases, structural analysis is performed. In some cases, sequence analysis is performed. Sequence analysis can be performed using databases known in the art. Non-limiting examples of databases include NCBI BLAST (blast.ncbi.nlm.nih.gov/blast.cgi), UCSC Genome Browser (genome.ucsc.edu/), UniProt (www.uniprot.org/), and IUPHAR/BPS Guide to PHARMACOLOGY (guidetopharmacology.org/).

Described herein are PD-1 binding domains designed based on sequence analysis between various organisms. For example, sequence analysis is performed to identify homologous sequences in different organisms. Exemplary organisms include, but are not limited to, mice, rats, horses, sheep, cows, primates (e.g., chimpanzees, baboons, gorillas, orangutans, monkeys), dogs, cats, pigs, donkeys, rabbits, fish, flies, and humans. In some cases, homologous sequences are identified in the same organism across individuals.

After the PD-1 binding domain is identified, a library comprising nucleic acids encoding the PD-1 binding domain can be generated. In some cases, the library of PD-1 binding domains comprises PD-1 binding domain sequences designed based on conformational ligand interactions, peptide ligand interactions, small molecule ligand interactions, the extracellular domain of PD-1, or antibodies targeting PD-1. The library of PD-1 binding domains can be translated to generate a library of proteins. In some cases, the library of PD-1 binding domains is translated to generate a peptide library, an immunoglobulin library, derivatives thereof, or combinations thereof. In some cases, the library of PD-1 binding domains is translated to generate a protein library, which is further modified to generate a peptidomimetic library. In some cases, a library of PD-1 binding domains is translated to generate a library of proteins for generating small molecules.

The methods described herein provide for the synthesis of a library of PD-1 binding domains comprising nucleic acids each encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. In some cases, the predetermined reference sequence is a nucleic acid sequence encoding a protein, and the library of variants comprises sequences encoding variations of at least a single codon, such that a plurality of different variants of a single residue in a subsequent protein encoded by the synthetic nucleic acid are generated by standard translation processes. In some cases, the PD-1 binding domain library comprises variant nucleic acids that collectively encode variations at multiple positions. In some cases, the library of variants comprises a sequence encoding a variation of at least a single codon in the PD-1 binding domain. In some cases, the library of variants comprises a sequence encoding a variation of a plurality of codons in the PD-1 binding domain. Exemplary numbers of codons for variation include, but are not limited to, at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons.

The methods described herein provide for synthesizing a library comprising nucleic acids encoding a PD-1 binding domain, wherein the library comprises sequences encoding length variations of the PD-1 binding domain. In some cases, the library comprises sequences encoding at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons of length variation compared to a predetermined reference sequence. In some cases, the library comprises sequences encoding at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, or more than 300 codons of length variation compared to a predetermined reference sequence.

After the PD-1 binding domain is identified, antibodies comprising the PD-1 binding domain can be designed and synthesized. Antibodies comprising a PD-1 binding domain can be designed based on binding, specificity, stability, expression, folding, or downstream activity. In some cases, an antibody comprising a PD-1 binding domain enables contact with PD-1. In some cases, an antibody comprising a PD-1 binding domain enables high affinity binding to PD-1. An exemplary amino acid sequence of a PD-1 binding domain comprises any one of SEQ ID NOs 1-70.

In some cases, the PD-1 antibody has a binding affinity (e.g., KD) for PD-1 of less than 1 nM, less than 1.2 nM, less than 2 nM, less than 5 nM, less than 10 nM, less than 11 nM, less than 13.5 nM, less than 15 nM, less than 20 nM, less than 25 nM, or less than 30 nM. In some cases, the PD-1 antibody has a KD of less than 1 nM. In some cases, the PD-1 antibody has a KD of less than 1.2 nM. In some cases, the PD-1 antibody has a KD of less than 2 nM. In some cases, the PD-1 antibody has a KD of less than 5 nM. In some cases, the PD-1 antibody has a KD of less than 10 nM. In some cases, the PD-1 antibody has a KD of less than 13.5 nM. In some cases, the PD-1 antibody has a KD of less than 15 nM. In some cases, the PD-1 antibody has a KD of less than 20 nM. In some cases, the PD-1 antibody has a KD of less than 25 nM. In some cases, the PD-1 antibody has a KD of less than 30 nM.

In some cases, a PD-1 antibody generated by a method as described herein has an improvement in binding affinity of at least or about 1.5x, 2.0x, 5x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, or more than 200x as compared to a comparator antibody. In some cases, a PD-1 antibody generated by a method as described herein has an improvement in function of at least or about 1.5x, 2.0x, 5x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, or more than 200x as compared to a comparator antibody. In some cases, the comparator antibody is an antibody with a similar structure, sequence, or antigen target.

Provided herein are PD-1 binding libraries comprising nucleic acids encoding antibodies comprising a PD-1 binding domain, the antibodies comprising a variation in domain type, domain length, or residue variation. In some cases, the domain is a region in an antibody comprising a PD-1 binding domain. For example, the region is a VH, CDR-H3 or VL domain. In some cases, the domain is a PD-1 binding domain.

The methods described herein provide for the synthesis of a PD-1 binding library of nucleic acids each encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. In some cases, the predetermined reference sequence is a nucleic acid sequence encoding a protein, and the library of variants comprises sequences encoding variations of at least a single codon, such that a plurality of different variants of a single residue in a subsequent protein encoded by the synthetic nucleic acid are generated by standard translation processes. In some cases, the PD-1 binding library comprises variant nucleic acids that collectively encode variations at multiple positions. In some cases, the library of variants comprises sequences encoding variations of at least a single codon of a VH, CDR-H3, or VL domain. In some cases, the library of variants comprises a sequence encoding a variation of at least a single codon in the PD-1 binding domain. For example, at least one single codon of the PD-1 binding domain is different. In some cases, the library of variants comprises sequences that encode variations of multiple codons of the VH, CDR-H3, or VL domains. In some cases, the library of variants comprises a sequence encoding a variation of a plurality of codons in the PD-1 binding domain. Exemplary numbers of codons for variation include, but are not limited to, at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons.

The methods described herein provide for synthesizing a PD-1 binding library of nucleic acids each encoding a predetermined variant of at least one predetermined reference nucleic acid sequence, wherein the PD-1 binding library comprises sequences encoding length variations of a domain. In some cases, the domain is a VH, CDR-H3, or VL domain. In some cases, the domain is a PD-1 binding domain. In some cases, the library comprises sequences encoding at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons of length variation compared to a predetermined reference sequence. In some cases, the library comprises sequences encoding at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, or more than 300 codons of length variation compared to a predetermined reference sequence.

Provided herein are PD-1 binding libraries comprising nucleic acids encoding antibodies comprising a PD-1 binding domain, wherein the PD-1 binding libraries are synthesized with various numbers of fragments. In some cases, the fragment comprises a VH, CDR-H3, or VL domain. In some cases, the PD-1 binding library is synthesized with at least or about 2 fragments, 3 fragments, 4 fragments, 5 fragments, or more than 5 fragments. The length of each nucleic acid fragment or the average length of the synthesized nucleic acids can be at least or about 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600, or more than 600 base pairs. In some cases, the length is about 50 to 600, 75 to 575, 100 to 550, 125 to 525, 150 to 500, 175 to 475, 200 to 450, 225 to 425, 250 to 400, 275 to 375, or 300 to 350 base pairs.

When translated, a PD-1 binding library comprising nucleic acids encoding antibodies comprising a PD-1 binding domain as described herein comprises amino acid sequences of various lengths. In some cases, the length of each amino acid fragment or the average length of the synthesized amino acid sequences can be at least or about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, or more than 150 amino acids. In some cases, the amino acid sequence is about 15 to 150, 20 to 145, 25 to 140, 30 to 135, 35 to 130, 40 to 125, 45 to 120, 50 to 115, 55 to 110, 60 to 110, 65 to 105, 70 to 100, or 75 to 95 amino acids in length. In some cases, the amino acid sequence is about 22 to about 75 amino acids in length.

A PD-1 binding library comprising de novo synthesized variant sequences encoding antibodies comprising a PD-1 binding domain comprises a plurality of variant sequences. In some cases, a number of variant sequences are synthesized de novo for CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2, CDR-L3, VL, VH, or a combination thereof. In some cases, a number of variant sequences are synthesized de novo for framework element 1 (FW1), framework element 2 (FW2), framework element 3 (FW3), or framework element 4 (FW4). In some cases, a number of variant sequences are synthesized de novo for the PD-1 binding domain. The number of variant sequences can be at least or about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, or more than 500 sequences. In some cases, the number of variant sequences is about 10 to 300, 25 to 275, 50 to 250, 75 to 225, 100 to 200, or 125 to 150 sequences.

PD-1 binding libraries comprising de novo synthesized variant sequences encoding antibodies comprising PD-1 binding domains have improved diversity. In some cases, the variants include affinity-matured variants. Alternatively or in combination, the variants include variants in other regions of the antibody including, but not limited to, CDR-H1, CDR-H2, CDR-L1, CDR-L2, and CDR-L3. In some cases, the number of variants of the PD-1 binding library is at least or about 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, or more than 10^14 non-identical sequences.

After synthesis of a PD-1 binding library comprising nucleic acids encoding antibodies comprising a PD-1 binding domain, the library can be used for screening and analysis. For example, the library is assayed for displayability and panning. In some cases, displayability is assayed using a selectable tag. Exemplary tags include, but are not limited to, radioactive labels, fluorescent labels, enzymes, chemiluminescent labels, colorimetric labels, affinity tags, or other labels or tags known in the art. In some cases, the tag is histidine, polyhistidine, myc, hemagglutinin (HA), or FLAG. For example, a PD-1 binding library comprises nucleic acids encoding antibodies comprising a PD-1 binding domain with multiple tags, such as GFP, FLAG, and Lucy, as well as a DNA barcode. In some cases, the library is assayed by sequencing using various methods including, but not limited to, single-molecule real-time (SMRT) sequencing, polony sequencing, sequencing by ligation, reversible terminator sequencing, proton detection sequencing, ion semiconductor sequencing, nanopore sequencing, electronic sequencing, pyrosequencing, Maxam-Gilbert sequencing, chain termination (e.g., Sanger) sequencing, +S sequencing, or sequencing by synthesis.

Diseases and disorders

Provided herein are PD-1 binding libraries comprising nucleic acids encoding antibodies comprising a PD-1 binding domain, which can have a therapeutic effect. In some cases, the PD-1 binding library, when translated, produces a protein that is used to treat a disease or disorder. In some cases, the protein is an immunoglobulin. In some cases, the protein is a peptidomimetic. Exemplary diseases include, but are not limited to, cancer, inflammatory diseases or disorders, metabolic diseases or disorders, cardiovascular diseases or disorders, respiratory diseases or disorders, pain, digestive diseases or disorders, reproductive diseases or disorders, endocrine diseases or disorders, or nervous system diseases or disorders. In some cases, the cancer is a solid cancer or a hematologic cancer. In some cases, the cancer is lung cancer, head and neck squamous cell carcinoma, colorectal cancer, melanoma, liver cancer, classical Hodgkin lymphoma, kidney cancer, stomach cancer, cervical cancer, Merkel cell carcinoma, B-cell lymphoma, or bladder cancer. In some cases, the cancer is an MSI-H/dMMR cancer. In some cases, an inhibitor of programmed cell death protein 1 (PD-1) as described herein is used to treat a metabolic disorder. In some cases, the subject is a mammal. In some cases, the subject is a mouse, rabbit, dog, or human. The subject treated by the methods described herein may be an infant, a child, or an adult. Pharmaceutical compositions comprising the antibodies or antibody fragments described herein may be administered intravenously or subcutaneously. In some cases, a pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a CDR-H3 that comprises the sequence of any one of SEQ ID NOs 1-70. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat lung cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat squamous cell carcinoma of the head and neck. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat colorectal cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat melanoma. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat liver cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat classical Hodgkin lymphoma. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat kidney cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat gastric cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat cervical cancer. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat Merkel cell carcinoma. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat B-cell lymphoma. In some cases, the sequence of any one of SEQ ID NOs 1-70 is used to treat bladder cancer.

Variant libraries

Codon variation

The library of variant nucleic acids described herein can comprise a plurality of nucleic acids, wherein each nucleic acid encodes a variant codon sequence as compared to a reference nucleic acid sequence. In some cases, each nucleic acid in the first population of nucleic acids contains a variant at a single mutation site. In some cases, the first population of nucleic acids contains a plurality of variants at a single mutation site, such that the first population of nucleic acids contains more than one variant at the same mutation site. The first population of nucleic acids can comprise nucleic acids that collectively encode a plurality of codon variants at the same site of variation. The first population of nucleic acids can comprise nucleic acids that collectively encode up to 19 or more codons at the same position. The first population of nucleic acids can comprise nucleic acids that collectively encode up to 60 variant triplets at the same position, or the first population of nucleic acids can comprise nucleic acids that collectively encode up to 61 different codon triplets at the same position. Each variant may encode a codon that produces a different amino acid during translation. Table 1 provides a list of each codon (and representative amino acids) possible for a mutation site.

TABLE 1. List of codons and amino acids
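The counts quoted for a single variation site (up to 19 alternative amino acids, up to 60 variant triplets, 61 different codon triplets) follow from the standard genetic code. The following is a minimal sketch that rebuilds that code and checks the counts. It assumes the standard genetic code with three stop codons; the names `CODON_TABLE` and `alternatives_at_site` are illustrative and do not come from the source, and the block is not a reproduction of Table 1.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
# "*" marks the three stop codons (TAA, TAG, TGA).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def alternatives_at_site(reference_codon):
    """Return (alternative sense codons, alternative amino acids) for one site.

    Illustrative helper: stop codons are excluded, and the reference codon
    and its amino acid are removed from the respective lists.
    """
    reference_aa = CODON_TABLE[reference_codon]
    sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
    alt_codons = [c for c in sense_codons if c != reference_codon]
    alt_amino_acids = sorted(
        {CODON_TABLE[c] for c in sense_codons} - {reference_aa}
    )
    return alt_codons, alt_amino_acids


if __name__ == "__main__":
    codons, amino_acids = alternatives_at_site("ATG")
    print(len(CODON_TABLE), len(codons), len(amino_acids))
```

Under these assumptions, a site whose reference codon is a sense codon has 60 other sense codons available and 19 other amino acids, which matches the figures given in the text.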

The population of nucleic acids can comprise varied nucleic acids that collectively encode up to 20 codon variations at multiple positions. In such cases, each nucleic acid in the population comprises codon variations at more than one position in the same nucleic acid. In some cases, each nucleic acid in the population comprises a codon variation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more codons in a single nucleic acid. In some cases, each variant long nucleic acid comprises a codon variation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more codons in a single long nucleic acid. In some cases, the population of variant nucleic acids comprises codon variations at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more codons in a single nucleic acid. In some cases, the population of variant nucleic acids comprises codon variations at at least about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more codons in a single long nucleic acid.

Highly parallel nucleic acid synthesis

Provided herein is a platform approach that utilizes miniaturization, parallelization, and vertical integration of the end-to-end process, from polynucleotide synthesis to gene assembly within nanowells on silicon, to create a revolutionary synthesis platform. The devices described herein provide, with the same footprint as a 96-well plate, a silicon synthesis platform that can increase throughput by up to 1,000-fold or more compared to traditional synthesis methods, producing up to about 1,000,000 or more polynucleotides, or 10,000 or more genes, in a single highly parallelized run.

With the advent of next-generation sequencing, high-resolution genomic data has become an important factor in studies of the biological roles of various genes in both normal biology and disease pathogenesis. At the core of this research are the central dogma of molecular biology and the concept of "residue-by-residue transfer of sequential information." Genomic information encoded in DNA is transcribed into a message, which is then translated into the protein that is the active product within a given biological pathway.

Another exciting area of research is the discovery, development, and manufacturing of therapeutic molecules focused on highly specific cellular targets. Highly diverse DNA sequence libraries are central to the development pipelines for targeted therapeutics. Gene mutants are used to express proteins in a design, build, and test protein engineering cycle, which ideally results in genes optimized for high expression of proteins with high affinity for their therapeutic targets. As an example, consider the binding pocket of a receptor. The ability to test all sequence permutations of all residues within the binding pocket simultaneously would allow thorough exploration, thereby increasing the likelihood of success. Saturation mutagenesis, in which a researcher attempts to generate all possible mutations at a specific site within the receptor, represents one approach to this development challenge. Although it is costly, time-consuming, and laborious, it is capable of introducing each variant at each position. In contrast, combinatorial mutagenesis, in which several selected positions or short stretches of DNA can be extensively modified, generates an incomplete repertoire of variants with biased representation.
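To make the contrast between saturation and combinatorial mutagenesis concrete, the following is a minimal sketch that enumerates every single-residue substitution at chosen positions of a protein sequence and computes how many sequences a full combinatorial design over the same positions would contain. The parent sequence, the position indices, and the function names are hypothetical and serve only as an illustration; the 20-letter alphabet is the set of standard amino acids.

```python
# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_saturation_variants(parent, positions):
    """List every single-residue substitution at each given 0-based position.

    Each variant differs from the parent at exactly one position, so a site
    contributes 19 variants (all amino acids except the parental one).
    """
    variants = []
    for pos in positions:
        for aa in AMINO_ACIDS:
            if aa != parent[pos]:
                variants.append(parent[:pos] + aa + parent[pos + 1:])
    return variants


def full_combinatorial_size(n_positions):
    """Number of sequences when all positions vary simultaneously."""
    return len(AMINO_ACIDS) ** n_positions


if __name__ == "__main__":
    parent = "ARDYW"       # hypothetical five-residue stretch
    positions = [0, 2, 4]  # hypothetical sites to vary
    singles = site_saturation_variants(parent, positions)
    print(len(singles), full_combinatorial_size(len(positions)))
```

The single-site set grows linearly with the number of positions (19 per site), whereas the all-positions-at-once set grows as 20 to the power of the number of positions, which is why complete and unbiased coverage becomes difficult as more residues are varied together.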

To accelerate the drug development pipeline, a library in which the desired variants are available at the correct positions and at the intended frequency for testing (in other words, a precision library) enables reduced cost as well as reduced turnaround time for screening. Provided herein are methods for synthesizing synthetic nucleic acid variant libraries that allow precise introduction of each desired variant at a desired frequency. For the end user, this means that not only can the sequence space be thoroughly sampled, but these hypotheses can also be queried in an efficient manner, thereby reducing cost and screening time. Genome-wide editing can elucidate important pathways, libraries in which each variant and sequence permutation can be tested can be used to identify optimal functionality, and thousands of genes can be used to reconstruct entire pathways and genomes to re-engineer biological systems for drug discovery.
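One way to check whether a synthesized library presents its variants at the intended frequency is to compare per-variant read counts from sequencing against the mean count. The following is a minimal sketch under stated assumptions: the input is a list of designed variant identifiers and a list of observed reads already assigned to variants, "within 1.5 times the average" is interpreted as lying between mean/1.5 and mean*1.5, and the function name and threshold parameter are illustrative rather than taken from the source.

```python
from collections import Counter


def fraction_within_fold(designed_variants, observed_reads, fold=1.5):
    """Fraction of designed variants whose read count is within `fold` of the mean.

    Variants that were designed but never observed count as zero reads, so
    dropouts lower the reported fraction.
    """
    counts = Counter(observed_reads)
    per_variant = [counts.get(v, 0) for v in designed_variants]
    mean = sum(per_variant) / len(designed_variants)
    lower, upper = mean / fold, mean * fold
    within = sum(1 for n in per_variant if lower <= n <= upper)
    return within / len(designed_variants)


if __name__ == "__main__":
    designed = ["v1", "v2", "v3", "v4"]  # hypothetical variant identifiers
    reads = ["v1"] * 11 + ["v2"] * 12 + ["v3"] * 8 + ["v4"] * 30
    print(fraction_within_fold(designed, reads))
```

A real analysis would also need to account for sequencing depth and sampling noise; this sketch only shows the bookkeeping.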

In a first example, the drug itself may be optimized using the methods described herein. For example, to improve a given function of an antibody, a library of variant polynucleotides encoding a portion of the antibody is designed and synthesized. A library of variant nucleic acids for the antibody can then be generated by the processes described herein (e.g., PCR mutagenesis followed by insertion into a vector). The antibody is then expressed in a producer cell line and screened for enhanced activity. Exemplary screens include examining the modulation of binding affinity to an antigen, stability, or effector function (e.g., ADCC, complement, or apoptosis). Exemplary regions for optimizing antibodies include, but are not limited to, an Fc region, a Fab region, a variable region of a Fab region, a constant region of a Fab region, a variable domain of a heavy or light chain (VH or VL), and specific complementarity-determining regions (CDRs) of VH or VL.

Libraries of nucleic acids synthesized by the methods described herein can be expressed in a variety of cells associated with disease states. Cells associated with a disease state include cell lines, tissue samples, primary cells from a subject, cultured cells expanded from a subject, or cells in a model system. Exemplary model systems include, but are not limited to, plant and animal models of disease states.

To identify variant molecules that are associated with prevention, alleviation, or treatment of a disease state, a library of variant nucleic acids described herein is expressed in cells associated with the disease state, or in cells in which the disease state can be induced. In some cases, an agent is used to induce a disease state in the cells. Exemplary tools for induction of disease states include, but are not limited to, Cre/Lox recombination systems, LPS inflammation induction, and streptozotocin to induce hypoglycemia. The cells associated with a disease state can be cells from a model system or cultured cells, as well as cells from a subject with a particular disease condition. Exemplary disease conditions include bacterial, fungal, viral, autoimmune, or proliferative disorders (e.g., cancer). In some cases, the library of variant nucleic acids is expressed in a model system, a cell line, or a primary cell derived from the subject and screened for an alteration in at least one cellular activity. Exemplary cellular activities include, but are not limited to, proliferation, cell cycle progression, cell death, adhesion, migration, replication, cell signaling, energy production, oxygen utilization, metabolic activity, aging, response to free radical damage, or any combination thereof.

Substrate

Devices used as polynucleotide synthesis surfaces may be in the form of substrates including, but not limited to, homogeneous array surfaces, patterned array surfaces, channels, beads, gels, and the like. Provided herein are substrates comprising a plurality of clusters, wherein each cluster comprises a plurality of loci that support polynucleotide attachment and synthesis. In some cases, the substrate comprises a uniform array surface. For example, the uniform array surface is a uniform plate. The term "locus" as used herein refers to a discrete region on a structure that provides support for extension of a polynucleotide encoding a single predetermined sequence from the surface. In some cases, the locus is on a two-dimensional surface (e.g., a substantially planar surface). In some cases, the locus is on a three-dimensional surface (e.g., a well, a microwell, a channel, or a post). In some cases, the surface of the locus comprises a material that is activated and functionalized to attach at least one nucleotide for polynucleotide synthesis, or preferably, to attach a population of identical nucleotides for synthesis of a population of polynucleotides. In some cases, a polynucleotide refers to a population of polynucleotides that encode the same nucleic acid sequence. In some cases, the surface of the substrate includes one or more surfaces of the substrate. The average error rate of polynucleotides synthesized within the libraries described herein using the provided systems and methods is typically less than 1 in 1000, less than about 1 in 2000, less than about 1 in 3000, or lower, typically without error correction.
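To give a sense of what a per-base error rate of this magnitude implies for full-length products, the following is a minimal sketch that estimates the fraction of error-free polynucleotides of a given length. It assumes that errors occur independently at each base with the same probability, which is a simplification introduced here and not a statement from the source; the function name is illustrative.

```python
def error_free_fraction(length_bases, per_base_error_rate):
    """Expected fraction of molecules with no errors, assuming independent,
    identically distributed per-base errors (a simplifying assumption)."""
    return (1.0 - per_base_error_rate) ** length_bases


if __name__ == "__main__":
    # Compare the error rates named in the text for a 200-base polynucleotide.
    for rate in (1 / 1000, 1 / 2000, 1 / 3000):
        print(f"1 in {round(1 / rate)}: {error_free_fraction(200, rate):.3f}")
```

Under this model the error-free fraction falls geometrically with length, so lowering the per-base error rate matters more for longer polynucleotides.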

Provided herein are surfaces that support the parallel synthesis of a plurality of polynucleotides having different predetermined sequences at addressable locations on a common support. In some cases, the substrate provides support for the synthesis of more than 50, 100, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2,000, 5,000, 10,000, 20,000, 50,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,200,000, 1,400,000, 1,600,000, 1,800,000, 2,000,000, 2,500,000, 3,000,000, 3,500,000, 4,000,000, 4,500,000, 5,000,000, 10,000,000 or more different polynucleotides. In some cases, the surface provides support for synthesizing more than 50, 100, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2,000, 5,000, 10,000, 20,000, 50,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,200,000, 1,400,000, 1,600,000, 1,800,000, 2,000,000, 2,500,000, 3,000,000, 3,500,000, 4,000,000, 4,500,000, 5,000,000, 10,000,000 or more polynucleotides encoding different sequences. In some cases, at least a portion of the polynucleotides have the same sequence or are configured to be synthesized with the same sequence. In some cases, the substrate provides a surface environment for growing polynucleotides having at least 80, 90, 100, 120, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500 or more bases.

Provided herein are methods of synthesizing polynucleotides at different loci on a substrate, wherein each locus supports the synthesis of a population of polynucleotides. In some cases, each locus supports the synthesis of a population of polynucleotides having a different sequence than the population of polynucleotides grown at another locus. In some cases, each polynucleotide sequence is synthesized with 1, 2, 3, 4, 5, 6, 7, 8, 9, or more redundancies at different loci within the same cluster of loci on the surface used for polynucleotide synthesis. In some cases, the loci of the substrate are located within a plurality of clusters. In some cases, the substrate comprises at least 10, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 11000, 12000, 13000, 14000, 15000, 20000, 30000, 40000, 50000, or more clusters. In some cases, the substrate comprises more than 2,000, 5,000, 10,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,100,000, 1,200,000, 1,300,000, 1,400,000, 1,500,000, 1,600,000, 1,700,000, 1,800,000, 1,900,000, 2,000,000, 2,500,000, 3,000,000, 3,500,000, 4,000,000, 4,500,000, 5,000,000, or 10,000,000 or more different loci. In some cases, the substrate comprises about 10,000 different loci. The number of loci within a single cluster varies between cases. In some cases, each cluster contains 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 130, 150, 200, 300, 400, 500, or more loci. In some cases, each cluster contains about 50-500 loci. In some cases, each cluster contains about 100-200 loci. In some cases, each cluster contains about 100-150 loci. In some cases, each cluster contains about 109, 121, 130, or 137 loci. In some cases, each cluster contains about 19, 20, 61, 64, or more loci. Alternatively or in combination, polynucleotide synthesis is performed on a uniform array surface.

In some cases, the number of different polynucleotides synthesized on the substrate depends on the number of different loci available in the substrate. In some cases, the density of loci within a cluster or surface of the substrate is at least or about 1, 10, 25, 50, 65, 75, 100, 130, 150, 175, 200, 300, 400, 500, 1,000, or more loci per mm^2. In some cases, the substrate comprises 10-500, 25-400, 50-500, 100-500, 150-500, 10-250, 50-250, 10-200, or 50-200 mm^2. In some cases, the distance between the centers of two adjacent loci in a cluster or surface is about 10-500, about 10-200, or about 10-100 um. In some cases, the distance between the centers of two adjacent loci is greater than about 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 um. In some cases, the distance between the centers of two adjacent loci is less than about 200, 150, 100, 80, 70, 60, 40, 30, 20, or 10 um. In some cases, each locus has a width of about 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 um. In some cases, each locus has a width of about 0.5-100, 0.5-50, 10-75, or 0.5-50 um.
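The relationship between cluster count, loci per cluster, center-to-center spacing, and locus density can be sketched with simple arithmetic. The following is a minimal illustration, assuming loci arranged on a regular square grid (an assumption made here for simplicity; the source does not specify the lattice) and using hypothetical function names.

```python
def total_loci(n_clusters, loci_per_cluster):
    """Upper bound on distinct sequences: one predetermined sequence per locus."""
    return n_clusters * loci_per_cluster


def loci_density_per_mm2(center_to_center_um):
    """Loci per mm^2 for a square grid with the given pitch in micrometers.

    A square lattice is assumed purely for illustration.
    """
    loci_per_mm = 1000.0 / center_to_center_um
    return loci_per_mm ** 2


if __name__ == "__main__":
    # Example values drawn from the ranges listed in the text.
    print(total_loci(6000, 121))
    print(loci_density_per_mm2(100))
```

Redundant synthesis of the same sequence at several loci within a cluster would reduce the number of distinct sequences below this upper bound.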

In some cases, the density of clusters within the substrate is at least or about 1 cluster per 100 mm^2, 1 cluster per 10 mm^2, 1 cluster per 5 mm^2, 1 cluster per 4 mm^2, 1 cluster per 3 mm^2, 1 cluster per 2 mm^2, 1 cluster per 1 mm^2, 2 clusters per 1 mm^2, 3 clusters per 1 mm^2, 4 clusters per 1 mm^2, 5 clusters per 1 mm^2, 10 clusters per 1 mm^2, 50 clusters per 1 mm^2, or higher. In some cases, the substrate comprises about 1 cluster per 10 mm^2 to about 10 clusters per 1 mm^2. In some cases, the distance between the centers of two adjacent clusters is at least or about 50, 100, 200, 500, 1000, 2000, or 5000 um. In some cases, the distance between the centers of two adjacent clusters is about 50-100, 50-200, 50-300, 50-500, or 100-2000 um. In some cases, the distance between the centers of two adjacent clusters is about 0.05-50, 0.05-10, 0.05-5, 0.05-4, 0.05-3, 0.05-2, 0.1-10, 0.2-10, 0.3-10, 0.4-10, 0.5-10, 0.5-5, or 0.5-2 mm. In some cases, each cluster has a cross-section of about 0.5 to about 2, about 0.5 to about 1, or about 1 to about 2 mm. In some cases, each cluster has a cross-section of about 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2 mm. In some cases, each cluster has an internal cross-section of about 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.15, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2 mm.

In some cases, the substrate is about the size of a standard 96-well plate, e.g., about 100 to about 200 mm by about 50 to about 150 mm. In some cases, the substrate has a diameter of less than or equal to about 1000, 500, 450, 400, 300, 250, 200, 150, 100, or 50 mm. In some cases, the diameter of the substrate is about 25-1000, 25-800, 25-600, 25-500, 25-400, 25-300, or 25-200 mm. In some cases, the substrate has a planar surface area of at least about 100, 200, 500, 1,000, 2,000, 5,000, 10,000, 12,000, 15,000, 20,000, 30,000, 40,000, or 50,000 mm^2, or larger. In some cases, the substrate has a thickness of about 50-2000, 50-1000, 100-1000, 200-1000, or 250-1000 mm.

Surface materials

The substrates, devices, and reactors provided herein are made from any of a variety of materials suitable for the methods, compositions, and systems described herein. In some cases, the substrate material is fabricated to exhibit a low level of nucleotide binding. In some cases, the substrate material is modified to create distinct surfaces that exhibit a high level of nucleotide binding. In some cases, the substrate material is transparent to visible and/or ultraviolet light. In some cases, the substrate material is sufficiently conductive, e.g., capable of forming a uniform electric field across the entire substrate or a portion thereof. In some cases, the conductive material is electrically grounded. In some cases, the substrate is thermally conductive or thermally insulating. In some cases, the material is chemically and thermally resistant so as to support chemical or biochemical reactions, such as polynucleotide synthesis reaction processes. In some cases, the substrate comprises a flexible material. Flexible materials may include, but are not limited to: modified and unmodified nylon, nitrocellulose, polypropylene, and the like. In some cases, the substrate comprises a rigid material. Rigid materials may include, but are not limited to: glass; fused quartz; silicon; plastics (e.g., polytetrafluoroethylene, polypropylene, polystyrene, polycarbonate, mixtures thereof, and the like); and metals (e.g., gold, platinum, and the like). The substrate, solid support, or reactor may be made of a material selected from the group consisting of silicon, polystyrene, agarose, dextran, cellulose polymers, polyacrylamide, polydimethylsiloxane (PDMS), and glass. The substrate or solid support, or the microstructures and reactors therein, can be made using combinations of the materials listed herein or any other suitable material known in the art.

Surface structure

Provided herein are substrates for use in the methods, compositions, and systems described herein, wherein the substrates have a surface architecture suitable for the methods, compositions, and systems described herein. In some cases, the substrate comprises raised and/or recessed features. One benefit of having such features is the increased surface area available to support polynucleotide synthesis. In some cases, a substrate having raised and/or recessed features is referred to as a three-dimensional substrate. In some cases, the three-dimensional substrate comprises one or more channels. In some cases, one or more loci comprise a channel. In some cases, the channels are accessible to reagent deposition by a deposition device, such as a material deposition device. In some cases, reagents and/or fluids collect in a larger well in fluid communication with one or more channels. For example, the substrate comprises a plurality of channels corresponding to a plurality of loci within a cluster, and the plurality of channels are in fluid communication with one well of the cluster. In some methods, a polynucleotide library is synthesized in the plurality of loci of a cluster.

Provided herein are substrates for use in the methods, compositions, and systems described herein, wherein the substrates are configured for polynucleotide synthesis. In some cases, the structures are configured to allow controlled flow and mass transfer paths for polynucleotide synthesis on a surface. In some cases, the configuration of the substrate allows for controlled and uniform distribution of mass transfer paths, chemical exposure times, and/or wash efficacy during polynucleotide synthesis. In some cases, the configuration of the substrate allows for increased sweep efficiency, for example by providing sufficient volume for growing polynucleotides such that the volume excluded by the growing polynucleotides does not take up more than 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less of the initial volume that is available or suitable for growing the polynucleotides. In some cases, the three-dimensional structure allows for managed flow of fluid, enabling rapid exchange of chemical exposure.

Provided herein are substrates for use in the methods, compositions, and systems described herein, wherein the substrates comprise structures suitable for the methods, compositions, and systems described herein. In some cases, segregation is achieved by physical structures. In some cases, segregation is achieved by differential functionalization of the surface to generate activated and deactivated regions for polynucleotide synthesis. In some cases, differential functionalization is achieved by alternating hydrophobicity across the substrate surface, creating water contact angle effects that cause beading or wetting of deposited reagents. The use of larger structures can reduce splatter and cross-contamination of different polynucleotide synthesis sites by reagents from neighboring spots. In some cases, reagents are deposited at different polynucleotide synthesis locations using a device, such as a material deposition device. Substrates with three-dimensional features are configured in a manner that allows for the synthesis of large numbers (e.g., more than about 10,000) of polynucleotides with low error rates (e.g., less than about 1:500, 1:1,000, 1:1,500, 1:2,000, 1:3,000, 1:5,000, or 1:10,000). In some cases, the substrate comprises features at a density of about or greater than about 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, or 500 features per mm^2.

A well of the substrate may have the same or different width, height, and/or volume as another well of the substrate. A channel of the substrate may have the same or different width, height, and/or volume as another channel of the substrate. In some cases, the diameter of the cluster, the diameter of the well containing the cluster, or both is about 0.05-50, 0.05-10, 0.05-5, 0.05-4, 0.05-3, 0.05-2, 0.05-1, 0.05-0.5, 0.05-0.1, 0.1-10, 0.2-10, 0.3-10, 0.4-10, 0.5-10, 0.5-5, or 0.5-2 mm. In some cases, the diameter of the cluster or well, or both, is less than or about 5, 4, 3, 2, 1, 0.5, 0.1, 0.09, 0.08, 0.07, 0.06, or 0.05 mm. In some cases, the diameter of the cluster or well, or both, is about 1.0 mm to 1.3 mm. In some cases, the diameter of the cluster or well, or both, is about 1.150 mm. In some cases, the diameter of the cluster or well, or both, is about 0.08 mm. Reference to the diameter of a cluster applies to clusters within a two-dimensional or three-dimensional substrate.

In some cases, the height of the wells is about 20-1000, 50-1000, 100-1000, 200-1000, 300-1000, 400-1000, or 500-1000 um. In some cases, the height of the wells is less than about 1000, 900, 800, 700, or 600 um.

In some cases, the substrate comprises a plurality of channels corresponding to a plurality of loci within a cluster, wherein the height or depth of the channels is 5-500, 5-400, 5-300, 5-200, 5-100, 5-50, or 10-50 um. In some cases, the height of the channel is less than 100, 80, 60, 40, or 20 um.

In some cases, the diameter of the channel, the locus (e.g., in a substantially planar substrate), or both the channel and the locus (e.g., in a three-dimensional substrate in which the locus corresponds to the channel) is about 1-1000, 1-500, 1-200, 1-100, 5-100, or 10-100 um, such as about 90, 80, 70, 60, 50, 40, 30, 20, or 10 um. In some cases, the diameter of the channel, the locus, or both the channel and the locus is less than about 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 um. In some cases, the distance between the centers of two adjacent channels, two adjacent loci, or an adjacent channel and locus is about 1-500, 1-200, 1-100, 5-200, 5-100, 5-50, or 5-30 um, for example about 20 um.

Surface modification

Provided herein are methods for synthesizing polynucleotides on a surface, wherein the surface comprises various surface modifications. In some cases, surface modification is used to chemically and/or physically alter the surface by an additive process or a subtractive process to change one or more chemical and/or physical properties of the substrate surface or of selected sites or regions of the substrate surface. For example, surface modifications include, but are not limited to: (1) changing the wetting properties of the surface; (2) functionalizing the surface, i.e., providing, modifying, or replacing surface functional groups; (3) defunctionalizing the surface, i.e., removing surface functional groups; (4) changing the chemical composition of the surface in other ways, for example by etching; (5) increasing or decreasing surface roughness; (6) providing a coating on the surface, e.g., a coating that exhibits wetting properties different from those of the surface; and/or (7) depositing particles on the surface.

In some cases, the addition of a chemical layer (referred to as an adhesion promoter) on top of the surface facilitates the structured patterning of the seats on the substrate surface. Exemplary surfaces for applying the adhesion promoter include, but are not limited to, glass, silicon dioxide, and silicon nitride. In some cases, the adhesion promoter is a chemical with high surface energy. In some cases, a second chemical layer is deposited on the surface of the substrate. In some cases, the second chemical layer has a low surface energy. In some cases, the surface energy of the chemical layer coated on the surface supports the positioning of droplets on the surface. Depending on the selected patterning arrangement, the proximity of the seats and/or the fluid contact area at the seats may be varied.

In some cases, the substrate surface or resolved loci onto which the nucleic acids or other moieties are deposited (e.g., for polynucleotide synthesis) are smooth or substantially planar (e.g., two-dimensional), or have irregularities such as raised or recessed features (e.g., three-dimensional features). In some cases, the substrate surface is modified with one or more different compound layers. Such modification layers of interest include, but are not limited to, inorganic and organic layers, such as metals, metal oxides, polymers, small organic molecules, and the like.

In some cases, the resolved loci of the substrate are functionalized with one or more moieties that increase and/or decrease surface energy. In some cases, the moiety is chemically inert. In some cases, the moiety is configured to support a desired chemical reaction, such as one or more processes in a polynucleotide synthesis reaction. The surface energy or hydrophobicity of a surface is a factor that determines the affinity of nucleotides for attaching to the surface. In some cases, a substrate functionalization method comprises: (a) providing a substrate having a surface comprising silicon dioxide; and (b) silanizing the surface using a suitable silylating agent (e.g., organofunctional alkoxysilane molecules) as described herein or known in the art. Methods and functionalizing agents are described in U.S. patent 5474796, which is incorporated herein by reference in its entirety.

In some cases, the substrate surface is functionalized, typically via reactive hydrophilic moieties present on the substrate surface, by contacting the substrate surface with a derivatizing composition comprising a mixture of silanes under reaction conditions effective to couple the silanes to the substrate surface. Silanization generally covers surfaces by self-assembly using organofunctional alkoxysilane molecules. A variety of siloxane functionalizing agents currently known in the art, for example, for reducing or increasing surface energy, may also be used. Organofunctional alkoxysilanes are classified according to their organofunctional group.

Polynucleotide synthesis

Methods of the present disclosure for polynucleotide synthesis may include processes involving phosphoramidite chemistry. In some cases, polynucleotide synthesis includes coupling a base to a phosphoramidite. Polynucleotide synthesis may comprise coupling bases by depositing phosphoramidite under coupling conditions, wherein the same base is optionally deposited more than once with the phosphoramidite, i.e., double coupling. Polynucleotide synthesis may include capping of unreacted sites. In some cases, capping is optional. Polynucleotide synthesis may also include oxidation, an oxidation step, or multiple oxidation steps. Polynucleotide synthesis may include deblocking, detritylation, and sulfurization. In some cases, polynucleotide synthesis comprises oxidation or sulfurization. In some cases, the device is washed, for example with tetrazole or acetonitrile, between one or each step during the polynucleotide synthesis reaction. The time range for any step in the phosphoramidite synthesis process can be less than about 2 min, 1 min, 50 sec, 40 sec, 30 sec, 20 sec, or 10 sec.

Polynucleotide synthesis using the phosphoramidite approach can include the subsequent addition of a phosphoramidite building block (e.g., a nucleoside phosphoramidite) to a growing polynucleotide chain to form a phosphite triester linkage. Phosphoramidite polynucleotide synthesis proceeds in the 3' to 5' direction. Phosphoramidite polynucleotide synthesis allows for the controlled addition of one nucleotide to a growing nucleic acid strand in each synthesis cycle. In some cases, each synthesis cycle includes a coupling step. Phosphoramidite coupling involves the formation of a phosphite triester bond between an activated nucleoside phosphoramidite and a nucleoside bound to a substrate (e.g., via a linker). In some cases, the nucleoside phosphoramidite is provided to an activated device. In some cases, the nucleoside phosphoramidite is provided to a device with an activator. In some cases, the nucleoside phosphoramidite is provided to the device in an excess of 1.5, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100-fold or more relative to the substrate-bound nucleoside. In some cases, the addition of the nucleoside phosphoramidite is performed in an anhydrous environment (e.g., in anhydrous acetonitrile). After addition of the nucleoside phosphoramidite, the device is optionally washed. In some cases, the coupling step is repeated one or more additional times, optionally with a washing step between the additions of nucleoside phosphoramidite to the substrate. In some cases, a polynucleotide synthesis method as used herein comprises 1, 2, 3, or more sequential coupling steps. In many cases, prior to coupling, the device-bound nucleoside is deprotected by removal of a protecting group, wherein the protecting group acts to prevent polymerization. A common protecting group is 4,4'-dimethoxytrityl (DMT).

Following coupling, the phosphoramidite polynucleotide synthesis method optionally includes a capping step. In the capping step, the growing polynucleotide is treated with a capping agent. The capping step can be used to block unreacted substrate-bound 5'-OH groups after coupling to prevent further chain extension, thereby preventing formation of polynucleotides with internal base deletions. Furthermore, phosphoramidites activated with 1H-tetrazole can react to a small extent with the O6 position of guanosine. Without being bound by theory, upon oxidation with I2/water, this by-product (possibly via O6-N7 migration) can undergo depurination. The apurinic site may end up being cleaved during the final deprotection of the polynucleotide, thereby reducing the yield of the full-length product. The O6 modification can be removed by treatment with a capping reagent prior to oxidation with I2/water. In some cases, including a capping step during polynucleotide synthesis reduces the error rate compared to synthesis without capping. As an example, the capping step comprises treating the polynucleotide bound to the substrate with a mixture of acetic anhydride and 1-methylimidazole. After the capping step, the device is optionally washed.
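The effect of capping on deletion errors described above can be illustrated with a small probability sketch. This is a simplified model, not a method from the present disclosure: it assumes each coupling succeeds independently with the same probability, that capping (when used) blocks every chain that failed to couple, and the function name `product_fractions` and the efficiency value used in the usage line are hypothetical, illustrative choices.

```python
def product_fractions(n_couplings: int, coupling_eff: float, capping: bool) -> dict:
    """Fractions of chains by outcome after n_couplings coupling cycles.

    Simplified model (assumption): every coupling succeeds independently with
    probability coupling_eff, and capping is perfectly efficient.
    """
    if n_couplings < 0 or not 0.0 <= coupling_eff <= 1.0:
        raise ValueError("invalid number of couplings or coupling efficiency")

    full_length = coupling_eff ** n_couplings
    failed_at_least_once = 1.0 - full_length

    if capping:
        # A chain that misses a coupling is blocked and stays truncated,
        # so it cannot later grow into a product carrying a deletion.
        return {"full_length": full_length,
                "truncated": failed_at_least_once,
                "deletion": 0.0}

    # Without capping, a chain that missed a coupling keeps extending in
    # later cycles and therefore ends up missing one or more bases.
    return {"full_length": full_length,
            "truncated": 0.0,
            "deletion": failed_at_least_once}


# Illustrative usage with an assumed per-cycle efficiency of 0.99.
with_cap = product_fractions(49, 0.99, capping=True)
without_cap = product_fractions(49, 0.99, capping=False)
```

In this model the full-length fraction is the same with or without capping; what changes is whether the failed chains end as short, blocked products or as near-full-length products carrying deletions.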

In some cases, the growing nucleic acid bound to the device is oxidized after addition of the nucleoside phosphoramidite, and optionally after capping and one or more washing steps. The oxidation step involves oxidation of the phosphite triester to a tetracoordinated phosphotriester, a protected precursor to the naturally occurring phosphodiester internucleoside linkage. In some cases, oxidation of the growing polynucleotide is achieved by treatment with iodine and water, optionally in the presence of a weak base (e.g., pyridine, lutidine, collidine). The oxidation can be carried out under anhydrous conditions using, for example, tert-butyl hydroperoxide or (1S)-(+)-(10-camphorsulfonyl)-oxaziridine (CSO). In some methods, a capping step is performed after the oxidation. This second capping step allows the device to dry, since residual water from oxidation that may persist can inhibit subsequent coupling. Following oxidation, the device and growing polynucleotide are optionally washed. In some cases, the oxidation step is replaced with a sulfurization step to obtain a polynucleotide phosphorothioate, wherein any capping step may be performed after sulfurization. A number of reagents are capable of effective sulfur transfer, including but not limited to 3-(dimethylaminomethylidene)amino)-3H-1,2,4-dithiazole-3-thione (DDTT), 3H-1,2-benzodithiol-3-one 1,1-dioxide (also known as Beaucage reagent), and N,N,N',N'-tetraethylthiuram disulfide (TETD).

To allow subsequent cycles of nucleoside incorporation to occur through coupling, the protecting group at the 5' end of the growing polynucleotide bound to the device is removed, allowing the primary hydroxyl group to react with the next nucleoside phosphoramidite. In some cases, the protecting group is DMT, and deblocking is performed with trichloroacetic acid in dichloromethane. Performing detritylation for extended periods of time, or with acid solutions stronger than recommended, can result in increased depurination of the polynucleotide bound to the solid support and thus reduced yield of the desired full-length product. The methods and compositions of the present disclosure described herein provide controlled deblocking conditions to limit undesirable depurination reactions. In some cases, the device-bound polynucleotide is washed after deblocking. In some cases, efficient washing after deblocking facilitates synthesis of polynucleotides with low error rates.

Polynucleotide synthesis methods generally comprise a series of iterative steps of: applying a protected monomer to an activated functionalized surface (e.g., a locus) to attach to an activated surface, a linker, or a previously deprotected monomer; deprotecting the applied monomer to make it reactive with a subsequently applied protected monomer; and applying another protected monomer for attachment. One or more intermediate steps include oxidation or sulfurization. In some cases, one or more washing steps may precede or follow one or all of the steps.
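The iterative steps above can be sketched as an ordered list of operations for one polynucleotide. This is a minimal illustration under stated assumptions: the function name `synthesis_schedule`, the step labels, and the choice to omit washing steps are all hypothetical and are not taken from the present disclosure. The only facts relied on are that synthesis proceeds in the 3' to 5' direction and that each cycle may include deblocking, coupling, optional capping, and oxidation or sulfurization.

```python
from typing import List, Optional, Tuple

Operation = Tuple[str, Optional[str]]  # (step name, base or None)


def synthesis_schedule(sequence_5to3: str,
                       capping: bool = True,
                       sulfurize: bool = False) -> List[Operation]:
    """Return the ordered chemical steps needed to build sequence_5to3.

    Synthesis runs 3' to 5', so bases are coupled in reverse order of the
    written (5' to 3') sequence. Washing steps are omitted for brevity.
    """
    sequence = sequence_5to3.upper()
    if not sequence or any(base not in "ACGT" for base in sequence):
        raise ValueError("sequence must be a non-empty string of A, C, G, T")

    operations: List[Operation] = []
    for cycle, base in enumerate(reversed(sequence)):
        if cycle > 0:
            # Remove the 5' protecting group (e.g., DMT) left by the
            # previous cycle so the next monomer can attach.
            operations.append(("deblock", None))
        operations.append(("couple", base))
        if capping:
            # Block chains that failed to couple in this cycle.
            operations.append(("cap", None))
        # Convert the phosphite triester to a stable linkage.
        operations.append(("sulfurize" if sulfurize else "oxidize", None))
    return operations


# Illustrative usage: the 3'-terminal base (G) is coupled first.
schedule = synthesis_schedule("ACG")
```

Returning the schedule as plain data, rather than executing it, keeps the sketch independent of any particular instrument interface.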

Phosphoramidite-based polynucleotide synthesis methods involve a series of chemical steps. In some cases, one or more steps of a synthetic method involve reagent cycling, wherein one or more steps of the method include applying reagents useful for that step to the device. For example, the reagents are cycled through a series of liquid phase deposition and vacuum drying steps. For substrates containing three-dimensional features such as wells, microwells, channels, etc., reagents optionally pass through one or more regions of the device via the wells and/or channels.

The methods and systems described herein relate to polynucleotide synthesis devices for synthesizing polynucleotides. The synthesis may be parallel. For example, at least or about at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 1000, 10000, 50000, 75000, 100000 or more polynucleotides may be synthesized in parallel. The total number of polynucleotides that can be synthesized in parallel may be 2-100000, 3-50000, 4-10000, 5-1000, 6-900, 7-850, 8-800, 9-750, 10-700, 11-650, 12-600, 13-550, 14-500, 15-450, 16-400, 17-350, 18-300, 19-250, 20-200, 21-150, 22-100, 23-50, 24-45, 25-40, or 30-35. One skilled in the art will appreciate that the total number of polynucleotides synthesized in parallel can be in any range defined by any of these values, e.g., 25-100. The total number of polynucleotides synthesized in parallel may be within any range defined by any value serving as an end of the range. The total molar mass of polynucleotides synthesized within the device, or the molar mass of each polynucleotide, may be at least or at least about 10, 20, 30, 40, 50, 100, 250, 500, 750, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 25000, 50000, 75000, 100000 picomoles, or greater. The length of each polynucleotide or the average length of the polynucleotides within the device may be at least or about at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 300, 400, 500 or more nucleotides. The length of each polynucleotide or the average length of the polynucleotides within the device may be up to or about up to 500, 400, 300, 200, 150, 100, 50, 45, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 or fewer nucleotides. The length of each polynucleotide or the average length of polynucleotides within the device may be between 10-500, 9-400, 11-300, 12-200, 13-150, 14-100, 15-50, 16-45, 17-40, 18-35, or 19-25. One skilled in the art will recognize that the length of each polynucleotide or the average length of polynucleotides within the device can be within any range defined by any of these values, such as 100-300. The length of each polynucleotide or the average length of polynucleotides within a device can be within any range defined by any value that serves as an end point of the range.

The methods of synthesizing polynucleotides on a surface provided herein allow for faster synthesis. As an example, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200 or more nucleotides per hour are synthesized. Nucleotides include adenine, guanine, thymine, cytosine, uridine building blocks, or analogs/modified forms thereof. In some cases, the polynucleotide libraries are synthesized in parallel on a substrate. For example, a device comprising about or at least about 100, 1,000, 10,000, 30,000, 75,000, 100,000, 1,000,000, 2,000,000, 3,000,000, 4,000,000, or 5,000,000 resolved loci can support the synthesis of at least the same number of different polynucleotides, wherein polynucleotides encoding different sequences are synthesized at resolved loci. In some cases, a polynucleotide library is synthesized on a device with a low error rate as described herein in less than about three months, two months, one month, three weeks, 15 days, 14 days, 13 days, 12 days, 11 days, 10 days, 9 days, 8 days, 7 days, 6 days, 5 days, 4 days, 3 days, 2 days, 24 hours, or less. In some cases, larger nucleic acids assembled from a polynucleotide library synthesized with a low error rate using the substrates and methods described herein are prepared in less than about three months, two months, one month, three weeks, 15 days, 14 days, 13 days, 12 days, 11 days, 10 days, 9 days, 8 days, 7 days, 6 days, 5 days, 4 days, 3 days, 2 days, 24 hours, or less.

In some cases, the methods described herein provide for generating a nucleic acid library comprising variant nucleic acids that differ at multiple codon sites. In some cases, a nucleic acid can have 1 site, 2 sites, 3 sites, 4 sites, 5 sites, 6 sites, 7 sites, 8 sites, 9 sites, 10 sites, 11 sites, 12 sites, 13 sites, 14 sites, 15 sites, 16 sites, 17 sites, 18 sites, 19 sites, 20 sites, 30 sites, 40 sites, 50 sites, or more variant codon sites.

In some cases, one or more of the variant codon sites may be adjacent. In some cases, one or more of the variant codon sites may be non-adjacent and separated by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more codons.

In some cases, a nucleic acid can comprise multiple variant codon sites, wherein all of the variant codon sites are adjacent to each other, forming a stretch of variant codon sites. In some cases, a nucleic acid can comprise multiple variant codon sites, wherein none of the variant codon sites are adjacent to each other. In some cases, a nucleic acid can comprise multiple variant codon sites, wherein some variant codon sites are adjacent to each other, forming a stretch of variant codon sites, and some variant codon sites are not adjacent to each other.
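As an illustration of variant codon sites, the sketch below enumerates a small predetermined library from a reference sequence and groups the variant sites into adjacent stretches. It is a minimal example under stated assumptions: the function names `variant_library` and `adjacent_stretches`, the 0-based codon indexing, and the example sequence and codons are hypothetical and do not come from the present disclosure.

```python
from itertools import product
from typing import Dict, Iterable, List


def variant_library(reference: str, site_codons: Dict[int, List[str]]) -> List[str]:
    """Enumerate every combination of the allowed codons at the given sites.

    site_codons maps a 0-based codon index to the codons permitted there.
    """
    if len(reference) % 3 != 0:
        raise ValueError("reference length must be a multiple of 3")
    codons = [reference[i:i + 3] for i in range(0, len(reference), 3)]
    sites = sorted(site_codons)
    if any(site < 0 or site >= len(codons) for site in sites):
        raise ValueError("codon site index out of range")

    variants = []
    for choice in product(*(site_codons[site] for site in sites)):
        new_codons = codons.copy()
        for site, codon in zip(sites, choice):
            new_codons[site] = codon
        variants.append("".join(new_codons))
    return variants


def adjacent_stretches(sites: Iterable[int]) -> List[List[int]]:
    """Group codon site indices into runs of directly adjacent sites."""
    runs: List[List[int]] = []
    for site in sorted(set(sites)):
        if runs and site == runs[-1][-1] + 1:
            runs[-1].append(site)
        else:
            runs.append([site])
    return runs


# Illustrative usage: two adjacent variant sites, two codons each.
library = variant_library("ATGGCTAAA", {1: ["GCT", "GAT"], 2: ["AAA", "CGT"]})
stretches = adjacent_stretches([1, 2, 3, 7, 9, 10])
```

Because the library size is the product of the number of codons allowed at each site, full enumeration like this is only practical for a modest number of sites and codons per site.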

Referring to the figures, fig. 8 illustrates an exemplary process workflow for synthesizing nucleic acids (e.g., genes) from shorter nucleic acids. The workflow is roughly divided into the following stages: (1) de novo synthesis of a single-stranded nucleic acid library, (2) ligation of nucleic acids to form larger fragments, (3) error correction, (4) quality control, and (5) transport. The desired nucleic acid sequence or set of nucleic acid sequences is pre-selected prior to de novo synthesis. For example, a set of genes is pre-selected for generation.

Once the large nucleic acids are selected for generation, a predetermined nucleic acid library is designed for de novo synthesis. Various suitable methods for generating high density polynucleotide arrays are known. In this workflow example, a device surface layer is provided. In this example, the chemistry of the surface is altered to improve the polynucleotide synthesis process. The low surface energy regions are created to repel liquid while the high surface energy regions are created to attract liquid. The surface itself may be in the form of a planar surface or contain changes in shape, such as protrusions or pores that increase the surface area. In this workflow example, the selected high surface energy molecule serves the dual function of supporting DNA chemistry, as disclosed in international patent application publication WO/2015/021080, which is incorporated herein by reference in its entirety.

In situ preparation of polynucleotide arrays is performed on a solid support and multiple oligomers are extended in parallel using a single nucleotide extension process. A deposition device, such as a material deposition device, is designed to release reagents in a stepwise manner such that multiple polynucleotides are extended in parallel one residue at a time to generate oligomers 802 having a predetermined nucleic acid sequence. In some cases, the polynucleotide is cleaved from the surface at this stage. Cleavage includes, for example, gas cleavage with ammonia or methylamine.

The generated polynucleotide library is placed in a reaction chamber. In this exemplary workflow, the reaction chambers (also referred to as "nanoreactors") are silicon-coated wells that contain PCR reagents and are lowered onto the polynucleotide library 803. Before or after the polynucleotides are sealed 804, a reagent is added to release the polynucleotides from the substrate. In this exemplary workflow, the polynucleotides are released after the nanoreactor is sealed 805. Once released, fragments of the single-stranded polynucleotides hybridize to span the entire long-range DNA sequence. Partial hybridization 805 is possible because each synthesized polynucleotide is designed to have a small portion that overlaps at least one other polynucleotide in the pool.

After hybridization, the PCA reaction is started. During the polymerase cycles, the polynucleotides anneal to complementary fragments and the gaps are filled in by the polymerase. Each cycle randomly increases the length of the individual fragments, depending on which polynucleotides find each other. The complementarity between the fragments allows the formation of a complete, large-span double-stranded DNA 806.
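The overlap design that makes this assembly possible can be sketched as follows. This is a simplified, hypothetical illustration, not the method of the present disclosure: it tiles a target with fixed-length oligos on alternating strands and then rebuilds the top strand from the ordered oligos by checking their overlaps. The function names, the fixed oligo length and overlap, and the example sequence are assumptions; a real assembly relies on annealing and polymerase extension rather than on knowing the oligo order.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(target: str, oligo_len: int, overlap: int) -> List[str]:
    """Tile target with oligos that overlap their neighbours by `overlap` bases.

    Oligos alternate between the top strand and the bottom strand, so each
    oligo is partially complementary to the next one in the tiling.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start, index = 0, 0
    while True:
        fragment = target[start:start + oligo_len]
        oligos.append(fragment if index % 2 == 0 else revcomp(fragment))
        if start + oligo_len >= len(target):
            break
        start += step
        index += 1
    return oligos


def assemble(oligos: List[str], overlap: int) -> str:
    """Rebuild the top strand from ordered, alternating-strand oligos."""
    assembled = oligos[0]
    for index, oligo in enumerate(oligos[1:], start=1):
        top = oligo if index % 2 == 0 else revcomp(oligo)
        if assembled[-overlap:] != top[:overlap]:
            raise ValueError("oligos do not overlap as designed")
        assembled += top[overlap:]
    return assembled


# Illustrative usage with a hypothetical 30-base target.
target = "ATGGCTAAAGGTCCATTGCAGTACGATCGT"
oligos = design_oligos(target, oligo_len=12, overlap=4)
rebuilt = assemble(oligos, overlap=4)
```

The overlap check in `assemble` mirrors the requirement in the text that each polynucleotide share a small portion with at least one other polynucleotide in the pool.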

After PCA is completed, the nanoreactor is separated from the device 807 and positioned for interaction with a device having PCR primers 808. After sealing, the nanoreactor is subjected to PCR 809 and the larger nucleic acids are amplified. After PCR 810, the nanochamber is opened 811, error correction reagents are added 812, the chamber is sealed 813, and an error correction reaction is performed to remove mismatched base pairs and/or strands with poor complementarity from the double-stranded PCR amplification products 814. The nanoreactor is opened and separated 815. The error-corrected product is next subjected to additional processing steps, such as PCR and molecular barcoding, and then packaged 822 for transport 823.

In some cases, quality control measures are taken. After error correction, the quality control steps include, for example, interaction with a wafer having sequencing primers for amplifying the error-corrected product 816, sealing the wafer to a chamber containing the error-corrected amplification product 817, and performing an additional round of amplification 818. The nanoreactor is opened 819, and the products are pooled 820 and sequenced 821. After an acceptable quality control result is obtained, the packaged product 822 is approved for shipment 823.

In some cases, nucleic acids generated by a workflow such as in fig. 8 are mutagenized using overlapping primers disclosed herein. In some cases, a primer library is generated by in situ preparation on a solid support and multiple oligomers are extended in parallel using a single nucleotide extension process. A deposition device, such as a material deposition device, is designed to release reagents in a stepwise manner such that multiple polynucleotides are extended in parallel one residue at a time to generate oligomers 802 having a predetermined nucleic acid sequence.

Computer system

Any of the systems described herein can be operatively connected to a computer and can be automated locally or remotely by the computer. In various instances, the methods and systems of the present disclosure may further include software programs on a computer system and uses thereof. Thus, computerized control of the synchronization of the dispensing/vacuuming/refilling functions (e.g., programming and synchronizing material deposition device movement, dispensing action, and vacuum actuation) is within the scope of the present disclosure. The computer system can be programmed to interface between the user-specified base sequence and the location of the material deposition device to deliver the correct reagent to the specified region of the substrate.
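One way to picture the interface between user-specified base sequences and deposition locations is a per-cycle plan that lists, for each base, the loci that should receive it. The sketch below is illustrative only: the function name `deposition_plan`, the use of (row, column) tuples to identify loci, and the example sequences are assumptions and do not describe the control software of the present disclosure.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Locus = Tuple[int, int]  # hypothetical (row, column) address of a locus


def deposition_plan(locus_sequences: Dict[Locus, str]) -> List[Dict[str, List[Locus]]]:
    """For each synthesis cycle, map each base to the loci that receive it.

    Sequences are written 5' to 3'; because synthesis runs 3' to 5',
    cycle 0 deposits the last written base of every sequence.
    """
    n_cycles = max((len(seq) for seq in locus_sequences.values()), default=0)
    plan = []
    for cycle in range(n_cycles):
        by_base = defaultdict(list)
        for locus, seq in sorted(locus_sequences.items()):
            if cycle < len(seq):  # shorter sequences are already finished
                by_base[seq[-1 - cycle]].append(locus)
        plan.append(dict(by_base))
    return plan


# Illustrative usage with two loci carrying sequences of different length.
plan = deposition_plan({(0, 0): "ACG", (0, 1): "TG"})
```

Grouping loci by base within a cycle reflects the idea that a given reagent is delivered to all regions that need it before the next reagent is dispensed.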

The computer system 900 shown in fig. 9 may be understood as a logical device capable of reading instructions from the media 911 and/or the network port 905, which may optionally be connected to a server 909 having a fixed media 912. A system such as that shown in fig. 9 may include a CPU 901, a disk drive 903, an optional input device such as a keyboard 915 and/or mouse 916, and an optional monitor 907. Data communication with a server at a local or remote location may be accomplished through the communication media shown. A communication medium may include any means for transmitting and/or receiving data. The communication medium may be a network connection, a wireless connection, or an internet connection, for example. Such connections may provide for communication via the world wide web. It is contemplated that data related to the present disclosure may be transmitted over such a network or connection for receipt and/or review by user party 922 as shown in fig. 9.

Fig. 10 is a block diagram illustrating a first example architecture of a computer system 1000 that may be used in connection with example instances of the present disclosure. As shown in FIG. 10, the example computer system may include a processor 1002 for processing instructions. Non-limiting examples of processors include: an Intel Xeon processor, an AMD Opteron processor, a Samsung 32-bit RISC ARM 1176JZ(F)-S v1.0™ processor, an ARM Cortex-A8 Samsung S5PC100™ processor, an ARM Cortex-A8 Apple A4™ processor, a Marvell PXA930™ processor, or a functionally equivalent processor. Multiple threads of execution may be used for parallel processing. In some cases, multiple processors or processors with multiple cores may also be used, whether in a single computer system, in a cluster, or distributed across systems over a network containing multiple computers, cell phones, and/or personal data assistant devices.

As shown in fig. 10, a cache memory 1004 may be connected to or incorporated within the processor 1002 to provide a high-speed store of instructions or data that are recently or frequently used by the processor 1002. The processor 1002 is connected to the north bridge 1006 by a processor bus 1008. The north bridge 1006 is coupled to Random Access Memory (RAM) 1010 through a memory bus 1012 and manages access to the RAM 1010 by the processor 1002. The north bridge 1006 is also connected to a south bridge 1014 through a chipset bus 1016. In turn, south bridge 1014 connects to peripheral bus 1018. The peripheral bus may be, for example, PCI, PCI-X, PCI Express, or other peripheral bus. The north bridge and south bridge are commonly referred to as a processor chipset and manage data transfers between the processor, RAM, and peripheral components on the peripheral bus 1018. In some alternative architectures, the functionality of the north bridge may be incorporated into the processor, rather than using a separate north bridge chip. In some cases, the system 1000 may include an accelerator card 1022 attached to the peripheral bus 1018. The accelerator may include a Field Programmable Gate Array (FPGA) or other hardware for accelerating some processing. For example, the accelerator may be used for adaptive data reconstruction or to evaluate algebraic expressions used in extended set processing.

Software and data are stored in external memory 1024 and may be loaded into RAM 1010 and/or cache memory 1004 for use by the processor. System 1000 includes an operating system for managing system resources; non-limiting examples of operating systems include: Linux, Windows, MACOS™, BlackBerry OS™, iOS™, and other functionally equivalent operating systems, as well as application software running on top of the operating system for managing data storage and optimization according to example instances of the present disclosure. In this example, system 1000 also includes Network Interface Cards (NICs) 1020 and 1021 connected to the peripheral bus to provide a network interface to external storage, such as Network Attached Storage (NAS), and to other computer systems that may be used for distributed parallel processing.

FIG. 11 is a diagram showing a network 1100 having multiple computer systems 1102a and 1102b, multiple cellular telephones and personal data assistants 1102c, and Network Attached Storage (NAS) 1104a and 1104b. In an example instance, systems 1102a, 1102b, and 1102c can manage data storage and optimize data access to data stored in Network Attached Storage (NAS) 1104a and 1104b. A mathematical model can be used for this data and evaluated using distributed parallel processing across computer systems 1102a and 1102b and cell phones and personal data assistant systems 1102c. Computer systems 1102a and 1102b and cellular telephone and personal data assistant system 1102c may also provide parallel processing of adaptive data reconstruction of data stored in Network Attached Storage (NAS) 1104a and 1104b. Fig. 11 illustrates only one example, and a wide variety of other computer architectures and systems can be used with the various examples of the present disclosure. For example, blade servers may be used to provide parallel processing. Processor blades may be connected through a backplane to provide parallel processing. The storage may also be connected to the backplane through a separate network interface or as Network Attached Storage (NAS). In some example instances, a processor may maintain separate memory spaces and transmit data through a network interface, backplane, or other connector for parallel processing by other processors. In other cases, some or all of the processors may use a shared virtual address memory space.

FIG. 12 is a block diagram of a multiprocessor computer system using a shared virtual address memory space according to an example instance. The system includes a plurality of processors 1202a-f that can access a shared memory subsystem 1204. In this system, a plurality of programmable hardware memory algorithm processors (MAPs) 1206a-f are incorporated in the memory subsystem 1204. Each MAP 1206a-f may contain memory 1208a-f and one or more Field Programmable Gate Arrays (FPGAs) 1210a-f. The MAP provides configurable functional units and may provide specific algorithms or portions of algorithms to the FPGAs 1210a-f for processing in close cooperation with a corresponding processor. For example, in an example instance, a MAP may be used to evaluate algebraic expressions associated with data models and to perform adaptive data reconstruction. In this example, each MAP is globally accessible by all processors for these purposes. In one configuration, each MAP may use Direct Memory Access (DMA) to access the associated memory 1208a-f to perform tasks independently and asynchronously from the respective microprocessor 1202a-f. In this configuration, a MAP may feed the results directly to another MAP for pipelined processing and parallel execution of algorithms.

The above computer architectures and systems are merely examples, and a wide variety of other computer, cellular telephone, and personal data assistant architectures and systems can be used in conjunction with the example instances, including systems using any combination of general purpose processors, co-processors, FPGAs and other programmable logic devices, System On Chip (SOC), Application Specific Integrated Circuits (ASICs), and other processing and logic elements. In some cases, all or a portion of a computer system may be implemented in software or hardware. Any kind of data storage medium may be used in conjunction with the example instances, including random access memory, hard disk drives, flash memory, tape drives, disk arrays, Network Attached Storage (NAS), and other local or distributed data storage devices and systems.

In an example instance, the computer system may be implemented using software modules executing on any of the above-described or other computer architectures and systems. In other examples, the functionality of the system may be partially or fully implemented in firmware, a programmable logic device such as a Field Programmable Gate Array (FPGA) as referenced in fig. 10, a system on a chip (SOC), an Application Specific Integrated Circuit (ASIC), or other processing and logic elements. For example, a set processor and optimizer may be implemented with hardware acceleration using a hardware accelerator card, such as accelerator card 1022 shown in FIG. 10.

The following examples are set forth to more clearly illustrate the principles and practice of the embodiments disclosed herein to those skilled in the art and are not to be construed as limiting the scope of any claimed embodiments. All parts and percentages are by weight unless otherwise indicated.

Examples

The following examples are given for the purpose of illustrating various embodiments of the present disclosure and are not intended to limit the invention in any way. These examples, as well as the methods presently representative of preferred embodiments, are illustrative and not intended to limit the scope of the present disclosure. Variations thereof and other uses will occur to those skilled in the art and are encompassed within the spirit of the disclosure as defined by the scope of the claims.

Example 1: functionalization of device surfaces

The device was functionalized to support the attachment and synthesis of a library of polynucleotides. The device surface was first wet cleaned for 20 min using a piranha solution containing 90% H2SO4 and 10% H2O2. The device was rinsed in several beakers containing deionized water, held under a deionized water gooseneck faucet for 5 min, and dried with N2. The device was then soaked in NH4OH (1:100; 3 mL:300 mL) for 5 min, rinsed with deionized water using a handgun, soaked for 1 min each in three consecutive beakers containing deionized water, and then rinsed again with deionized water using the handgun. The device was then plasma cleaned by exposing the device surface to O2. A SAMCO PC-300 instrument was used to perform the O2 plasma etch at 250 watts for 1 min in downstream mode.

The cleaned device surface was actively functionalized with a solution comprising N-(3-triethoxysilylpropyl)-4-hydroxybutyramide using a YES-1224P vapor deposition oven system with the following parameters: 0.5 to 1 torr, 60 min, 70 °C, 135 °C. The device surface was resist coated using a Brewer Science 200X spin coater. SPR™ 3612 photoresist was spin coated on the device at 2500 rpm for 40 sec. The device was pre-baked for 30 min at 90 °C on a Brewer hot plate. The device was subjected to photolithography using a Karl Suss MA6 mask aligner. The device was exposed for 2.2 sec and developed for 1 min in MSF 26A. Remaining developer was rinsed off with the handgun, and the device was soaked in water for 5 min. The device was baked for 30 min at 100 °C in the oven, followed by visual inspection for lithography defects using a Nikon L200. A descum process was used to remove residual resist, using the SAMCO PC-300 instrument to O2 plasma etch at 250 watts for 1 min.

The device surface was passivation functionalized with 100 µL of perfluorooctyltrichlorosilane mixed with 10 µL of light mineral oil. The device was placed in a chamber and pumped for 10 min, and then the valve to the pump was closed and the device was left to stand for 10 min. The chamber was vented. The device was resist stripped by performing two 5 min soaks in 500 mL NMP at 70 °C with ultrasonication at maximum power (9 on the Crest system). The device was then soaked for 5 min in 500 mL isopropanol at room temperature with ultrasonication at maximum power. The device was dipped in 300 mL of 200 proof ethanol and blown dry with N2. The functionalized surface was activated to serve as a support for polynucleotide synthesis.

Example 2: Synthesis of a 50-mer sequence on an oligonucleotide synthesis device

A two-dimensional oligonucleotide synthesis device was assembled into a flow cell, which was connected to an Applied Biosystems ABI394 DNA synthesizer. The two-dimensional oligonucleotide synthesis device was uniformly functionalized with N-(3-triethoxysilylpropyl)-4-hydroxybutyramide (Gelest) and was used to synthesize an exemplary polynucleotide of 50 bp ("50-mer polynucleotide").

The sequence of the 50-mer was as follows: 5'AGACAATCAACCATTTGGGGTGGACAGCCTTGACCTCTAGACTTCGGCAT##TTTTTTTTTT3', where # denotes thymidine-succinyl hexamide CED phosphoramidite (CLP-2244 from ChemGenes), a cleavable linker that enables release of the oligonucleotide from the surface during deprotection.

The synthesis was performed using standard DNA synthesis chemistry (coupling, capping, oxidation, and deblocking) according to the protocol in Table 2 on the ABI synthesizer.

Table 2: Synthesis protocol

The phosphoramidite/activator combination was delivered in the same way as the bulk reagents were delivered through the flow cell. No drying steps were performed, because the environment stays "wet" with reagent the entire time.

The flow restrictor was removed from the ABI394 synthesizer to enable faster flow. Without the flow restrictor, flow rates were approximately 100 µL/sec for amidites (0.1 M in ACN), activator (0.25 M benzoylthiotetrazole ("BTT"; 30-3070-xx from Glen Research) in ACN), and Ox (0.02 M I2 in 20% pyridine, 10% water, and 70% THF); approximately 200 µL/sec for acetonitrile ("ACN") and capping reagents (a 1:1 mix of Cap A and Cap B, where Cap A is acetic anhydride in THF/pyridine and Cap B is 16% 1-methylimidazole in THF); and approximately 300 µL/sec for Deblock (3% dichloroacetic acid in toluene). By comparison, flow rates for all reagents were approximately 50 µL/sec with the flow restrictor in place. The time needed to completely push out the oxidizer was observed, the timing of the chemical flow steps was adjusted accordingly, and an extra ACN wash was introduced between different chemicals. After polynucleotide synthesis, the chip was deprotected in gaseous ammonia overnight at 75 psi. Five drops of water were applied to the surface to recover the polynucleotides. The recovered polynucleotides were then analyzed on a BioAnalyzer small RNA chip.

Example 3: Synthesis of a 100-mer sequence on an oligonucleotide synthesis device

Using the same process described in Example 2 for the synthesis of the 50-mer sequence, a 100-mer polynucleotide ("100-mer polynucleotide"; 5'CGGGATCCTTATCGTCATCGTCGTACAGATCCCGACCCATTTGCTGTCCACCAGTCATGCTAGCCATACCATGATGATGATGATGATGAGAACCCCGCAT##TTTTTTTTTT3', where # denotes thymidine-succinyl hexamide CED phosphoramidite (CLP-2244 from ChemGenes)) was synthesized on two different silicon chips, the first uniformly functionalized with N-(3-triethoxysilylpropyl)-4-hydroxybutyramide and the second functionalized with a 5/95 mix of 11-acetoxyundecyltriethoxysilane and n-decyltriethoxysilane. The polynucleotides extracted from the surface were analyzed on a BioAnalyzer instrument.

All ten samples from the two chips were further PCR amplified using a forward primer (5'ATGCGGGGTTCTCATCATC3') and a reverse primer (5'CGGGATCCTTATCGTCATCG3') in a 50 µL PCR mix (25 µL NEB Q5 master mix, 2.5 µL 10 µM forward primer, 2.5 µL 10 µM reverse primer, 1 µL polynucleotide extracted from the surface, and water up to 50 µL) using the following thermal cycling program:

98 °C, 30 sec

98 °C, 10 sec; 63 °C, 10 sec; 72 °C, 10 sec; repeat for 12 cycles

72 °C, 2 min
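The PCR setup above can be sanity-checked with a short script. The following is a minimal sketch, not part of the disclosed method: it confirms that the two primers flank the 100-mer, computes the water needed to bring the mix to 50 µL, and totals the programmed hold times. The function names are illustrative assumptions, and the hold-time total ignores instrument ramping.

```python
# Sequences copied from the example text. The cleavable linker (##) and the
# poly-T tail are omitted because only the 100-mer portion is amplified.
TEMPLATE = (
    "CGGGATCCTTATCGTCATCGTCGTACAGATCCCGACCCATTTGCTGTCCA"
    "CCAGTCATGCTAGCCATACCATGATGATGATGATGATGAGAACCCCGCAT"
)
FORWARD_PRIMER = "ATGCGGGGTTCTCATCATC"
REVERSE_PRIMER = "CGGGATCCTTATCGTCATCG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_flank_template(template: str, forward: str, reverse: str) -> bool:
    """Check that the primer pair spans the full synthesized strand.

    The synthesized strand is written 5'->3'. The reverse primer is identical
    to its 5' end, and the forward primer is the reverse complement of its
    3' end.
    """
    return template.startswith(reverse) and template.endswith(
        reverse_complement(forward)
    )


def water_volume_ul(total_ul: float, component_volumes_ul: list[float]) -> float:
    """Volume of water needed to bring the listed components to total_ul."""
    return total_ul - sum(component_volumes_ul)


def programmed_hold_time_sec(initial: int, cycle_steps: list[int],
                             n_cycles: int, final: int) -> int:
    """Sum of programmed hold times only (ramp time is not modeled)."""
    return initial + n_cycles * sum(cycle_steps) + final


flanked = primers_flank_template(TEMPLATE, FORWARD_PRIMER, REVERSE_PRIMER)
water = water_volume_ul(50.0, [25.0, 2.5, 2.5, 1.0])   # master mix, primers, template
hold = programmed_hold_time_sec(30, [10, 10, 10], 12, 120)
print(flanked, water, hold)
```

With the volumes listed in the example, the water volume works out to 19 µL, and the programmed holds total 510 sec.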

The PCR products were also run on a BioAnalyzer, showing sharp peaks at the 100-mer position. The PCR-amplified samples were then cloned and Sanger sequenced. Table 3 summarizes the Sanger sequencing results for samples taken from spots 1-5 of chip 1 and spots 6-10 of chip 2.

Table 3: sequencing results

Thus, the high quality and uniformity of the synthesized polynucleotides were reproduced on two chips with different surface chemistries. Overall, 233 of the 262 sequenced 100-mers (89%) were perfect sequences with no errors.
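The perfect-sequence fraction can be reproduced directly from the counts, and it can be turned into a rough per-base error estimate. The sketch below is illustrative only. It assumes that errors occur independently and with equal probability at each of the 100 positions, which the text does not state, and the result also includes any errors introduced by PCR and cloning. It is not a substitute for the error characteristics reported in Table 4.

```python
def perfect_fraction(perfect: int, total: int) -> float:
    """Fraction of sequenced clones with no errors."""
    return perfect / total


def implied_per_base_error_rate(fraction_perfect: float, length: int) -> float:
    """Per-base error rate e implied by fraction_perfect = (1 - e) ** length.

    Assumption (not from the source): errors are independent and equally
    likely at every position.
    """
    return 1.0 - fraction_perfect ** (1.0 / length)


frac = perfect_fraction(233, 262)
rate = implied_per_base_error_rate(frac, 100)
print(f"{frac:.1%} perfect; implied error rate of about 1 in {1 / rate:.0f} bases")
```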

Table 4 summarizes the error characteristics of the sequences obtained from the polynucleotide samples from spots 1-10.

Table 4: error characterization

Example 4: Antibody optimization

Libraries generated from parental sequences

Antibody sequences targeting PD-1 were designed in silico by generating a library containing mutations from 12 individuals. The heavy chain and light chain mutational space was derived from the parental sequence and the closest germline sequence to generate the NGS database. The NGS database comprised sequences of light chain CDR1-3 and heavy chain CDR1-3 containing mutations relative to the parental reference or germline sequence. All CDR sequences were represented in two or more individuals in the NGS database. The input sequence is shown in FIG. 3A. The library contained 5.9 x 10^7 different heavy chains and 2.9 x 10^6 different light chains (FIG. 3B).
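The requirement that every CDR sequence be represented in two or more individuals amounts to a simple filter over per-donor sequence sets. The following is a minimal sketch under stated assumptions: the donor labels and CDR strings are invented placeholders, and the function name and data layout are hypothetical rather than taken from the disclosure.

```python
from collections import defaultdict


def cdrs_shared_across_donors(donor_cdrs: dict[str, set[str]],
                              min_donors: int = 2) -> set[str]:
    """Return CDR sequences observed in at least min_donors individuals."""
    donors_by_cdr: dict[str, set[str]] = defaultdict(set)
    for donor, cdrs in donor_cdrs.items():
        for cdr in cdrs:
            donors_by_cdr[cdr].add(donor)
    return {cdr for cdr, donors in donors_by_cdr.items()
            if len(donors) >= min_donors}


# Hypothetical toy input: three donors with made-up CDR strings.
toy_donor_cdrs = {
    "donor_01": {"GFTFSSYA", "GYTFTSYY"},
    "donor_02": {"GFTFSSYA"},
    "donor_03": {"GYTFTSYY", "GGSISSGGYY"},
}
shared = cdrs_shared_across_donors(toy_donor_cdrs)
print(sorted(shared))
```

In this toy input, the sequence seen in only one donor is dropped, and the two sequences seen in two donors each are kept.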

Bead-based selection

C-terminally biotinylated PD-1 antigen was bound to streptavidin-coated magnetic beads for five rounds of selection. Bead-binding variants were depleted between rounds. Selection stringency was increased with each subsequent round, and the enrichment ratio tracked on-target binding.

ELISA and next-generation sequencing

Constructs expressing the heavy chain and light chain combinations were synthesized and displayed on phage to identify improved PD-1 binders. The pool was sequenced with 10 million reads, and 400,000 unique clones were identified. The read length distribution was highly uniform (FIG. 4A). Clone frequency and accumulation were measured at each round of panning (FIGS. 4B and 4C). Four different stringency conditions were used for panning (FIG. 4D). High-stringency selection enriched the same pool of binders in replicate selections. Low-stringency selection recovered 44 of the same 70 clones as high-stringency selection (63%), along with a wider range of low-affinity binders. Most scFv binders were enriched in round 5 (FIG. 5A). In ELISA experiments measuring scFv binding to PD-1, more than 90% (68/75) of the clones enriched to more than 0.01% of the population gave a signal 5-fold over background.
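The overlap figure quoted above (44 of 70, or 63%) is a set intersection between the clones enriched under two selection conditions. The sketch below shows that calculation on a toy input constructed to match the reported counts. The clone identifiers and the 30 low-stringency-only clones are invented for illustration, and the function name is an assumption.

```python
def recovered_fraction(reference_clones, other_clones):
    """Count reference clones that also appear in another selection.

    Returns (shared_count, reference_count, shared_count / reference_count).
    """
    reference = set(reference_clones)
    shared = reference & set(other_clones)
    return len(shared), len(reference), len(shared) / len(reference)


# Toy identifiers only: 70 high-stringency clones, 44 of which also appear in
# the low-stringency selection alongside 30 made-up low-stringency-only clones.
high_stringency = {f"clone_{i:03d}" for i in range(70)}
low_stringency = ({f"clone_{i:03d}" for i in range(44)}
                  | {f"low_only_{i:03d}" for i in range(30)})

shared, total, fraction = recovered_fraction(high_stringency, low_stringency)
print(f"{shared}/{total} = {fraction:.0%}")
```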

Five rounds of selection were completed with three different initial selection conditions. Clonal enrichment was followed by NGS through each successive round. Sequences enriched among off-target or background binders were removed. Approximately 1000 clones present in round 5 were enriched to more than 0.01% of the population. Sequence analysis showed that the vast majority (>95%) of clones enriched for binding to PD-1 were captured identically under the different selection conditions (FIG. 4E).
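The enrichment filter described in this paragraph can be illustrated as follows. This is a minimal sketch rather than the pipeline actually used: it assumes each NGS read has already been reduced to a clone identifier, computes clone frequencies for the final round, drops clones flagged as background binders, and keeps clones above 0.01% of the population. The function names and the toy read counts are hypothetical.

```python
from collections import Counter


def clone_frequencies(reads):
    """Map each clone identifier to its fraction of all reads in one round."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}


def enriched_clones(final_round_reads, background_clones=frozenset(),
                    min_fraction=1e-4):
    """Clones above min_fraction (0.01% = 1e-4) that are not background binders."""
    freqs = clone_frequencies(final_round_reads)
    return {clone: freq for clone, freq in freqs.items()
            if freq > min_fraction and clone not in background_clones}


# Toy round-5 read list (100,000 reads). Clone "B" stands in for a background
# binder; clones "C" and "D" fall below the 0.01% threshold.
round5_reads = ["A"] * 50000 + ["B"] * 49990 + ["C"] * 9 + ["D"] * 1
hits = enriched_clones(round5_reads, background_clones={"B"})
print(hits)
```

Calling `clone_frequencies` on the reads from each round in turn gives the per-round frequencies needed to follow a clone through successive rounds.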

High throughput IgG characterization

Clones were transiently transfected into Expi293 cells and purified on KingFisher and Hamilton automation platforms. Yield and purity were confirmed by Perkin Elmer LabChip and analytical HPLC. Binding affinity and epitope binning of >170 IgG variants were evaluated using the Carterra LSA system (data not shown).

The optimized IgGs bound PD-1 with affinity similar to or better than that of comparator antibodies having a sequence corresponding to pembrolizumab (Comparator 1) and a sequence corresponding to nivolumab (Comparator 2) (data not shown). Binding of scFvs to PD-1 was also measured, as shown in FIG. 5B. Compared with Comparator 1 and Comparator 2, the sequences of the optimized antibodies contained fewer germline mutations (FIG. 6A and Table 5A). The light chains of the optimized antibodies were highly diverse, with more than 90% of the clones containing a unique light chain that was never repeated (Table 5B). The optimized IgGs showed a 100-fold improvement in monovalent binding affinity compared with the parental sequence. Clone PD1-1 bound PD-1 with a KD of 4.52 nM, and several other clones showed binding affinities of <10 nM. FIG. 6B shows the increase in affinity achieved by optimization using the methods described herein. These high-affinity binders each contained a unique CDRH3 and did not cluster into a single sequence lineage. Diversity across the various CDRs was also observed, as shown in FIGS. 6C-6D. The PD-1 antibodies showed improved binding affinity compared with wild type (FIGS. 6E-6F and Table 5C).

Table 5A. PD-1 variant sequences

Table 5B.

Table 5C.

Functional and developability analysis

IgGs optimized by the methods described herein were tested for functional blockade of the PD-1/PD-L1 interaction. FIG. 7A shows that the high-affinity variants had improved IC50 values compared with wild type and with a comparator anti-PD-1 antibody having a sequence corresponding to nivolumab. IC50 correlated strongly with monovalent binding affinity. As shown in FIG. 7B, the optimized IgGs showed improved binding affinity (up to 72-fold) together with a 9.5-fold increase in function. Six antibodies with higher binding affinity and function than the antibody having a sequence corresponding to nivolumab were identified. As shown in FIG. 7C, addition of anti-PD-1 IgG blocked the PD-1/PD-L1 interaction, releasing the inhibitory signal and resulting in TCR activation and NFAT-RE-mediated luminescence (RU). In addition, all binders retained binding to cynomolgus monkey PD-1. Several high-affinity IgGs showed low polyspecificity scores as measured by BVP binding ELISA (FIG. 7D). The IgGs were also tested for Tm and Tagg on an Unchained Labs UNCLE instrument and by analytical HPLC.

While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments of the disclosure described herein may be employed in practicing the disclosure. It is intended that the scope of the disclosure be defined by the following claims and that the methods and structures within the scope of these claims and their equivalents be covered thereby.
