Functionally independent labelling of organic compounds

Document No.: 689635 Publication date: 2021-04-30

Summary: Functionally independent labelling of organic compounds (有机化合物功能性独立的标记), created by 杨光 (Yang Guang), 理查德·A·乐纳 (Richard A. Lerner), 姜标 (Jiang Biao), 马培翔 (Ma Peixiang), and 许红涛 (Xu Hongtao) on 2019-07-17. Disclosed herein are methods for labeling organic compounds independent of any functional group of the compound. In some embodiments, bifunctional linkers useful in the methods are provided.

1. A linker precursor molecule C comprising a first functional group capable of generating a carbene, nitrene, or radical capable of reacting with an organic compound A, and a second functional group which is unreactive under the conditions under which the first functional group reacts with the organic compound A but is capable of reacting with a second organic compound B.

2. The linker precursor molecule C of claim 1 of the formula R-L-M wherein R is a first functional group, M is a second functional group, L is a linker moiety comprising one or more moieties independently selected from optionally substituted alkylene, optionally substituted alkenylene, optionally substituted alkynylene, optionally substituted heteroalkylene, optionally substituted cycloalkylene, optionally substituted heterocycloalkylene, optionally substituted arylene, and optionally substituted heteroarylene.

3. The linker precursor molecule C of claim 1 or 2 wherein the first functional group comprises diazomethane, aryl azide, benzophenone, or diazo.

4. The linker precursor molecule C according to claim 1 or 2 wherein the first functional group is selected from

wherein [symbol not reproduced] indicates the attachment site to the rest of linker precursor molecule C.

5. The linker precursor molecule C according to any one of claims 1-4 wherein the second functional group comprises an azide, an alkyne, an alkene, or -C(O)-O-.

6. The linker precursor molecule C according to any of claims 1 to 4 wherein the second functional group is selected from

wherein [symbol not reproduced] indicates the attachment site to the rest of linker precursor molecule C.

7. The linker precursor molecule C of any one of claims 2 to 6 wherein L is an alkylene group wherein one or more methylene units of the alkylene group are optionally substituted with a moiety independently selected from NR1, O, S, CO, SO, SO2, NR1C(O), C(O)NR1, cycloalkylene, heterocycloalkylene, arylene, and heteroarylene, wherein R1 is hydrogen or alkyl.

8. The linker precursor molecule C of any one of claims 2 to 6 wherein L is alkylene, wherein one or more methylene units of alkylene are substituted with a moiety independently selected from O, NHC(O), and C(O)NH.

9. A linker precursor molecule C selected from

10. A linker precursor molecule C selected from

11. A compound selected from table 3 or a stereoisomer or mixture of stereoisomers thereof.

12. A labeled organic compound E prepared by a process comprising the steps of:

(1) reacting the first functional group of the linker precursor molecule C according to any one of claims 1 to 10 with an organic compound A under carbene, nitrene, or radical reaction conditions to produce an intermediate D having the second functional group of the linker precursor molecule C according to any one of claims 1 to 10; and

(2) reacting the second functional group with a labeling molecule B bearing a label.

13. The labeled organic compound E of claim 12, wherein the organic compound A does not comprise a reactive functionality.

14. The labeled organic compound E of claim 12 or 13, wherein the label comprises a unique sequence or a fluorescent tag, or a combination thereof.

15. The labeled organic compound E of claim 14, wherein the unique sequence comprises a single-stranded DNA or RNA, a double-stranded DNA or RNA, a triple-stranded DNA or RNA, a multi-stranded natural or artificial oligonucleotide, or a chemically modified oligonucleotide.

16. The labeled organic compound E of claim 13 wherein the unique sequence comprises an oligonucleotide.

17. A compound selected from table 4 or a stereoisomer or mixture of stereoisomers thereof.

18. A set of labelled organic compounds which are isomers comprising a labelling moiety, a linker and an organic compound moiety, characterised in that the isomers each have the same labelling moiety, the same linker and the same organic compound moiety, and in that in each different isomer the labelling moiety is attached to a different site of the organic compound moiety through the linker.

19. A mixture comprising (1) a plurality of isomeric compounds, wherein the isomeric compounds are produced by a carbene, nitrene, or radical reaction of organic compound A with a linker precursor molecule C according to any one of claims 1 to 10, and/or (2) one or more pairs of compounds, which are disproportionation products of compound D produced by a carbene, nitrene, or radical reaction of organic compound A with a linker precursor molecule C.

20. The mixture of claim 19, wherein the organic compound A does not comprise a reactive functionality.

21. The mixture of claim 19 or 20, wherein the pair of disproportionation products includes compound F, having a molecular weight equal to that of compound D produced by the carbene, nitrene, or radical reaction plus 2, and compound F', having a molecular weight equal to that of compound D minus 2.

22. A mixture of a plurality of isomerically labeled compounds produced by a process comprising the steps of:

(1) reacting the first functional group of the linker precursor molecule C of any one of claims 1-10 with an organic compound A under carbene, nitrene, or radical reaction conditions to produce an intermediate D having the second functional group of the linker precursor molecule of any one of claims 1-10; and

(2) reacting the second functional group with a labeling molecule B bearing a label.

23. The mixture of claim 22, wherein the label comprises a unique sequence or a fluorescent tag, or a combination thereof.

24. The mixture of claim 23, wherein the unique sequences comprise single-stranded DNA or RNA, double-stranded DNA or RNA, triple-stranded DNA or RNA, or multi-stranded natural or artificial oligonucleotides or chemically modified oligonucleotides.

25. The mixture of claim 23, wherein the unique sequences comprise oligonucleotides.

26. A method for labeling an organic compound A, which comprises

(1) reacting the first functional group of the linker precursor molecule C according to any one of claims 1 to 10 with an organic compound A under carbene, nitrene, or radical reaction conditions to produce an intermediate D having the second functional group of the linker precursor molecule according to any one of claims 1 to 10; and

(2) reacting the second functional group with a labeling molecule B bearing a label.

27. The method of claim 24, wherein the organic compound A does not comprise a reactive functionality.

28. The method of claim 26 or 27, wherein the first functional group of linker precursor molecule C reacts with multiple sites of organic compound A to produce a mixture of multiple isomeric products.

29. The method of any one of claims 26 to 28, wherein the label comprises a unique sequence or a fluorescent tag, or a combination thereof.

30. The method of claim 29, wherein the unique sequence comprises single-stranded DNA or RNA, double-stranded DNA or RNA, triple-stranded DNA or RNA, multi-stranded natural oligonucleotides, multi-stranded artificial oligonucleotides or chemically modified oligonucleotides, or the like.

31. The method of claim 29, wherein the unique sequence comprises an oligonucleotide.

32. A library of tagged organic compounds comprising at least two tagged organic compounds, each tagged organic compound made by the method of any of claims 26-30, wherein a unique organic compound is tagged with a unique tag.

33. The library of claim 32, comprising at least 10 unique organic compounds, each labeled with a unique label.

34. The library of claim 32, comprising at least 50 unique organic compounds, each labeled with a unique label.

35. A library according to any of claims 32 to 34 wherein the unique label is an oligonucleotide having a unique sequence.

36. A method of identifying an organic compound that binds to a target, comprising analysing the library of any of claims 32-35 to identify an organic compound that binds to a target based on its label.

37. A method of identifying a compound having two conjugated double bonds, the method comprising

(1) reacting the first functional group of linker precursor molecule C according to any one of claims 1-10 with an organic compound A under carbene, nitrene, or radical reaction conditions to produce a product or product mixture, wherein the second functional group of the linker precursor molecule is unreacted; and

(2) analyzing the product or product mixture, wherein the presence of a pair of disproportionation products, comprising compound F having a molecular weight equal to that of product D of the carbene, nitrene, or radical reaction plus 2 and compound F' having a molecular weight equal to that of product D minus 2, indicates that organic compound A has two conjugated double bonds.

38. A method of treating cancer, stroke, myocardial infarction, neurodegenerative disease or inflammation in a subject in need thereof comprising administering to the subject a therapeutically effective amount of luteolin, naringin, hyperoside, glycitin, epicatechin, epigallocatechin, daphnetin, F001, F002, F003, or F006, or a derivative thereof.

39. The method of claim 38, wherein the subject has cancer cells that are mutated in BRCA1, BRCA2, PALB2, or PTEN.

40. The method of claim 39, wherein the mutation is an inactivating mutation.

The method of claim 38, wherein the subject has cancer cells that overexpress PARP protein compared to corresponding normal cells.

41. The method of claim 38, wherein the neurodegenerative disease is selected from the group consisting of Parkinson's disease, Alzheimer's disease, Huntington's disease, atrophic myelitis, AIDS dementia, and vascular dementia, or a combination thereof.

42. The method of claim 38, wherein the inflammation is associated with a disease or condition selected from the group consisting of: Parkinson's disease, arthritis, rheumatoid arthritis, multiple sclerosis, psoriasis, psoriatic arthritis, Crohn's disease, inflammatory bowel disease, ulcerative colitis, lupus, systemic lupus erythematosus, juvenile rheumatoid arthritis, juvenile idiopathic arthritis, Graves' disease, Hashimoto's thyroiditis, Addison's disease, celiac disease, dermatomyositis, multiple sclerosis, myasthenia gravis, pernicious anemia, Sjogren's syndrome, type I diabetes, vasculitis, uveitis, atherosclerosis, and ankylosing spondylitis.
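The mass-based check recited in claims 21 and 37 (a pair of products at the molecular weight of D plus 2 and minus 2) can be sketched in Python. This is only an illustration under stated assumptions, not the applicants' analysis procedure: the function names, the tolerance `tol`, and the example masses are hypothetical, and a real spectrum would also need isotope and adduct handling.

```python
from typing import List, Tuple


def find_disproportionation_pairs(
    observed_masses: List[float],
    expected_d_mass: float,
    tol: float = 0.05,
) -> List[Tuple[float, float]]:
    """Return (F, F') mass pairs found near D+2 and D-2.

    expected_d_mass is the calculated mass of product D; tol is a
    hypothetical matching tolerance in the same mass units.
    """
    plus = [m for m in observed_masses if abs(m - (expected_d_mass + 2)) <= tol]
    minus = [m for m in observed_masses if abs(m - (expected_d_mass - 2)) <= tol]
    return [(f, f_prime) for f in plus for f_prime in minus]


def suggests_conjugated_diene(
    observed_masses: List[float], expected_d_mass: float, tol: float = 0.05
) -> bool:
    """True when at least one D+2 / D-2 pair is present."""
    return bool(find_disproportionation_pairs(observed_masses, expected_d_mass, tol))


# Made-up peak list for illustration; D is assumed to appear at 500.0.
peaks = [498.01, 500.00, 502.02, 610.3]
print(suggests_conjugated_diene(peaks, 500.0))
```

Both members of the pair are required before the function reports a match, mirroring the claim language that the presence of the pair, rather than either product alone, is the indicator.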

Technical Field

The present disclosure relates generally to chemistry and biology, and more particularly to labeling and screening of natural products.

Background

Currently, target identification of natural products often requires extensive structure-activity relationship (SAR) work in order to introduce suitable tags, such as affinity tags and cross-linking tags. This work is typically a trial-and-error process. If the tag is located in an inappropriate position, it will sterically hinder the interaction between the compound and its target, resulting in a false negative in the biological screen. Target identification therefore benefits from labeling a compound at several different sites to reduce false negatives. However, multi-site labeling relies either on functional groups already present on the natural product or on total synthesis to introduce new functional groups. The structures of natural products are complex, and total synthesis is challenging for chemists. This approach is time consuming and inefficient.

As chemistry has evolved, more and more new natural products have been isolated. For high-throughput screening of chemical substances, several groups have proposed strategies using combinatorial chemistry and DNA-encoding techniques; see, for example, US5565324, EP0643778, US7935658, WO2010/094036 and CN 103882531.

These methods use combinatorial chemistry to build chemical libraries, using small chemical fragments to build large chemicals. However, these chemical reactions are limited by the stability of DNA, so many common reagents including strong bases, strong reducing agents, etc. are excluded. Thus, many complex natural products have not been included in combinatorial chemistry libraries.

Summary of The Invention

The present disclosure provides a new labeling strategy to cover a larger chemical space. Thus, provided herein are methods for the site-non-selective labeling of organic chemicals using oligonucleotides, compounds useful in said methods, and labeled compounds, libraries comprising said labeled compounds, and uses thereof.

In some embodiments, there is provided a method for labeling an organic compound, the method comprising:

(1) contacting a linker precursor molecule with an organic compound under site-nonselective reaction conditions, such as carbene or nitrene or radical reaction conditions, wherein the linker precursor molecule has two functional groups, a first functional group R and a second functional group M, wherein the first functional group R of the linker precursor molecule produces a site-nonselective reaction group, such as carbene or nitrene or radical, which reacts with the organic compound under site-nonselective reaction conditions, and the second functional group is unreactive under the conditions under which the first functional group reacts, wherein contacting the linker precursor molecule with the organic compound forms an intermediate having the second functional group M of the linker precursor molecule; and

(2) contacting the intermediate of (1) with a labeling molecule, thereby reacting the second functional group M of the linker precursor molecule with the labeling molecule to produce a labeled organic compound, wherein the labeling molecule comprises a label.
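The two steps above can be sketched as a small data model that tracks which functional group is consumed at each stage. It is a bookkeeping illustration only: the class and function names are hypothetical, the group names are placeholders rather than a tested pairing, and no chemistry is simulated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkerPrecursor:
    first_group: str   # R: generates the carbene, nitrene, or radical
    spacer: str        # L: linker moiety
    second_group: str  # M: inert in step (1), reactive in step (2)


@dataclass(frozen=True)
class Intermediate:
    compound: str
    spacer: str
    free_group: str    # M, carried through step (1) unchanged


@dataclass(frozen=True)
class LabeledCompound:
    compound: str
    spacer: str
    label: str


def step1_attach(precursor: LinkerPrecursor, compound: str) -> Intermediate:
    """Step (1): R is consumed; M survives on the intermediate."""
    return Intermediate(compound, precursor.spacer, precursor.second_group)


def step2_label(intermediate: Intermediate, label: str, targets_group: str) -> LabeledCompound:
    """Step (2): the labeling molecule reacts only at the free group M."""
    if targets_group != intermediate.free_group:
        raise ValueError("labeling molecule does not target the free group M")
    return LabeledCompound(intermediate.compound, intermediate.spacer, label)


# Hypothetical example values.
precursor = LinkerPrecursor("benzophenone", "alkylene", "alkyne")
intermediate = step1_attach(precursor, "oridonin")
product = step2_label(intermediate, "oligonucleotide tag", targets_group="alkyne")
print(product)
```

Keeping M as an explicit field on the intermediate makes the requirement visible that the second functional group must pass through step (1) untouched.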

In some embodiments, a linker precursor molecule is provided comprising a first functional group and a second functional group, wherein the first functional group is capable of generating a site-non-selective reactive group, such as a carbene, nitrene or radical, which is capable of reacting with an organic compound in a site-non-selective manner, and the second functional group is non-reactive under conditions in which the first functional group reacts with the organic compound, but is capable of reacting with a labeling molecule.

In some embodiments, there is provided a labeled organic compound produced by a method comprising the steps of:

(1) contacting a linker precursor molecule with an organic compound under site-nonselective reaction conditions, such as carbene or nitrene or radical reaction conditions, wherein the linker precursor molecule has two functional groups, a first functional group R and a second functional group M, wherein the first functional group R of the linker precursor molecule produces a site-nonselective reaction group, such as carbene or nitrene or radical, which reacts with the organic compound under site-nonselective reaction conditions, and the second functional group is unreactive under conditions in which the first functional group reacts with the organic compound, wherein the linker precursor molecule contacts the organic compound to form an intermediate having the second functional group M of the linker precursor molecule; and

(2) contacting the intermediate of (1) with a labeling molecule, thereby reacting the second functional group M of the linker precursor molecule with the labeling molecule to produce a labeled organic compound, wherein the labeling molecule comprises a label.

In some embodiments, a set of labeled organic compounds comprising positional isomers is provided, each isomer comprising a labeling moiety, a linker, and an organic compound moiety, wherein the labeling moiety is linked to the organic compound moiety through the linker and the isomers differ in the position on the organic compound moiety at which the linker is attached.

In some embodiments, a library of tagged organic compounds is provided comprising at least two tagged organic compounds (or two sets of isomeric tagged organic compounds), wherein each unique organic compound is tagged with a unique tag.

In some embodiments, a method of identifying an organic compound that binds to a target is provided, comprising analyzing the labeled organic compound, a set of isomeric labeled organic compounds, or a library of labeled organic compounds, and identifying the organic compound that binds to the target based on its label.
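The idea of reading a binder's identity from its label can be sketched as a lookup from tag to compound. The sketch below is illustrative only: the function names and the short tag sequences are hypothetical, and the compound names are borrowed from elsewhere in this document purely as examples.

```python
from typing import Dict, Iterable, List


def build_library(tag_to_compound: Dict[str, str]) -> Dict[str, str]:
    """Check that the library has at least two members and that no
    compound appears under more than one tag."""
    compounds = list(tag_to_compound.values())
    if len(tag_to_compound) < 2:
        raise ValueError("a library needs at least two tagged compounds")
    if len(set(compounds)) != len(compounds):
        raise ValueError("each compound must map to exactly one tag")
    return dict(tag_to_compound)


def identify_binders(library: Dict[str, str], recovered_tags: Iterable[str]) -> List[str]:
    """Map tags recovered after target selection back to compounds.

    Unknown tags are ignored; repeated tags are reported once.
    """
    return sorted({library[t] for t in recovered_tags if t in library})


# Hypothetical tag sequences.
library = build_library({
    "ACGTAC": "luteolin",
    "TTGACA": "oridonin",
    "GGCATT": "paclitaxel",
})
print(identify_binders(library, ["ACGTAC", "ACGTAC", "NNNNNN"]))
```

Because a set of positional isomers of one compound shares a single tag, one dictionary entry here can stand for the whole isomer set.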

These and other embodiments are further described below.

Drawings

FIG. 1 shows HPLC and mass spectra of the oridonin linker of example 5.

Figure 2 shows the mass spectrum of compound 8 in example 6.

FIG. 3 shows the mass spectrum of the labeled compound 9 in example 7.

FIG. 4 shows the mass spectrum of labeled Compound 8 in example 8.

FIGS. 5 and 6 show HPLC and mass spectra of the oridonin-linker II conjugate compound of example 9.

Figures 7 and 8 show HPLC and mass spectra of the tripterine-linker II conjugate compound of example 10.

Fig. 9 and 10 show HPLC and mass spectra of the paclitaxel-linker II conjugate compound in example 11.

Fig. 11 and 12 show HPLC and mass spectra of the triptophenolide-linker II conjugate compound of example 12.

Fig. 13 and 14 show HPLC and mass spectra of the maytansinol-linker II conjugate compound in example 13.

FIGS. 15 and 16 show HPLC and mass spectra of dehydroabietic acid II conjugate compound (A) and hydroabietic acid-plus-II conjugate compound (B) in example 14.

FIG. 17 shows DNA migration of four samples in agarose gel electrophoresis: 1. compound 6 conjugated to an oligonucleotide; 2. compound 7 conjugated to an oligonucleotide; 3. unreacted oligonucleotide; and 4. DNA head 4 with linker II, irradiated with UV light for 2 hours and then ligated with an oligonucleotide.

FIG. 18 shows 1H NMR spectra of i) the crude irradiated product of 2,4-dihydroxyacetophenone labeled with linker IV without evacuation; and ii) the crude product of 2,4-dihydroxyacetophenone labeled with linker IV after evacuation. Linker IV-2,4-dihydroxyacetophenone conjugates are identified with arrows.

Figure 19 shows a general scheme for quantification of natural product-DNA conjugation by quantitative polymerase chain reaction (qPCR) and sequencing.

FIG. 20(a) shows DEL selection fingerprints for heat shock 70 kDa protein (HSP70). FIG. 20(b) shows DEL selection fingerprints for poly(ADP-ribose) polymerase 1 (PARP1). In FIG. 20, the red dotted line is the cut-off for hit selection. The corresponding chemical structure is shown in FIG. 21.

FIG. 21 shows the chemical structure of a selected conjugate from the DEL selection of PARP1.

Figure 22 shows hit validation for nDEL selected PARP1 conjugates. Luteolin (FIG. 22(a)) and F003 (FIG. 22(b)) inhibit the enzymatic activity of human PARP1. The molecular docking of luteolin at the active site of human PARP1 is shown in FIG. 22(c).

It should be appreciated that some or all of the figures are schematic representations for purposes of illustration.

Detailed Description

Definitions

The following description sets forth exemplary embodiments of the present technology. It should be recognized, however, that such description is not intended as a limitation on the scope of the present disclosure, but is instead provided as a description of exemplary embodiments.

Unless explicitly stated otherwise, the following definitions apply:

As used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a product" includes a plurality of products, such as isomers.

As used herein, the term "comprising" or "comprises" is intended to mean that the compositions and methods include the recited elements, but do not exclude others. "Consisting essentially of", when used to define compositions and methods, means excluding other elements of any essential significance to the combination for the stated purpose. Thus, a composition consisting essentially of the elements defined herein does not exclude other materials or steps that do not materially affect the basic and novel characteristics of what is claimed. "Consisting of" means excluding more than trace elements of other ingredients and substantial method steps. Embodiments defined by each of these transition terms are within the scope of the present disclosure.

The term "about" when used in conjunction with a numerical designation, e.g., temperature, time, amount, and concentration (including ranges), indicates an approximation that may vary by (+) or (-) 10%, 5%, or 1%.
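As a worked illustration of this definition, the following Python sketch computes the interval covered by "about" a value at each of the three named tolerances. The function names are hypothetical, and treating the tolerance as a symmetric percentage of the stated value is an assumption about how the definition is applied.

```python
from typing import Tuple


def about_range(value: float, percent: float = 10.0) -> Tuple[float, float]:
    """Return the (low, high) interval for 'about value' at +/- percent."""
    if percent not in (10.0, 5.0, 1.0):
        raise ValueError("the definition names only 10%, 5%, or 1%")
    delta = abs(value) * percent / 100.0
    return (value - delta, value + delta)


def is_about(x: float, value: float, percent: float = 10.0) -> bool:
    """True when x falls inside the 'about value' interval."""
    low, high = about_range(value, percent)
    return low <= x <= high


# Example: is 2.1 hours "about 2 hours" at the 10% tolerance?
print(is_about(2.1, 2.0, 10.0))
print(about_range(100.0, 5.0))
```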

"alkyl" refers to a monovalent saturated straight or branched chain aliphatic hydrocarbon group. In some embodiments, the alkyl group has 1 to 30 carbon atoms (i.e., C)1-30Alkenyl), 1 to 20 carbon atoms (i.e., C)1-20Alkenyl), 1 to 8 carbon atoms (i.e., C)1-8Alkenyl), 1 to 6 carbon atoms (C)1-6Alkyl) or 1 to 4 carbon atoms (i.e., C)1-4Alkenyl). The term includes, for example, straight and branched chain hydrocarbon radicals such as methyl (CH)3-, ethyl (CH)3CH2-), n-propyl (CH)3CH2CH2-, isopropyl ((CH)3)2CH-), n-butyl (CH)3CH2CH2CH2-, isobutyl ((CH)3)2CHCH2-, sec-butyl ((CH)3)(CH3CH2) CH-) and tert-butyl ((CH)3)3C-). "alkylene" refers to a divalent straight or branched chain saturated aliphatic hydrocarbon group.

"alkenyl" refers to an alkyl group containing at least one carbon-carbon double bond. In some embodiments, alkenyl groups have 2 to 30 carbon atoms (i.e., C)2-30Alkenyl), 2 to 20 carbon atoms (i.e., C)2-20Alkenyl), 2 to 8 carbon atoms (i.e., C)2-8Alkenyl), 2 to 6 carbon atoms (i.e., C)2-6Alkenyl), or 2 to 4 carbon atoms (i.e., C)2-4Alkenyl). Examples of alkenyl groups include, but are not limited to, ethenyl, propenyl, butadienyl (including 1, 2-butadienyl and 1, 3-butadienyl). "alkenylene" refers to a divalent alkenyl group.

"alkynyl" refers to an alkyl group containing at least one carbon-carbon triple bond. In some embodiments, alkenyl groups have 2 to 30 carbon atoms (i.e., C)2-30Alkynyl), 2 to 20 carbon atoms (i.e., C)2-20Alkynyl), 2 to 8 carbon atoms(i.e., C)2-8Alkynyl), 2 to 6 carbon atoms (i.e., C)2-6Alkynyl), or 2 to 4 carbon atoms (i.e., C)2-4Alkynyl). The term "alkynyl" also includes those groups having one triple bond and one double bond. "alkynylene" refers to a divalent alkynyl group.

"alkoxy" means an "alkyl-O-" group. Examples of alkoxy groups include, but are not limited to, methoxy, ethoxy, n-propoxy, isopropoxy, n-butoxy, tert-butoxy, sec-butoxy, n-pentoxy, n-hexoxy, and 1, 2-dimethylbutoxy.

"haloalkoxy" refers to an alkoxy group, as defined above, wherein one or more hydrogen atoms are replaced by a halogen.

"alkylthio" refers to an "alkyl-S-" group.

"acyl" refers to the group-C (O) R, wherein R is hydrogen, alkyl, alkenyl, alkynyl, cycloalkyl, heterocycloalkyl, aryl, heteroalkyl, or heteroaryl; each may be optionally substituted, as defined herein. Examples of acyl groups include, but are not limited to, formyl, acetyl, cyclohexylcarbonyl, cyclohexylmethylcarbonyl, and benzoyl.

"amido" means-C (O) NRyRzOf a group of "C-acylamino" and-NRyC(O)RzThe "N-acylamino" group of (1), wherein RyAnd RzIndependently selected from hydrogen, alkyl, alkenyl, alkynyl, cycloalkyl, heterocycloalkyl, aryl, heteroalkyl, or heteroaryl; each of which may be optionally substituted.

"amino" means-NRyRzGroup, wherein RyAnd RzIndependently selected from hydrogen, alkyl, alkenyl, alkynyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl or heteroaryl; each of which may be optionally substituted.

"amidino" means-C (NH)2)。

"aryl" refers to a monovalent aromatic carbocyclic group having a single ring (e.g., monocyclic) or multiple rings (e.g., bicyclic or tricyclic) including fused systems. In some embodiments, aryl groups have from 6 to 20 ring carbon atoms (i.e., C)6-20Aryl), 6 to 14 carbon ring atoms (i.e.,C6-14aryl), 6 to 12 carbon ring atoms (i.e., C)6-12Aryl), or 6 to 10 carbon ring atoms (i.e., C)6-10Aryl). Examples of aryl groups include, but are not limited to, phenyl, naphthyl, fluorenyl, and anthracenyl. However, aryl does not in any way comprise or overlap with heteroaryl as defined below. If one or more aryl groups are fused to a heteroaryl group, the resulting ring system is a heteroaryl group. If one or more aryl groups are fused to a heterocycloalkyl group, the resulting ring system is a heterocycloalkyl group. "arylene" refers to a divalent aromatic radical.

"O-carbamoyl" refers to the group-O-C (O) NRyRz"N-carbamoyl" refers to the group-NRyC(O)ORzWherein R isyAnd RzIndependently selected from hydrogen, alkyl, alkenyl, alkynyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl or heteroaryl; each of which may be optionally substituted.

"carboxy" refers to-C (O) OH.

"carboxylate" refers to-OC (O) R and-C (O) OR, where R is hydrogen, alkyl, alkenyl, alkynyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, OR heteroaryl; each may be optionally substituted, as defined herein.

"cyano" or "nitrile" refers to the group-CN.

"cycloalkyl" refers to a monovalent saturated or partially unsaturated nonaromatic cyclic alkyl group having a single ring or multiple rings, including fused, bridged, and spiro ring systems. The term "cycloalkyl" includes cycloalkenyl (i.e., a non-aromatic carbocyclic group having at least one double bond). In some embodiments, cycloalkyl groups have 3 to 20 ring carbon atoms (i.e., C)3-20Cycloalkyl), 3 to 12 ring carbon atoms (i.e., C)3-12Cycloalkyl), 3 to 10 ring carbon atoms (i.e., C)3-10Cycloalkyl), 3 to 8 ring carbon atoms (i.e., C)3-8Cycloalkyl), or 3 to 6 ring carbon atoms (i.e., C)3-6Cycloalkyl groups). Examples of cycloalkyl groups include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, and cyclohexyl. "cycloalkylene" refers to a divalent saturated or partially unsaturated cyclic alkyl group.

"guanidino" refers to-NHC (NH)2)。

"hydrazino" refers to-NHNH2

"imino" means a-C (NR) R group, wherein each R is alkyl, alkenyl, alkynyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, or heteroaryl; each may be optionally substituted, as defined herein.

"halogen" or "halogen" includes fluorine, chlorine, bromine and iodine.

"haloalkyl" refers to an alkyl group as defined above wherein one or more hydrogen atoms are replaced with a halogen. For example, when a residue is substituted with more than one halogen, it can be referred to using a prefix corresponding to the number of halogen groups attached. Dihaloalkyl and trihaloalkyl refer to alkyl substituted with two ("di") or three ("tri") halo groups, which may be, but are not necessarily, the same halo. Examples of haloalkyl groups include difluoromethyl (-CHF)2) And trifluoromethyl (-CF)3)。

"haloalkenyl" refers to an alkenyl group as defined above wherein one or more hydrogen atoms are replaced by halogen.

"haloalkynyl" refers to an alkynyl group as defined above in which one or more hydrogen atoms are replaced by halogen.

"heteroalkyl" refers to an alkyl in which one or more carbon atoms (and any associated hydrogen atoms) are each independently substituted with the same or different heteroatom groups. The term "heteroalkyl" includes straight or branched saturated chains with carbon and heteroatoms. For example, 1,2 or 3 carbon atoms may be independently substituted with the same or different heteroatom groups. Heteroatom groups include, but are not limited to, -NR-, -O-, -S-, -S (O) -, -S (O)2-and the like, wherein R is hydrogen, alkyl, alkenyl, alkynyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, or heteroaryl, each of which may be optionally substituted. Examples of heteroalkyl groups include-OCH3、-CH2OCH3、-SCH3、-CH2SCH3、-NHCH3and-CH2NRCH3. In some embodiments, heteroalkyl groups include 1 to 10 carbon atoms, 1 to 8 carbon atoms, or 1 to 4 carbon atoms; and 1 to 3 hetero atomsA molecule, 1 to 2 heteroatoms or 1 heteroatom. "Heteroalkylene" refers to a divalent heteroalkyl group.

"heteroaryl" refers to an aromatic group having a single ring, multiple rings, or multiple fused rings, wherein one or more ring heteroatoms are independently selected from nitrogen, oxygen, and sulfur. In some embodiments, heteroaryl includes 5 to 24 ring atoms (i.e., 5 to 24 membered heteroaryl), or 5 to 14 ring atoms (i.e., 5 to 14 membered heteroaryl), 5 to 10 ring atoms (i.e., 5 to 10 membered heteroaryl), or 5 or 6 ring atoms (i.e., 5 or 6 membered heteroaryl). In some embodiments, heteroaryl includes 1 to 20 ring carbon atoms (i.e., C)1-20Heteroaryl), 5 to 14 ring carbon atoms (i.e., C)5-14Heteroaryl), 3 to 12 ring carbon atoms (i.e., C)3-12Heteroaryl), or 3 to 8 carbon ring atoms (i.e., C)3-8Heteroaryl); and 1 to 5 heteroatoms, 1 to 4 heteroatoms, 1 to 3 ring heteroatoms, 1 to 2 ring heteroatoms, or 1 ring heteroatom independently selected from nitrogen, oxygen, and sulfur. Examples of heteroaryl groups include, but are not limited to, pyrimidinyl, purinyl, pyridyl, pyridazinyl, benzothiazolyl, pyrazolyl, benzo [ d ] d]Thiazolyl, quinolinyl, isoquinolinyl, benzo [ b ]]Thienyl, indazolyl, benzo [ d]Imidazolyl, pyrazolo [1,5-a ]]Pyridyl and imidazo [1,5-a ]]A pyridyl group. Any aromatic ring having single or multiple fused rings and containing at least one heteroatom is considered heteroaryl, regardless of the attachment to the rest of the molecule (i.e., through any fused rings). Heteroaryl does not comprise or overlap with aryl as defined above. "heteroaryl" refers to a divalent heteroaryl.

"Heterocycloalkyl" means a saturated or unsaturated, non-aromatic cyclic alkyl group having one or more ring heteroatoms independently selected from nitrogen, oxygen, and sulfur. The term "heterocycloalkyl" includes heterocycloalkenyl (i.e., heterocycloalkyl having at least one double bond), bridged heterocycloalkyl, fused heterocycloalkyl, and spiroheterocycloalkyl. Heterocycloalkyl groups can be monocyclic or polycyclic, wherein multiple rings can be fused, bridged, or spiro rings. Any non-aromatic ring containing at least one heteroatom is considered to be a heterocycloalkyl group regardless of the manner of attachment (i.e., may be bound by a carbon atom or a heteroatom). Furthermore, the term heterocycloalkyl is intended to include a group containing at least oneAny non-aromatic ring of a heteroatom that may be fused to an aryl or heteroaryl ring, regardless of the remainder of the molecule. In some embodiments, heterocycloalkyl includes 3 to 24 ring atoms (i.e., 3 to 24 membered heterocycloalkyl), or 3 to 14 ring atoms (i.e., 3 to 14 membered heterocycloalkyl), 3 to 10 ring atoms (i.e., 3 to 10 membered heterocycloalkyl), or 5 or 6 ring atoms (i.e., 5 or 6 membered heterocycloalkyl). In some embodiments, heterocycloalkyl has 2 to 20 ring carbon atoms (i.e., C)2-20Heterocycloalkyl), 2 to 12 ring carbon atoms (i.e., C)2-12Heterocycloalkyl group), 2 to 10 ring carbon atoms (i.e., C)2-10Heterocycloalkyl), 2 to 8 ring carbon atoms (i.e., C)2-8Heterocycloalkyl), 3 to 12 ring carbon atoms (i.e., C)3-12Heterocycloalkyl), 3 to 8 ring carbon atoms (i.e., C)3-8Heterocycloalkyl), or 3 to 6 ring carbon atoms (i.e., C)3-6Heterocycloalkyl); having 1 to 5 ring heteroatoms, 1 to 4 ring heteroatoms, 1 to 3 ring heteroatoms, 1 to 2 ring heteroatoms, or 1 ring heteroatom independently selected from nitrogen, sulfur, or oxygen. 
Examples of heterocycloalkyl include, but are not limited to, pyrrolidinyl, piperidinyl, piperazinyl, oxetanyl, dioxolanyl, azetidinyl, morpholinyl, 2-oxa-7-azaspiro [3.5 ]]Nonyl, 2-oxa-6-azaspiro [3.4 ]]Octyl, 6-oxa-1-azaspiro [3.3]Heptyl, 1,2,3, 4-tetrahydroisoquinolinyl, 4,5,6, 7-tetrahydrothieno [2,3-c ]]Pyridyl, indolyl and isoindolinyl. "Heterocycloalkylene" refers to a divalent heterocycloalkyl radical.

"hydroxy" means an -OH group.

"oxo" refers to a (=O) or (O) group.

"nitro" means an -NO2 group.

"Sulfonyl" means an -S(O)2R group, wherein R is alkyl, alkenyl, alkynyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, or heteroaryl. Examples of sulfonyl groups include, but are not limited to, methylsulfonyl, ethylsulfonyl, phenylsulfonyl, and tosyl.

"alkylsulfonyl" means an -S(O)2R group, wherein R is alkyl.

"sulfonic acid" means an -SO3H group.

"Alkylsulfinyl" refers to an -S(O)R group, wherein R is alkyl.

"thiocyanate" means the-SCN group.

"thiol" refers to the-SH group.

"thioxo" or "thione" refers to a (=S) or (S) group.

Certain commonly used alternative chemical names may be used. For example, divalent groups such as a divalent "alkyl" group, a divalent "aryl" group, and the like may also be referred to as an "alkylene" group or an "arylene" group, respectively. Likewise, unless expressly stated otherwise, where a combination of groups is referred to herein as one moiety, e.g., aralkyl, the last-mentioned group contains the atom through which the moiety is attached to the rest of the molecule.

The terms "optional" or "optionally" mean that the subsequently described event or circumstance may or may not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not. Furthermore, the term "optionally substituted" means that any one or more hydrogen atoms on the designated atom or group may or may not be substituted with a group other than hydrogen.

The term "substituted" means that any one or more hydrogen atoms on the designated atom or group is replaced with one or more substituents other than hydrogen, provided that the designated atom's normal valence is not exceeded. The one or more substituents include, but are not limited to, alkyl, alkenyl, alkynyl, alkoxy, acyl, amino, amido, amidino, aryl, -N3, O-carbamoyl, N-carbamoyl, carboxy, carboxylate, cyano, guanidino, halogen, haloalkyl, haloalkoxy, heteroalkyl, heteroaryl, heterocycloalkyl, hydroxy, hydrazino, imino, oxo, nitro, alkylsulfinyl, sulfonic acid, alkylsulfonyl, thiocyanate, thiol, thione, or a combination thereof. Polymers or similar indefinite structures arrived at by defining substituents with further substituents appended ad infinitum (e.g., a substituted aryl having a substituted alkyl which is itself substituted with a substituted aryl group, which is further substituted with a substituted heteroalkyl group, etc.) are not intended to be included herein. Unless otherwise indicated, the maximum number of consecutive substitutions in the compounds described herein is three. For example, serial substitution of a substituted aryl group with two other substituted aryl groups is limited to ((substituted aryl)substituted aryl)substituted aryl. Similarly, the above definitions are not intended to include impermissible substitution patterns (e.g., methyl substituted with 5 fluorines or heteroaryl having two adjacent oxygen ring atoms). Such impermissible substitution patterns are well known to the skilled worker. When used to modify a chemical group, the term "substituted" may describe other chemical groups as defined herein. In some embodiments, when a group is described as optionally substituted, any substituent of the group is itself unsubstituted. For example, in some embodiments, the term "substituted alkyl" refers to an alkyl group having one or more substituents including hydroxy, halo, alkoxy, cycloalkyl, heterocyclyl, aryl, and heteroaryl.
In other embodiments, one or more substituents may be further substituted with halogen, alkyl, haloalkyl, hydroxy, alkoxy, cycloalkyl, heterocyclyl, aryl, or heteroaryl, each of which is substituted. In other embodiments, the substituents may be further substituted with halogen, alkyl, haloalkyl, alkoxy, hydroxy, cycloalkyl, heterocyclyl, aryl, or heteroaryl, each unsubstituted.

As used herein, the term "solvent" refers to a liquid that dissolves a solid, liquid, or gaseous solute to form a solution. Common solvents are well known in the art and include, but are not limited to, water; saturated aliphatic hydrocarbons such as pentane, hexane, heptane and other light petroleum fractions; aromatic hydrocarbons such as benzene, toluene, xylene, etc.; halogenated hydrocarbons such as dichloromethane, chloroform, carbon tetrachloride, etc.; aliphatic alcohols such as methanol, ethanol, propanol, etc.; ethers such as diethyl ether, dipropyl ether, dibutyl ether, tetrahydrofuran, dioxane and the like; ketones such as acetone, ethyl methyl ketone, and the like; esters such as methyl acetate, ethyl acetate, and the like; nitrogen-containing solvents such as dimethylacetamide, formamide, N,N-dimethylformamide, acetonitrile, pyridine, N-methylpyrrolidone, quinoline, nitrobenzene, and the like; sulfur-containing solvents such as carbon disulfide, dimethyl sulfoxide, sulfolane, and the like; and phosphorus-containing solvents such as hexamethylphosphoric triamide and the like. The term solvent includes combinations of two or more solvents unless otherwise specifically indicated. The particular choice of a suitable solvent will depend on a number of factors, including the nature of the solvent and of the solute to be dissolved and the intended purpose, e.g., what chemical reactions will take place in the solution, as is well known in the art.

As used herein, the term "contacting" refers to bringing two or more chemical molecules into proximity such that a chemical reaction can occur between the two or more chemical molecules. For example, contacting may include mixing and optionally continuously mixing the chemicals. Contacting may be accomplished by dissolving or suspending two or more chemical species, either completely or partially, in one or more solvents, mixing the chemical species in a solvent with another chemical species in a solid and/or gas phase, or attached to a solid support such as a resin, or mixing two or more chemical species in a gas or solid phase and/or solid support, as is commonly known to those of skill in the art.

As used herein, "organic compound" refers to a compound to be labeled, which can be screened for binding activity to one or more targets. The terms "drug", "Chinese medicine" and "natural product (NP)" each refer to an organic compound to be labeled as defined herein.

As used herein, a "linker precursor molecule" refers to a molecule having two functional groups, each of which reacts with a molecule to be attached, such that after reaction, the residue of the linker precursor molecule becomes the linker of the product.

As used herein, "labeling molecule" refers to a molecule, such as an oligonucleotide, having a label that is detectable by an analytical method.

As used herein, "site non-selective" in the terms "site non-selective reaction," "site non-selective labeling," and the like, refers to a reaction, labeling, or the like that may occur under a number of conditions and does not necessarily rely on a functional group of the molecule. Examples of "site-nonselective reactions" include carbene, nitrene and radical reactions, wherein a compound X-Y (where Y is a site-nonselective reactive group, such as a carbene, nitrene or radical) reacts with different sites of a compound Z to form a compound X-Z. Such a site-nonselective reaction typically yields several positional isomers X-Z, in which X is attached to different sites of Z.

As used herein, "isomeric compound" or "isomeric product" and the like refer to an isomer resulting from a site-nonselective reaction, e.g., reacting a linker precursor molecule having a site-nonselective reactive group with an organic compound or a corresponding labeled isomeric product, where appropriate. By "set of isomeric products" is meant a mixture of isomeric compounds produced from a unique organic compound.

As used herein, "disproportionation reaction product" refers to two products resulting from a disproportionation reaction, sometimes referred to as disproportionation, which is a redox reaction in which a compound in an intermediate oxidation state is converted to two different compounds, one in a higher oxidation state and one in a lower oxidation state. The disproportionation reaction can be described as:

2P→P'+P"

wherein P, P' and P" are each different chemical species, and P' and P" are the products of the disproportionation reaction.

Unless explicitly indicated to the contrary, all atoms specified in the formulae described herein, whether in the structure provided or in the definition of the variable associated with the structure, are intended to include any isotope thereof. It is understood that for any given atom, the isotopes may be present essentially in the ratios in which they occur naturally, or one or more particular atoms may be enriched in one or more isotopes using synthetic methods known to those skilled in the art. Thus, hydrogen includes, for example, ¹H, ²H, and ³H; carbon includes, for example, ¹¹C, ¹²C, ¹³C, and ¹⁴C; oxygen includes, for example, ¹⁶O, ¹⁷O, and ¹⁸O; nitrogen includes, for example, ¹³N, ¹⁴N, and ¹⁵N; sulfur includes, for example, ³²S, ³³S, ³⁴S, ³⁵S, ³⁶S, ³⁷S, and ³⁸S; fluorine includes, for example, ¹⁷F, ¹⁸F, and ¹⁹F; chlorine includes, for example, ³⁵Cl, ³⁶Cl, ³⁷Cl, ³⁸Cl, and ³⁹Cl; and the like.

The compounds described herein include any tautomeric form, although the structural formula provided herein for a given compound may show only one of its tautomeric forms.

As used herein, the term "salt" refers to acid addition salts and base addition salts. Examples of acid addition salts include those containing sulfate, chloride, hydrochloride, fumarate, maleate, phosphate, sulfamate, acetate, citrate, lactate, tartrate, methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate, cyclohexylsulfamate and quinate. Salts may be obtained from acids such as hydrochloric acid, maleic acid, sulfuric acid, phosphoric acid, sulfamic acid, acetic acid, citric acid, lactic acid, tartaric acid, malonic acid, methanesulfonic acid, ethanesulfonic acid, benzenesulfonic acid, p-toluenesulfonic acid, cyclohexylsulfamic acid, fumaric acid and quinic acid. When acidic functional groups such as carboxylic acids or phenols are present, base addition salts include those containing benzathine, chloroprocaine, choline, diethanolamine, ethanolamine, tert-butylamine, ethylenediamine, meglumine, procaine, aluminum, calcium, lithium, magnesium, potassium, sodium, ammonium, alkylamines, and zinc. See, for example, Remington's Pharmaceutical Sciences, 19th ed., Mack Publishing Co., Easton, PA, Vol. 2, p. 1457, 1995. Salts include pharmaceutically acceptable salts, i.e., salts that do not have properties that would cause a reasonably prudent physician to avoid administering them, taking into consideration the disease or condition to be treated and the route of administration of the respective substance. The compounds, intermediates or products described herein include salts thereof.

Further, the abbreviations used herein have the following respective meanings:

list of abbreviations

Compounds

Provided herein are bifunctional linkers comprising two functional groups. One of the functional groups is a chemical group capable of generating a highly reactive species such as a carbene, nitrene or radical, which can be used to label compounds by exploiting the high reactivity of the carbene, nitrene or radical to generate a collection of isomeric compounds with spatial structural diversity. The other functional group of the linker is used to attach a labeling molecule, such as a specific oligonucleotide, so that the resulting set of isomers is encoded by the oligonucleotide sequence. Finally, various compounds can be labeled with such bifunctional linkers and labeling molecules to generate labeled products, which can be mixed together to construct a library of compounds. In the present disclosure, the conditions of carbene, nitrene or radical reactions compatible with oligonucleotide labeling, e.g., solvent, reaction sequence, catalyst composition, etc., are optimized. The method avoids complex structure-activity relationship experiments in compound function research, and has the advantages of simple and rapid operation and high screening efficiency.

Provided herein are methods for the site-non-selective labeling of organic compounds using oligonucleotides, compounds useful in the methods, labeled compounds, libraries comprising the labeled compounds, and uses thereof.

Labeling method

The methods described herein utilize site-non-selective reactions, such as carbene, nitrene or radical reactions, and can label an organic compound at different sites simultaneously, regardless of the functional groups of the compound, thus minimizing false negative results. These methods can also be used to label organic compounds that do not contain functional groups. In addition, the methods do not use reagents such as strong bases or strong reducing agents, to which oligonucleotide labels are unstable.

In some embodiments, the method may be described as shown in scheme 1:

scheme 1

In some embodiments, there is provided a method of labeling an organic compound, the method comprising:

(1) contacting a linker precursor molecule C with an organic compound a under site-nonselective reaction conditions, such as carbene or nitrene or radical reaction conditions, wherein the linker precursor molecule C has two functional groups, a first functional group R and a second functional group M, wherein the first functional group R of the linker precursor molecule C produces a site-nonselective reaction group, such as carbene or nitrene or radical, which reacts with the organic compound a under site-nonselective reaction conditions, and the second functional group is unreactive under conditions in which the first functional group reacts with the organic compound a, wherein the linker precursor molecule C contacts the organic compound a to form an intermediate D having the second functional group M of the linker precursor molecule. As used herein, "intermediate D" may refer to all isomers, or one or some isomers, when linker precursor molecule C is reacted with organic compound a under site-nonselective reaction conditions to produce two or more isomers.

In some embodiments, the method further comprises:

(2) contacting the intermediate D with a labelling molecule B, thereby reacting the second functional group M of the linker precursor molecule C with the labelling molecule B to form the labelled organic compound E, wherein the labelling molecule B comprises a label. When two or more isomers are produced by this method, "labeled organic compound E" as used herein may refer to all isomers, or one or some of the isomers.

In some embodiments, the site-nonselective reaction conditions are carbene reaction conditions under which the first functional group R of the linker precursor molecule C generates a carbene. In some embodiments, the site-nonselective reaction condition is a nitrene reaction condition under which the first functional group R of the linker precursor molecule C produces a nitrene. In some embodiments, the site-nonselective reaction conditions are free radical reaction conditions under which the first functional group R of the linker precursor molecule C generates a free radical.

The methods provided herein do not rely on any functional groups on the organic compound to label the organic compound. Thus, in some embodiments, the first functional group R of linker precursor molecule C may react with multiple sites of organic compound a to form a mixture of multiple isomeric products. In some embodiments, organic compound a does not comprise reactive functionality.

In some embodiments, intermediate D is a mixture of isomers.

In some embodiments, the labeled organic compound E is a mixture of isomers.

In some embodiments, the label comprises a unique sequence for chemical labeling, a fluorescent label (e.g., a fluorophore or GFP), a biotin label, or a combination thereof. In some embodiments, the unique sequence comprises a single-stranded DNA or RNA, a double-stranded DNA or RNA, a multi-stranded DNA or RNA, or other chemically modified oligonucleotide. In some embodiments, the label comprises an oligonucleotide.

In some embodiments, the unique sequence comprises a chemically modified oligonucleotide, including a natural nucleotide or any other artificial nucleotide or a mixture of both. In some embodiments, the chemically modified oligonucleotide is modified by an amino group on the oligonucleotide. In some embodiments, the chemically modified oligonucleotide comprises an ethynyl group (-C≡CH). In some embodiments, the ethynyl group is linked to the oligonucleotide through a linker. In some embodiments, the linker comprises -CH2-(OCH2CH2)n-O-CH2-, where n is an integer of 0 to 10 or 1 to 5.

In some embodiments, the site-nonselective reaction conditions, such as carbene or nitrene or radical reaction conditions, comprise dissolving an organic compound A and a linker precursor molecule C in a solvent to form a reaction mixture. In some embodiments, the free radical reaction conditions include irradiating the reaction mixture with ultraviolet light (e.g., 365 nm) for a period of time, such as 1-5 hours or about 3 hours or more. In some embodiments, the site-nonselective reaction conditions, such as carbene or nitrene or radical reaction conditions, comprise a radical initiator, such as a peroxide (a compound having a peroxide bond (-O-O-), e.g., di-t-butyl peroxide, benzoyl peroxide, methyl ethyl ketone peroxide, and peroxydisulfate), or a transition metal complex. Other free radical initiators are well known in the art, see, e.g., Denisov et al., Handbook of Free Radical Initiators, Wiley-Interscience, 1st edition (April 4, 2003), the entire contents of which are incorporated herein by reference.

In some embodiments, the method further comprises isolating intermediate D in step (1) and/or labeled organic compound E in step (2) above. Isolation refers to the separation of the desired reaction product from other materials in the reaction mixture, such as solvents, reagents, by-products, and the like. If the process produces multiple isomeric products and/or disproportionation reaction products, these products may be isolated as a mixture or as individual compounds.

Linker precursor molecules

The methods described herein use linker precursor molecules to link organic compounds to labeling molecules, the linker of the labeled product being derived from the linker precursor molecule. The linker precursor molecule has a functional group capable of generating a carbene, nitrene or free radical, which is capable of reacting with the organic compound independently of any functional groups of the organic compound.

In some embodiments, a linker precursor molecule C is provided comprising a first functional group capable of generating a site-non-selective reactive group capable of reacting with organic compound a, such as a carbene or nitrene or a radical, and a second functional group that is non-reactive with organic compound a under conditions in which the first functional group reacts with organic compound a, but is capable of reacting with labeling molecule B.

In some embodiments, linker precursor molecule C has the formula R-L-M, wherein R is a first functional group, M is a second functional group, and L is a linker moiety comprising one or more moieties independently selected from optionally substituted alkylene, alkenylene, alkynylene, heteroalkylene, cycloalkylene, heterocycloalkylene, arylene, and heteroarylene.

In some embodiments, L is alkylene, wherein one or more methylene units of the alkylene are optionally replaced with a moiety independently selected from NR1, O, S, SO, SO2, CO, NR1C(O), C(O)NR1, cycloalkylene, heterocycloalkylene, arylene and heteroarylene, wherein each R1 is independently hydrogen or alkyl.

In some embodiments, L is alkylene, wherein one or more methylene units of the alkylene are replaced with a moiety independently selected from the group consisting of phenylene, O, NHC(O), and C(O)NH.

In some embodiments, L comprises phenylene.

In some embodiments, the first functional group or R of the linker precursor molecule C comprises a group capable of generating a carbene or nitrene or a radical (e.g., a diazonium, enone, isocyanate, or azide group). In some embodiments, the first functional group or R of the linker precursor molecule C comprises diazomethane, aryl azide, benzophenone, or diazo.

In some embodiments, the first functional group or R comprises an arylene group and is connected to the remainder of linker precursor molecule C through an arylene group, such as phenylene.

In some embodiments, the first functional group of the linker precursor molecule is selected from

WhereinIndicates the attachment site to the rest of linker precursor molecule C.

In some embodiments, the second functional group or M of the linker precursor molecule C comprises an azide (e.g., an alkyl azide), an alkyne, an alkene, or-C (O) -O-.

In some embodiments, the second functional group or M comprises an alkylene group and is connected to the remainder of linker precursor molecule C through the alkylene group.

In some embodiments, the second functional group or M of the linker precursor molecule C is selected from

WhereinIndicates the attachment site to the rest of linker precursor molecule C.

In some embodiments, the linker precursor molecule C is selected from

In some embodiments, the linker precursor molecule C is selected from

In some embodiments, the linker precursor molecule C is selected from

In some embodiments, the linker precursor molecule C is selected from

In some embodiments, the linker precursor molecule C is

Labelled organic compounds

In some embodiments, there is provided a labeled organic compound E made by a process comprising the steps of:

(1) contacting a linker precursor molecule C with an organic compound a under site-nonselective reaction conditions, such as carbene or nitrene or radical reaction conditions, wherein the linker precursor molecule C has two functional groups, a first functional group R and a second functional group M, wherein the first functional group R of the linker precursor molecule C produces a site-nonselective reactive group, such as carbene or nitrene or radical, that reacts with the organic compound a, under site-nonselective reaction conditions, the second functional group is unreactive under conditions in which the first functional group reacts with the organic compound a, and wherein the linking of the linker precursor molecule C with the organic compound a produces an intermediate D having the second functional group M of the linker precursor molecule C; and

(2) contacting said intermediate D with a labeling molecule B, whereby the second functional group M of said linker precursor molecule C reacts with the labeling molecule B to generate a labeled organic compound E, wherein the labeling molecule B comprises a label.

In some embodiments, the organic compound a does not comprise a reactive functionality.

In some embodiments, the intermediate D is a mixture of isomers.

In some embodiments, the labeled organic compound E is a mixture of isomers.

In some embodiments, the label comprises a unique sequence for chemical labeling or a fluorescent tag (e.g., a fluorophore or GFP) or a biotin label, or a combination thereof.

In some embodiments, the unique sequence comprises a single-stranded DNA or RNA, a double-stranded DNA or RNA, a chemically modified oligonucleotide, a multi-stranded natural or artificial oligonucleotide, or an artificial oligonucleotide.

In some embodiments, the unique sequence comprises an oligonucleotide.

In some embodiments, a set of isomerically labeled organic compounds is provided that includes a label moiety and an organic compound moiety, wherein the label moiety is attached to different sites of the organic compound moiety through a linker.

In some embodiments, a mixture is provided comprising (1) a plurality of isomeric compounds, wherein said isomeric compounds are produced by a site-nonselective reaction of an organic compound A with a carbene, nitrene or radical generated from said linker precursor molecule C, and/or (2) one or more pairs of compounds which are disproportionation products of an isomeric compound produced by the radical reaction of (1).

In some embodiments, the pair of disproportionation products includes compound F, having a molecular weight equal to that of compound D made by the carbene, nitrene or radical reaction plus 2, and compound F', having a molecular weight equal to that of compound D made by the carbene, nitrene or radical reaction minus 2.

In some embodiments, there is provided a mixture of a plurality of isomeric labeled compounds, made by a process comprising the steps of:

(1) reacting the first functional group of linker precursor molecule C having the first and second functional groups described herein with organic compound a under site-nonselective reaction conditions, such as carbene or nitrene or free radical reaction conditions, to produce intermediate D having the second functional group of linker precursor molecule C; and

(2) reacting the second functional group with a labeling molecule B comprising a label.

In some embodiments, the label comprises a unique sequence for chemical labeling or a fluorescent tag (e.g., a fluorophore or GFP) or a biotin label, or a combination thereof.

In some embodiments, the unique sequence includes single-stranded DNA or RNA, double-stranded DNA or RNA, other chemically modified oligonucleotides, or other artificial oligonucleotides, such as multi-stranded natural or artificial oligonucleotides. In some embodiments, the unique sequence comprises an oligonucleotide.

In some embodiments, the first functional group of the linker precursor molecule C reacts with the plurality of sites of the organic compound a to produce a mixture of a plurality of isomeric intermediates D, which react with the labeling molecule B to form a plurality of isomeric labeled organic compounds E. In some embodiments, a library of labeled organic compounds is provided that includes at least two labeled organic compounds (or two sets of isomerically labeled organic compounds), each made by the methods described herein, wherein a unique organic compound is labeled with a unique label.

In some embodiments, the library comprises at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, or 500 sets of uniquely labeled organic compounds (or 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, or 500 sets of isomerically labeled organic compounds), each labeled with a unique label.

In some embodiments, the unique tag is an oligonucleotide having a unique sequence.
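By way of illustration only, the one-to-one pairing of organic compounds with unique oligonucleotide tags in such a library can be sketched as a simple lookup table. The function name `build_library`, the example compound names, and the short tag sequences below are hypothetical and are not taken from the present disclosure.

```python
from typing import Dict, Sequence


def build_library(compounds: Sequence[str], tags: Sequence[str]) -> Dict[str, str]:
    """Map each unique tag sequence to the one compound it encodes.

    Hypothetical helper: every compound (i.e., every set of isomeric
    labeled products) receives its own tag, and no tag is reused.
    """
    if len(compounds) != len(tags):
        raise ValueError("need exactly one tag per compound")
    if len(set(tags)) != len(tags):
        raise ValueError("each tag sequence must be unique")
    if len(set(compounds)) != len(compounds):
        raise ValueError("each compound must appear only once")
    return dict(zip(tags, compounds))


# Illustrative (made-up) tag sequences
library = build_library(["luteolin", "naringin"], ["ACGTAC", "TGCATG"])
```

Because the table is keyed by tag sequence, a tag recovered after screening identifies exactly one compound (or one set of isomers).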

Screening method

In some embodiments, a method of identifying an organic compound that binds to a target molecule is provided, comprising assaying a labeled organic compound, a set of isomeric labeled organic compounds, or a library of labeled organic compounds described herein, and identifying an organic compound that binds to the target based on its label. Such assays are well known in the art; see, for example, chapters 13 and 16 of A Handbook for DNA-Encoded Chemistry, Goodnow Jr., Wiley, 1st edition (2014), and Decurtins, W. et al., Automated screening for small organic ligands using DNA-encoded chemical libraries, Nat. Protoc. 2016; 11(4): 764-80, the entire contents of which are incorporated herein by reference.

In some embodiments, organic compounds are screened by affinity between a labeled organic molecule described herein and a target molecule to be tested. Targets to be tested include protein molecules, cells, organelles (nucleus, mitochondria, golgi apparatus, peroxisomes, lysosomes, exosomes, etc.), cytoskeleton (microtubules, intermediate filaments, microfilaments), DNA, RNA, sugars, phospholipid molecules, phospholipid protein complexes or complexes of the above, etc. Such screening methods are generally known in the art and may vary depending on the particular application, e.g., the particular target to be tested.

In some embodiments, labeled organic compounds having binding affinity to the target molecule are selected for identification. In some embodiments, the labels of the selected labeled organic compounds are sequenced (e.g., by Sanger sequencing, second-generation sequencing, etc.) to decode the encoding oligonucleotide sequences and thereby identify the organic compounds that bind to the target.
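The decoding step can be illustrated with a minimal sketch that counts sequencing reads per tag and reports the compounds whose tags were recovered. The function `decode_hits`, the exact-match rule, the `min_reads` threshold, and the example sequences are all assumptions made for illustration; practical pipelines typically also handle sequencing errors and normalization.

```python
from collections import Counter
from typing import Dict, Iterable


def decode_hits(reads: Iterable[str],
                tag_to_compound: Dict[str, str],
                min_reads: int = 2) -> Dict[str, int]:
    """Count reads that exactly match a known tag and return, per compound,
    the read count for every tag seen at least `min_reads` times."""
    counts = Counter(read for read in reads if read in tag_to_compound)
    return {tag_to_compound[tag]: n
            for tag, n in counts.items() if n >= min_reads}


# Made-up tags and reads, for illustration only
tags = {"ACGTAC": "luteolin", "TGCATG": "naringin"}
reads = ["ACGTAC", "ACGTAC", "TGCATG", "GGGGGG", "ACGTAC"]
hits = decode_hits(reads, tags)
```

Here `hits` contains only the compound whose tag was read at least twice; reads that match no known tag are ignored.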

In some embodiments, there is provided a method for identifying a compound having two conjugated double bonds, the method comprising

(1) Reacting a first functional group of a linker precursor molecule C as described herein with an organic compound a under site-nonselective reaction conditions, such as carbene or nitrene or free radical reaction conditions, to produce a product or product mixture D, wherein a second functional group of the linker precursor molecule C is unreacted; and

(2) analyzing the product or product mixture D, wherein the presence of a pair of disproportionation products, comprising compound F having a molecular weight equal to that of the carbene, nitrene or free radical reaction product D plus 2 and compound F' having a molecular weight equal to that of the carbene, nitrene or free radical reaction product D minus 2, indicates that organic compound A has two conjugated double bonds.

In some embodiments, at least one of the two conjugated double bonds is in a ring system, e.g., a cycloalkyl ring.

The product or product mixture may be analysed by methods well known in the art, for example liquid chromatography in combination with mass spectrometry.
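As a minimal sketch of how the pair of disproportionation products might be looked for in mass spectrometry data, the following checks whether masses corresponding to product D plus 2 and product D minus 2 are both present. The function name, the default mass shift `delta`, and the tolerance `tol` are illustrative assumptions, not values specified herein.

```python
from typing import Iterable, Optional, Tuple


def find_disproportionation_pair(observed_masses: Iterable[float],
                                 mass_d: float,
                                 delta: float = 2.0,
                                 tol: float = 0.05) -> Optional[Tuple[float, float]]:
    """Return (mass of F, mass of F') if masses near D + delta and D - delta
    are both observed within `tol`; otherwise return None."""
    masses = list(observed_masses)
    plus = [m for m in masses if abs(m - (mass_d + delta)) <= tol]
    minus = [m for m in masses if abs(m - (mass_d - delta)) <= tol]
    if plus and minus:
        return plus[0], minus[0]
    return None


# Made-up masses, for illustration only
pair = find_disproportionation_pair([498.0, 500.0, 502.0], mass_d=500.0)
```

Under the criterion stated above, a non-`None` result would suggest that organic compound A has two conjugated double bonds; the absence of either member of the pair returns `None`.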

Method of treatment

In some embodiments, provided herein is a compound for use in treating a disease or disorder in a subject in need thereof, wherein the disease or disorder is modulated or affected by the activity of a predetermined biological target. In some embodiments, the compounds are identified by the screening methods described herein and optionally further modified, e.g., to enhance binding efficacy to a predetermined biological target. In certain embodiments, the compound is derived from a labeled organic compound described herein. The predetermined biological target can be any biological target including, but not limited to, proteins, enzymes, cells, organelles (nucleus, mitochondria, golgi apparatus, peroxisomes, lysosomes, exosomes, etc.), cytoskeleton (microtubules, intermediate filaments, microfilaments), DNA, RNA, sugars, phospholipid molecules, phospholipid protein complexes or complexes of the foregoing, and the like.

In certain embodiments, the predetermined biological target is poly(ADP-ribose) polymerase (PARP). Poly(ADP-ribose) polymerase (PARP) is a family of proteins that are involved in many cellular processes, such as DNA repair, genomic stability, and programmed cell death. The PARP family includes 17 members, such as PARP1, PARP2, VPARP (PARP4), Tankyrase-1 and -2 (PARP-5a or TNKS, PARP-5b or TNKS2), PARP3, PARP6, TIPARP (or "PARP7"), PARP8, PARP9, PARP10, PARP11, PARP12, PARP14, PARP15 and PARP16.

In certain embodiments, the predetermined biological target is PARP1. PARP1 is a protein important for repairing single-strand breaks. Drugs that inhibit PARP1 result in the formation of multiple double-strand breaks that cannot be effectively repaired in tumors with mutations in BRCA1, BRCA2 or PALB2, resulting in tumor cell death. Some cancer cells lacking the tumor suppressor PTEN may be sensitive to PARP inhibitors due to down-regulation of Rad51. PARP inhibitors may therefore be effective against PTEN-deficient tumors.

In addition to use in cancer therapy, PARP inhibitors are considered as potential therapies for stroke, myocardial infarction and neurodegenerative diseases.

Accordingly, in some embodiments, there is provided a method of treating cancer, stroke, myocardial infarction, neurodegenerative disease or inflammation in a subject in need thereof comprising administering to the subject a therapeutically effective amount of a compound identified as capable of binding or inhibiting PARP1 by the screening methods described herein.

In some embodiments, there is provided a method of treating cancer, stroke, myocardial infarction, neurodegenerative disease or inflammation in a subject in need thereof, comprising administering to the subject a therapeutically effective amount of luteolin, naringin, hyperin, liquiritin, epicatechin, epigallocatechin, daphnetin, F001, F002, F003, or F006, or a derivative thereof. In some embodiments, there is provided a method of treating cancer, stroke, myocardial infarction, neurodegenerative disease, or inflammation in a subject in need thereof, comprising administering to the subject a therapeutically effective amount of luteolin, or a derivative thereof.

In some embodiments, the subject in need of treatment has cancer. In some embodiments, the subject has cancer cells that are mutated in BRCA1, BRCA2, PALB2, or PTEN. In certain embodiments, the mutation is an inactivating mutation.

In some embodiments, the subject has cancer cells that overexpress PARP protein as compared to corresponding normal cells. In some embodiments, the PARP protein is PARP1.

In some embodiments, the subject in need of treatment has a neurodegenerative disease. In some embodiments, the neurodegenerative disease is selected from the group consisting of Parkinson's disease, Alzheimer's disease, Huntington's disease, atrophic myelitis, AIDS dementia, vascular dementia, and combinations thereof.

In some embodiments, the subject in need of treatment has a stroke.

In some embodiments, the subject in need of treatment suffers from inflammation, such as, but not limited to, inflammation associated with a disease or condition selected from: Parkinson's disease, arthritis, rheumatoid arthritis, multiple sclerosis, psoriasis, psoriatic arthritis, Crohn's disease, inflammatory bowel disease, ulcerative colitis, lupus, systemic lupus erythematosus, juvenile rheumatoid arthritis, juvenile idiopathic arthritis, Graves' disease, Hashimoto's thyroiditis, Addison's disease, celiac disease, dermatomyositis, myasthenia gravis, pernicious anemia, Sjogren's syndrome, type I diabetes, vasculitis, uveitis, atherosclerosis and ankylosing spondylitis.

"treatment" is a method of achieving beneficial or desired results, including clinical results. Beneficial or desired clinical results may include one or more of the following: a) inhibiting the disease or disorder (e.g., alleviating one or more symptoms caused by the disease or disorder, and/or alleviating the extent of the disease or disorder); b) slowing or arresting the development of one or more clinical symptoms associated with the disease or disorder (e.g., stabilizing the disease or disorder, preventing or delaying the worsening or progression of the disease or disorder, and/or preventing or delaying the spread (e.g., metastasis) of the disease or disorder); and/or c) ameliorating the disease, i.e., causing regression of clinical symptoms (e.g., improving the disease state, providing partial or complete remission of the disease or disorder, potentiating the effect of another drug, delaying progression of the disease, improving quality of life, and/or prolonging survival.

"prevention" refers to any treatment of a disease or disorder that avoids the development of clinical symptoms of the disease or disorder. In some embodiments, the compound may be administered to a subject (including a human) at risk or having a family history of a disease or condition.

"subject" refers to an animal, such as a mammal (including a human), that has been or will become the subject of treatment, observation or experiment. The methods described herein can be used for human therapy and/or veterinary applications. In some embodiments, the subject is a mammal. In one embodiment, the subject is a human.

The term "therapeutically effective amount" or "effective amount" of a compound described herein, or a pharmaceutically acceptable salt, tautomer, stereoisomer, mixture of stereoisomers, prodrug or deuterated analog thereof, refers to an amount sufficient to effect treatment when administered to a subject to provide a therapeutic benefit, such as an improvement in symptoms or slowing of disease progression. For example, a therapeutically effective amount may be an amount sufficient to reduce the symptoms of a disease or disorder of a predetermined biological target, such as, but not limited to, PARP protein. The therapeutically effective amount may vary depending on the subject, the disease or condition being treated, the weight and age of the subject, the severity of the disease or condition, and the mode of administration, which can be readily determined by one skilled in the art.

The methods described herein can be applied to a population of cells in vivo or in vitro. By "in vivo" is meant in a living individual, such as an animal or human. In such cases, the methods described herein can be used for treatment of an individual. By "in vitro" is meant outside a living subject. Examples of in vitro cell populations include in vitro cell cultures and biological samples, including fluid or tissue samples obtained from individuals. Such samples may be obtained by methods well known in the art. Exemplary biological fluid samples include blood, cerebrospinal fluid, urine, and saliva. In such cases, the compounds and compositions described herein can be used for a variety of purposes, including therapeutic and experimental purposes. For example, the compounds and compositions described herein may be used in vitro to determine the optimal dosing regimen and/or dosage of a compound of the invention for a given indication, cell type, individual, and other parameters. The information gathered from such use can be used for experimental purposes or to set up an in vivo treatment regimen in the clinic. Other in vitro uses to which the compounds and compositions described herein may be suitable are described below or will become apparent to those skilled in the art. The selected compounds may be further characterized to check the safety or tolerability of the dose in human or non-human subjects. These properties can be checked using methods well known to those skilled in the art.

The methods of the present disclosure specifically provide improvements to any of the aforementioned therapeutic criteria.

Examples

Specific embodiments of the present disclosure are illustrated by the following examples. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques which function well in the practice of the present disclosure, and thus can be considered to constitute particular modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the present disclosure.

Example 1

Compound 1 (200 mg) was dissolved in 10 mL of dichloromethane and then mixed with compound 2 (200 mg), EDCI (416 mg), and DMAP (10 mg) with stirring. After stirring at room temperature for 3 hours, the mixture was extracted with dichloromethane, washed with brine, concentrated and purified by flash chromatography to give linker I (210 mg, 70.4%). 1H NMR (500 MHz, Chloroform-d) δ 7.82 (d, J = 8.5 Hz, 2H), 7.25 (d, J = 8.2 Hz, 2H), 6.63 (s, 1H), 3.79–3.62 (m, 6H), 3.45–3.36 (m, 2H); HRMS-ESI (m/z) [M+H]+ C13H9F3N6O2: calculated, 343.1130; found, 343.1133.

Example 2

Compound 3 (200 mg) was dissolved in 10 mL of dichloromethane and then mixed with compound 2 (513 mg), EDCI (405 mg), and DMAP (10 mg) with stirring. After stirring at room temperature for 3 hours, the mixture was extracted with dichloromethane, washed with brine, concentrated and purified by flash chromatography to give linker II (283 mg, 75.6%). 1H NMR (500 MHz, Chloroform-d) δ 5.93 (s, 1H), 3.72–3.66 (m, 2H), 3.57 (t, J = 5.0 Hz, 2H), 3.48 (t, J = 5.1 Hz, 2H), 3.41–3.35 (m, 2H), 2.06–1.98 (m, 2H), 1.82–1.67 (m, 2H), 1.03 (s, 3H); HRMS-ESI (m/z) [M+H]+ C9H17N6O2: calculated, 241.1413; found, 241.1417.

Example 3

Compound 4 (50 mg) was dissolved in 5 mL of dichloromethane and then mixed with compound 5 (25 μL), HATU (150 mg) and DIPEA (138 μL). After stirring at room temperature for 3 hours, the mixture was extracted with dichloromethane, washed with brine, concentrated and purified by flash chromatography to give linker III (49 mg, 79.1%). 1H NMR (500 MHz, Chloroform-d) δ 7.81 (d, J = 8.5 Hz, 2H), 7.26 (d, J = 8.5 Hz, 2H), 6.23 (s, 1H), 4.26 (dd, J = 5.2, 2.6 Hz, 2H), 2.30 (t, J = 2.6 Hz, 1H); HRMS-ESI (m/z) [M+H]+ C12H9F3N3O: calculated, 268.0698; found, 268.0699.

Example 4

3-(2-Azidoethyl)-3-methyl-3H-diazirine was prepared according to the method of Beam et al. (Angew Chem Int Ed Engl (2017) 56(10):2744-2748), substituting acetonitrile for the solvent N,N-dimethylformamide. 1H NMR (500 MHz, Chloroform-d) δ 1.05 (s, 3H), 1.60 (t, 2H, J = 5.4 Hz), 3.18 (t, 2H, J = 5.4 Hz).

Example 5

Oridonin (2 mg, 0.0054 mmol) was dissolved in 1 mL of dichloromethane and mixed with bifunctional linker I (3.3 mg, 0.011 mmol). The mixture was irradiated with ultraviolet light (365 nm) for 3 hours at room temperature to produce an oridonin-linker I conjugate comprising at least five isomeric products, MS (ESI) m/z [M+H]+ 679.27 (FIG. 1).

Example 6

In a 500 μL tube, starting DNA 7 (50 μL, 1 mM in borate buffer, pH 9.4) was mixed with compound 6 (5 μL, 200 mM in DMSO) by vortexing and then rotated on a rotator for 10 hours. 5 M sodium chloride (5.5 μL) was added to the mixture, followed by EtOH (160 μL). The contents were vortexed briefly and incubated at -20 °C for 20 minutes. The suspension was then centrifuged at 10000 × g for 5 minutes, the supernatant was discarded, and traces of EtOH were removed under vacuum. The precipitate was dissolved in H2O (50 μL) to give a 1 mM solution of compound 8 (FIG. 2).

Example 7

Compound 8 (20 μL, 1 mM in H2O) was mixed with linker I (4 μL, 100 mM), copper(II) sulfate pentahydrate (4 μL, 100 mM), sodium ascorbate (4 μL, 200 mM) and THPTA (4 μL, 100 mM). The mixture was mixed by vortexing and then rotated on a rotator for 5 hours. Then 5 M sodium chloride (3.6 μL) was added to the mixture, followed by EtOH (100 μL). The contents were briefly vortexed and incubated at -20 °C for 20 minutes. The suspension was then centrifuged at 10000 × g for 5 minutes, the supernatant was discarded, and traces of EtOH were removed under vacuum to yield labeled compound 9 (FIG. 3).
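The ethanol precipitation volumes in this example follow the rule stated in the general procedure of Example 17: about 10% by volume of 5 M sodium chloride, then about 2.5 volumes of cold ethanol. The arithmetic can be sketched in Python as follows. The function name is hypothetical, and scaling the ethanol to the volume after salt addition is an assumption made for illustration rather than a stated part of the procedure.

```python
def precipitation_volumes(reaction_ul, salt_fraction=0.10, ethanol_volumes=2.5):
    """Return (5 M NaCl volume, ethanol volume) in microliters.

    Assumption: ethanol is scaled to the volume after the salt is added.
    """
    salt_ul = reaction_ul * salt_fraction
    ethanol_ul = (reaction_ul + salt_ul) * ethanol_volumes
    return round(salt_ul, 2), round(ethanol_ul, 2)


# Example 7 reaction mixture: 20 + 4 + 4 + 4 + 4 = 36 uL
salt_ul, ethanol_ul = precipitation_volumes(36.0)
print(salt_ul, ethanol_ul)
```

Under these assumptions the salt volume equals the 3.6 μL used in the example, and the ethanol volume comes out close to the 100 μL used.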

Example 8

Compound 8 (20 μL, 1 mM in H2O) was mixed with oridonin-linker I conjugate (4 μL, 100 mM), copper(II) sulfate pentahydrate (4 μL, 100 mM), sodium ascorbate (4 μL, 200 mM) and THPTA (4 μL, 100 mM). The mixture was mixed by vortexing and rotated on a rotator for 5 hours. Then 5 M sodium chloride (3.6 μL) was added to the mixture, followed by EtOH (100 μL). The mixture was briefly vortexed and incubated at -20 °C for 20 minutes. The suspension was then centrifuged at 10000 × g for 5 minutes, the supernatant was discarded, and traces of EtOH were removed under vacuum to yield labeled compound 10 (FIG. 4).

Example 9

Oridonin (2 mg, 0.0054 mmol) was dissolved in 1 mL of acetonitrile and then mixed with bifunctional linker II (2.6 mg, 0.011 mmol). The mixture was irradiated with ultraviolet light (365 nm) for 3 hours at room temperature to yield an oridonin-linker II conjugate comprising at least five isomeric compounds, MS (ESI) m/z [M+H]+ 577.3 (FIGS. 5 and 6).
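As a plausibility check on the reported conjugate mass, the expected [M+H]+ value can be estimated with a short Python sketch. It assumes the usual diazirine photochemistry, in which the linker loses N2 and the resulting carbene adds to the substrate, so that the conjugate mass is substrate plus linker minus N2. The oridonin formula C20H28O6 is not given in this document and is an assumption; the neutral linker II formula is taken from the [M+H]+ formula reported in Example 2. Function names are hypothetical.

```python
# Monoisotopic atomic masses (u) and the proton mass used for [M+H]+
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def conjugate_mh(substrate, linker, n_linkers=1):
    """Estimated [M+H]+ of a substrate carrying n linkers, each having lost N2."""
    n2 = mono_mass({"N": 2})
    neutral = mono_mass(substrate) + n_linkers * (mono_mass(linker) - n2)
    return neutral + PROTON


ORIDONIN = {"C": 20, "H": 28, "O": 6}           # assumed formula, not from this document
LINKER_II = {"C": 9, "H": 16, "N": 6, "O": 2}   # neutral form of [M+H]+ C9H17N6O2

single = conjugate_mh(ORIDONIN, LINKER_II)
double = conjugate_mh(ORIDONIN, LINKER_II, n_linkers=2)
print(round(single, 2), round(double, 2))
```

Under these assumptions the single-linker estimate agrees with the observed m/z of 577.3 to within a few hundredths of a mass unit. The same function can be used to estimate doubly labeled species.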

Example 10

Tripterine (2 mg, 0.0044 mmol) was dissolved in 1 mL of acetonitrile and then mixed with bifunctional linker II (2.1 mg, 0.009 mmol). The mixture was irradiated with ultraviolet light (365 nm) at room temperature for 3 hours to produce a tripterine-linker II conjugate comprising at least three isomeric compounds, MS (ESI) m/z [M+H]+ 663.38 (FIGS. 7 and 8).

Example 11

Paclitaxel (2 mg, 0.0047 mmol) was dissolved in 1 mL of acetonitrile and then mixed with bifunctional linker II (2.3 mg, 0.0094 mmol). The mixture was irradiated with ultraviolet light (365 nm) for 3 hours at room temperature to yield a paclitaxel-linker II conjugate comprising at least three isomeric compounds, MS (ESI) m/z [M+H]+ 1066.46 (FIGS. 9 and 10).

Example 12

Triptophenolide (2 mg, 0.0064 mmol) was dissolved in 1 mL of acetonitrile and then mixed with bifunctional linker II (3.1 mg, 0.0128 mmol). The mixture was irradiated with ultraviolet light (365 nm) for 3 hours at room temperature to generate a triptophenolide-linker II conjugate (FIGS. 11 and 12).

Example 13

Maytansinol (4 mg, 0.0071 mmol) was dissolved in 1 mL of acetonitrile and then mixed with bifunctional linker II (3.4 mg, 0.0142 mmol). The mixture was irradiated with ultraviolet light (365 nm) for 3 hours at room temperature to yield a maytansinol-linker II conjugate comprising at least two isomeric compounds, MS (ESI) m/z [M+H]+ 777.3 (FIGS. 13 and 14).

Example 14

Abietic acid (2 mg, 0.0066 mmol) was dissolved in 1 mL of acetonitrile and then mixed with bifunctional linker II (3.1 mg, 0.013 mmol). The mixture was irradiated with UV light (365 nm) for 3 hours at room temperature. Two types of conjugates were generated: dehydroabietic acid-linker II conjugate (A), MS (ESI) m/z [M+H]+ 513.37, and hydroabietic acid-linker II conjugate (B), MS (ESI) m/z [M+H]+ 517.40 (FIGS. 15 and 16).

Example 15

Reaction with oligonucleotides: compounds 6 and 7 were ligated to oligonucleotides using ligase. The head-DNA conjugate 4 was mixed with the photoactive linker 2, and the mixture was irradiated with UV light (365 nm) for 2 hours. The products were then ligated with oligonucleotides. The bands were smeared, indicating that multiple ligation products were generated (FIG. 17).

Example 16

Oligonucleotides that can be used as labels include single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), chemically modified oligonucleotides and some functional species such as antisense RNA (asRNA).

The sequences of several examples of oligonucleotides used for labeling are listed below:

ssDNA 5'-AAATAAATT, 5' amino modification

ssRNA 5'-AUUUAUUU, 5' amino modification

ssDNA 5'-AAATAAATT, 3' amino modification

ssRNA 5'-AUUUAUUU, 3' amino modification

ssDNA 5'-AAATAAATT, 5' amino modification, with phosphorothioate linkages (resistant to nuclease degradation)

ssDNA 5'-GCGTTTGCTCTTCTTCTTGCG, 5' amino modification, with phosphorothioate linkages (resistant to nuclease degradation)

Example 17

DNA-encoded chemical libraries (DELs) link the strengths of genetic and chemical synthesis through combinatorial optimization. Through combinatorial chemistry, DELs can be developed on an unprecedented billion to trillion scale, providing a rich chemical diversity for biological and pharmaceutical research. While in most cases, at the molecular level, diversity is limited to the available building blocks for DNA compatible chemical reactions, modern chemical methods are being used to increase diversity. To take full advantage of the DEL approach, linking genetic strength directly to chemical structure would provide greater diversity in the limited chemical world. As organisms evolve, natural products have evolved structural diversity that is incredible.

Provided below is an exemplary DNA-encoding chemical library (DEL) using natural products, FDA-approved drugs, compounds in clinical trials, and compounds derived from combinatorial synthesis, prepared according to the methods described herein. In the following examples, a volatile bifunctional linker (linker IV) allows a "one-pot" reaction on an automated parallel synthesizer.

In some embodiments, the methods described herein exhibit the following criteria: (1) site non-selectivity, (2) chemical non-selectivity, (3) biocompatibility (e.g., DNA compatibility), and (4) compatibility with small reaction scales (e.g., micrograms). Traditional chemoselective reactions are modifications of the natural product at a specific atom, which may miss potential binding pockets that modulate the function of the target protein due to steric shielding. A late-stage modification approach was devised that utilizes chemical and site-non-selective reactions to target all accessible atoms. This late modification creates a set of isoforms with unique DNA tags that provide multiple spatial accessibility to the target protein.

General procedure

Unless otherwise indicated, all commercially available organic compounds and the DNA headpiece (HP-NH2, 5'-/5Phos/GAGTCA/iSp9/iUniAm/iSp9/TGACTCCC-3') were obtained from commercial sources. Unless otherwise indicated, all commercial reagents and solvents were used without additional purification. NMR spectra were recorded on a Bruker AM-500 NMR spectrometer. Chemical shifts are reported as δ (ppm) and coupling constants are reported as J (hertz). Tetramethylsilane (TMS) was used as the internal reference for 1H NMR, and CDCl3 was used as the internal reference for 13C NMR (δ 77.0 ppm). Mass spectra were recorded on an AB SCIEX 4600 mass spectrometer or a Waters SQD 2 mass spectrometer. Linker IV was prepared according to Example 4.

Linker screening

The various bifunctional linkers described herein were tested to develop a high throughput DNA annotation strategy for natural products. The reactivity of the functional groups on the linker is orthogonal, with one end designed for chemical and site-non-selective reactions and the other for conjugation of DNA tags by copper (I) -catalyzed azide-alkyne cycloaddition (CuAAC), as described herein.

Exemplary bifunctional linkers were tested using oridonin as a model substrate (Table 1A). Oridonin (1 equivalent) and the bifunctional linker (2 equivalents) were dissolved in acetonitrile and irradiated with UV light at room temperature.

As shown in Table 1A, linker III, containing the 3-(trifluoromethyl)-3H-diazirin-3-yl group, produced three isomers in 20% yield (based on the concentration of oridonin). Linker II, containing the 3-methyl-3H-diazirin-3-yl group, produced 6 labeled isomers of oridonin with a labeling efficiency of 69%, in two of which oridonin was labeled with two linker II molecules. The linker 3-(2-azidoethyl)-3-methyl-3H-diazirine (linker IV), containing the 3-methyl-3H-diazirin-3-yl group, gave rise to one isomer. As shown by 1H NMR, after the reaction between linker IV and oridonin is complete, unreacted linker IV and/or by-products of linker IV are removed under vacuum.

TABLE 1A labeling efficiency of different carbene or radical generating systems

a To a solution of 0.5 mL (0.1 mM) oridonin in dry acetonitrile was added 1 mL (0.1 mM) of the corresponding bifunctional linker. The resulting mixture was irradiated with a 365 nm lamp for 30 minutes and then analyzed by LC-MS.

b Oridonin labeled with one linker.

c Oridonin labeled with two linkers.

d Unreacted.

As shown in Table 1B, when linker IV was reacted with 2,4-dihydroxyacetophenone at up to 5 equivalents, no reaction with the azide functionality was observed. As shown in FIG. 18, 1H NMR showed that after the reaction was complete, unreacted linker IV and/or by-products of linker IV were easily removed under vacuum (the linker IV-2,4-dihydroxyacetophenone conjugate is indicated by an arrow). The results show that trace amounts of residual linker IV do not interfere with subsequent DNA conjugation based on CuAAC click chemistry or with enzymatic DNA ligation.

TABLE 1B labeling efficiency of linker IV

a To a solution of 0.5 mL (0.1 mM) of 2,4-dihydroxyacetophenone in anhydrous acetonitrile was added the corresponding amount of bifunctional linker IV. The resulting mixture was irradiated with a 365 nm UV lamp for 30 minutes and then analyzed by LC-MS and 1H NMR.

b 2,4-Dihydroxyacetophenone labeled with one linker IV.

c 2,4-Dihydroxyacetophenone labeled with two linkers IV.

Linker conjugates shown in table 2 were prepared using linker IV.

TABLE 2

a Yield determined by LC.

b Labeled isomers as determined by LC and MS-MS.

Scheme 2 shows the DMTMM-promoted amide coupling used to prepare propyne-HP-DNA from propyne-(PEG)5-CH2CH2COOH, for subsequent reaction with drugs and natural products.

Scheme 2: Propyne-HP-DNA

In Scheme 2, HP-DNA (1 mM) was added to sodium borate buffer (250 mM) at pH 9.5, and 40 equivalents of propyne-(PEG)5-CH2CH2COOH (200 mM in DMF) were added, followed by 50 equivalents of DMTMM (200 mM in water). After stirring at room temperature for 18 hours, 5 M sodium chloride solution (10% by volume) and cold ethanol (2.5 volumes, ethanol stored at -20 °C) were added to the reaction mixture. The mixture was stored at -80 °C for more than 30 minutes and then centrifuged in a microcentrifuge at 12000 rpm for 15 minutes at 4 °C. The supernatant was removed, and the pellet was dissolved in water to a final concentration of 1 mM and used directly in the subsequent click-coupling reaction without further purification. Calculated exact mass: 5267.07; found: 5266.46.

The general procedure for labeling various drugs using bifunctional linker IV is shown below in Scheme 3. In Scheme 3, NP is a drug or natural product and propyne-HP-DNA is as described above. The compounds in Tables 3 and 4 were prepared using these methods.

Scheme 3: synthesis of linker-IV conjugates and labeling compounds

Briefly, CH3CN (100 μL) was added to each well of a 96-well plate containing compound (1 μmol) and linker IV (5 μmol). The plate was irradiated with UV light (365 nm) for 30 minutes at room temperature. The product and yield were determined by LC-MS. The CH3CN was evaporated in vacuo overnight to give the corresponding NP-N3. The compound (NP-N3) was then dissolved in DMSO (30 μL) and mixed with propyne-HP-DNA (10 μL, 1 mM in water), THPTA (10 μL, 80 mM in DMSO), CuSO4·5H2O (10 μL, 80 mM in water) and sodium ascorbate (20 μL, 80 mM in water). The resulting mixture was shaken at room temperature overnight, and after the reaction was complete, the product and yield were determined by LC-MS. The scavenger sodium diethyldithiocarbamate (12 μL, 160 mM in water) was then added. All HP-DNA conjugate compounds (NP-HP-DNA) were then collected, and 5 M sodium chloride solution (10% by volume) and cold ethanol (2.5 volumes, ethanol stored at -20 °C) were added. The mixture was stored in a freezer at -80 °C for more than 30 minutes and then centrifuged in a microcentrifuge at 12000 rpm for 15 minutes at 4 °C. The supernatant was removed and the pellet was dissolved in water.
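The stoichiometry of the click step follows directly from the stated volumes and concentrations, since microliters multiplied by millimolar gives nanomoles. The Python sketch below works through that arithmetic for the reagents listed above. The dictionary layout and names are illustrative only, and the total volume ignores any volume change on mixing.

```python
def nmol(volume_ul, conc_mm):
    """Amount in nmol: microliters multiplied by millimolar gives nanomoles."""
    return volume_ul * conc_mm


# (volume in uL, concentration in mM), as stated in the procedure above
components = {
    "propyne-HP-DNA": (10, 1),
    "THPTA": (10, 80),
    "CuSO4.5H2O": (10, 80),
    "sodium ascorbate": (20, 80),
}

amounts = {name: nmol(v, c) for name, (v, c) in components.items()}
dna_nmol = amounts["propyne-HP-DNA"]

# Equivalents of each reagent relative to the DNA conjugation partner
equivalents = {name: amount / dna_nmol for name, amount in amounts.items()}

# 30 uL of DMSO carrying NP-N3, plus the four aqueous or DMSO stocks
total_volume_ul = 30 + sum(v for v, _ in components.values())

for name, eq in equivalents.items():
    print(f"{name}: {amounts[name]} nmol, {eq:g} equiv")
print(f"total volume: {total_volume_ul} uL")
```

On these numbers the ligand and copper are each in 80-fold excess over the DNA and the ascorbate in 160-fold excess, in a total of 80 μL before the scavenger is added.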

DNA encoding was performed using a "one-pot" stepwise synthesis in 96-well plates as described herein. The drug-linker IV conjugates and the labeled compounds shown in Tables 3 and 4 were prepared according to the procedures described above. A total of 110 DNA-encoded end products were obtained (Table 4). For compounds with multiple functional groups (including one or more of hydroxyl, carboxyl, amine, etc.), such as compounds numbered 17, 25, 74, 82, 108, 91, 99, and 114, DNA conjugation readily occurred at multiple sites, as shown by HPLC fractions with the same molecular weight at different retention times, whereas for compounds with a single functional group, DNA conjugation was typically observed at a single site.

TABLE 3

TABLE 4

Example 18: combined synthetic structure

To exploit the fact that DNA-encoded chemical libraries (DELs) can be screened in a single tube, selected DELs synthesized by late-stage modification reactions were integrated into the DEL library format for screening. To this end, late-stage labeled DELs (including traditional Chinese medicine natural products (TCMs), FDA-approved drugs, and control compounds in clinical trials) and a small combinatorial DEL library of size 10^4 were prepared and then combined in a ratio of 1:10 (single-compound concentrations). The effect of different mixing ratios on the enrichment of late-stage labeled DELs was tested using two known carbonic anhydrase II (CAII) inhibitors, catheteride and brinzolamide. The two CAII inhibitors were first DNA-encoded and added to the 10^4 combinatorial DEL (0.5 pM per molecule) at final concentrations of 0.05 pM, 0.5 pM and 5 pM. The mixture containing the late-stage labeled DEL at 0.05 pM (a ratio of 1:10) showed the highest enrichment, 300-fold for brinzolamide and 410-fold for casinoside (Table 5).

Table 5: multiple enrichment for selection of carbonic anhydrase conjugates

The amount of spiked natural product-DNA conjugate was quantified by quantitative polymerase chain reaction (qPCR) and then mixed with the DEL library in the indicated ratios. The combinatorial DEL library contained 12696 compounds, built from 6 amine-(PEG)n-acids (building block 1), 46 amino acids (building block 2) and 46 carboxylic acids (building block 3).
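The library size follows from the three building-block sets: 6 × 46 × 46 = 12696. The Python sketch below enumerates every combination and assembles a reference table of barcodes, assuming one 10-base code per building block as stated in the decoding description later in this example. The codes generated here are placeholders invented for illustration and are not the sequences actually used. The exact-match lookup mirrors the stated rule that no mismatches are allowed.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def code_for(index, length=10):
    """Hypothetical placeholder code: the index written in base 4 with A, C, G, T."""
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))


def build_reference(sizes=(6, 46, 46)):
    """Map each 30-base barcode to its (block 1, block 2, block 3) index triple."""
    reference = {}
    for combo in product(*(range(n) for n in sizes)):
        barcode = "".join(code_for(i) for i in combo)
        reference[barcode] = combo
    return reference


def count_reads(reads, reference):
    """Count reads per compound, keeping exact barcode matches only."""
    counts = Counter()
    for read in reads:
        combo = reference.get(read)
        if combo is not None:
            counts[combo] += 1
    return counts


reference = build_reference()
print(len(reference))
```

Reads that do not match a reference barcode exactly are simply dropped, which is the simplest way to implement a zero-mismatch mapping.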

The DNA coding framework is designed based on the structure of the headpiece and other barcodes in the literature. The headpiece is the DNA headpiece (5'-/5Phos/GATCCA/iSp9/iUniAm/iSp9/TGACCCC-3'). DNA oligonucleotide barcodes were enzymatically ligated in ligation buffer with T4 DNA ligase (NEB, Cat. # Z1811S). The reaction mixture was incubated at 16 °C for 16 hours and analyzed by LC-MS and gel electrophoresis. The sequencing primers were 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG and 5'-CAAGCAGAAGACGGCATACGAGATGTCGTGATGTGACTGGAGTTC, and the overall scheme is shown in FIG. 19.

His-tag fusion recombinant human PARP-1 (Sino Biological, Cat. #11040-H08B) and human HSP70 (Sino Biological, Cat. #11660-H07H) were obtained from commercial sources. The panning process for these two soluble proteins was identical. μg of target protein was mixed with nickel magnetic beads (GenScript, Cat. # L00295), 5 nM DEL pool and 10 μg/mL salmon sperm DNA. The final volume was adjusted to 100 μL. The mixture was rotated at room temperature for 1.5 hours. After 5 washes with phosphate-buffered saline containing 0.05% Tween-20 (PBST), the target protein-bound chemical-DNA conjugates were eluted by heating at 95 °C for 10 minutes in 50 μL of elution buffer (20 mM Tris, pH 7.4, 100 mM NaCl). The eluted DEL compounds were amplified by PCR using Takara PrimeSTAR Max DNA polymerase (Takara, Cat. # R045A). Excess primers were then removed using Hieff NGS Smarter DNA Clean Beads (Yeasen, Cat. #12600ES03), and the products were evaluated on a 4% DNA agarose gel. The amplification products were sequenced at high throughput using the Illumina HiSeq X10 analyzer. Table 6 summarizes the affinity selection and PCR amplification of oligonucleotide tags for different targets.

Table 6.

1.Sino Biological,Cat.#11660-H07H

2.Sino Biological,Cat.#11040-H08B

3.GenScript,Cat.#L00295

4.Takara,Cat.#R045A

The combinatorial chemical library is constructed from three building-block sub-libraries, where the first, second and third sub-libraries contain 6, 46 and 46 chemical building blocks, respectively. Each building block is encoded by a 10-base-pair (bp) DNA sequence. Each natural product is encoded by a 30-bp DNA sequence, the same length as the DNA code of the combinatorial library. All possible combinations (combinatorial compounds) between these three sub-libraries were generated by an in-house Java program, giving a reference DNA coding library containing 12696 DNA coding sequences. Following sequencing, the Illumina adaptors flanking the DNA coding sequence were trimmed using CLC Genomics Workbench version 12 (Qiagen). The remaining DNA coding sequence, 30 bp in length, corresponds to the three building-block DNA sequences from the 3 rounds of "split-and-pool" synthesis. For each sample, the DNA coding sequences are mapped to the reference library of DNA-encoded compounds. No mismatches are allowed in the mapping. The reads for each compound in each sample are counted, and the total read count for all compounds in a given sample is calculated. The read count for each individual compound is divided by the total read count and multiplied by the constant 100000.

$$\text{Normalized count}_{\text{Compound M, sample A}} = \frac{\text{Count}_{\text{Compound M, sample A}}}{\text{Total count}_{\text{sample A}}} \times 100000$$

Fold changes were calculated for each compound in the post-selection library compared to the reference library. For example, if sample a is compared to a reference library, the fold change for compound M is calculated as follows:

$$\text{Fold change}_{\text{Compound M}} = \frac{\text{Normalized count}_{\text{Compound M, sample A}}}{\text{Normalized count}_{\text{Compound M, reference}}}$$
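The normalization and fold-change calculations described above can be sketched in a few lines of Python. The function names and the toy read counts are invented for illustration. Compounds with no reads in the reference are skipped here to avoid division by zero, which is an assumption and not a stated rule.

```python
def normalized_counts(counts, scale=100000):
    """Divide each compound's reads by the sample total and multiply by the constant."""
    total = sum(counts.values())
    return {compound: n / total * scale for compound, n in counts.items()}


def fold_change(sample_counts, reference_counts, scale=100000):
    """Ratio of normalized counts in a selected sample to those in the reference."""
    sample_norm = normalized_counts(sample_counts, scale)
    reference_norm = normalized_counts(reference_counts, scale)
    return {
        compound: value / reference_norm[compound]
        for compound, value in sample_norm.items()
        if reference_norm.get(compound)  # assumption: skip compounds absent from the reference
    }


# Toy read counts (hypothetical)
sample_a = {"M": 30, "X": 70}
reference = {"M": 10, "X": 90}

changes = fold_change(sample_a, reference)
print(changes)
```

Because both samples are scaled by the same constant, the constant cancels in the fold change; it only makes the normalized counts easier to read.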

Hit criteria for nDEL screening take into account normalized fold enrichment values (y-axis) and deep sequencing read counts (x-axis). Compounds with read counts less than 10 are considered unreliable and are therefore removed from the DEL immediately prior to target screening. For each DEL pool, baseline fold enrichment was recorded in the absence of target protein, so that a normalized fold enrichment value can be calculated for each DEL compound in the pool. The cut-off for hit identification was based on a simplified statistical analysis of a highly diverse data population: the mean fold enrichment (μ) plus three standard deviations (3σ) of the entire pool. Any DEL compound with a fold enrichment greater than μ + 3σ was considered a hit.
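The hit rule can be expressed compactly in Python, as sketched below. The sketch applies the read-count filter before computing the pool statistics and uses the population standard deviation because the cut-off is described as referring to the entire pool. Both choices are assumptions, and the compound names and values are invented.

```python
from statistics import mean, pstdev


def call_hits(fold, reads, min_reads=10, n_sigma=3.0):
    """Return (hits, cutoff) using a mean + n_sigma * SD threshold.

    Assumptions: low-read compounds are dropped before the statistics are
    computed, and the population standard deviation is used.
    """
    reliable = {c: f for c, f in fold.items() if reads.get(c, 0) >= min_reads}
    values = list(reliable.values())
    cutoff = mean(values) + n_sigma * pstdev(values)
    hits = sorted(c for c, f in reliable.items() if f > cutoff)
    return hits, cutoff


# Toy pool: 20 background compounds, one binder, one low-read outlier
fold = {f"c{i}": 1.0 for i in range(20)}
fold["binder"] = 10.0
fold["noisy"] = 50.0
reads = {compound: 100 for compound in fold}
reads["noisy"] = 3  # below the 10-read threshold, so excluded

hits, cutoff = call_hits(fold, reads)
print(hits, round(cutoff, 2))
```

In this toy pool the low-read outlier never enters the statistics, so it cannot inflate the cut-off or be reported as a hit.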

PARP-1 enzyme assay

PARP-1 automated glycosylation based assays were performed according to the protocol provided (BPS, Cat. # 80580). The compounds were dissolved in DMSO. Triplicate reactions were set up by preincubating the protein with a range of compound concentrations, chosen for each compound (final concentration of DMSO in all samples: 1%), for 15 minutes at room temperature. The ADP-ribosylation reaction was then performed by two-fold dilution into a substrate-coated assay plate and incubation at room temperature for 1 hour. Chemiluminescence was detected using a microplate reader (EnVision, PerkinElmer). The data obtained were fitted to a single-site dose-response model using GraphPad, and experimental IC50 values were extracted. The reported errors represent the standard error of the fitting parameters for each experiment.
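For readers without access to GraphPad, the idea of a single-site dose-response fit can be illustrated with a minimal Python sketch. It is a stand-in and not the fitting routine used here: it assumes a Hill slope of 1, fixes the top and bottom of the curve at 100 and 0, and finds the IC50 by a simple grid search in log space. The concentrations and responses are synthetic.

```python
def one_site(conc, ic50, top=100.0, bottom=0.0):
    """Single-site inhibition model with a Hill slope of 1 (assumed)."""
    return bottom + (top - bottom) / (1.0 + conc / ic50)


def fit_ic50(concs, responses, log_min=-3.0, log_max=3.0, step=0.01):
    """Grid search over log10(IC50) minimizing the sum of squared errors."""
    best_ic50, best_sse = None, float("inf")
    n_steps = int(round((log_max - log_min) / step))
    for i in range(n_steps + 1):
        ic50 = 10 ** (log_min + i * step)
        sse = sum((one_site(c, ic50) - r) ** 2 for c, r in zip(concs, responses))
        if sse < best_sse:
            best_ic50, best_sse = ic50, sse
    return best_ic50


# Synthetic, noise-free data generated with an IC50 of 2.0 (arbitrary units)
concs = [0.01, 0.1, 1.0, 10.0, 100.0]
responses = [one_site(c, 2.0) for c in concs]

estimate = fit_ic50(concs, responses)
print(round(estimate, 2))
```

A real analysis would also fit the top, bottom and slope and report standard errors, as the fitting software does; the grid search only shows how the IC50 is tied to the shape of the curve.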

Molecular modeling

The possible binding mode of luteolin in the catalytic domain of PARP1 (residues 660 to 1011) was investigated by in silico molecular docking. AutoDock Tools (version 4.2.6) was used to prepare PARP1 (PDB 4pjt) and the ligand and to generate the pdbqt files. Water molecules and the inhibitor were removed from the PARP1 PDB file, and polar hydrogens and Gasteiger partial charges were added. A grid of 60 × 60 × 40 points along the x, y, and z axes was centered on the inhibitor binding site. A total of 200 runs were performed, each with a maximum of 2500000 energy evaluations. The luteolin binding mode with the lowest binding free energy was selected for further molecular dynamics simulations.

The parameters and topology of luteolin for the molecular dynamics simulation were derived with the ANTECHAMBER software and the ACPYPE script, using the semi-empirical quantum chemistry program (SQM) and the Generalized Amber Force Field (GAFF). A brief energy minimization of the luteolin-PARP1 complex was followed by a 100-ns molecular dynamics simulation to relax the interactions and stabilize the structure of the complex. Simulations were performed using the GROMACS 4.6.7 software package and the Amber ff14SB force field. The system was solvated in all-atom TIP3P water containing Cl- and K+ ions at a concentration of 0.13 M to mimic physiological ionic strength. The temperature T and pressure P were kept constant at 300 K and 1 atmosphere, respectively, using a Berendsen thermostat and barostat. Long-range electrostatic interactions were treated with the fast smooth particle-mesh Ewald summation, with a direct-space cut-off of 1.0 nm. A PARP-1 residue was considered to interact stably if its distance from luteolin along the molecular dynamics trajectory was below a 3.0 angstrom cut-off for more than 90% of the time. These residues are listed in Table 7.
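The stable-contact criterion amounts to an occupancy calculation over trajectory frames. The Python sketch below assumes that a per-frame minimum distance between each residue and the ligand has already been extracted from the trajectory. The residue labels and distance values are hypothetical, and the extraction step is outside the sketch.

```python
def stable_contacts(distances, cutoff=3.0, min_fraction=0.9):
    """Return {residue: occupancy} for residues within the cut-off in more than
    min_fraction of frames. Distances are per-frame values in angstroms."""
    stable = {}
    for residue, per_frame in distances.items():
        if not per_frame:
            continue  # no frames recorded for this residue
        occupancy = sum(d < cutoff for d in per_frame) / len(per_frame)
        if occupancy > min_fraction:
            stable[residue] = occupancy
    return stable


# Hypothetical per-frame minimum distances for three residues over 100 frames
distances = {
    "RES_A": [2.8] * 95 + [3.5] * 5,   # within the cut-off in 95% of frames
    "RES_B": [2.8] * 90 + [3.5] * 10,  # exactly 90%, not more than 90%
    "RES_C": [4.0] * 100,              # never within the cut-off
}

contacts = stable_contacts(distances)
print(contacts)
```

Only the first residue qualifies here, because the criterion requires the contact to be present for strictly more than 90% of the time.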

TABLE 7 PARP-1 residues that interact with luteolin in molecular dynamics simulation trajectories.
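The stable-contact criterion (distance below 3.0 angstroms in more than 90% of the trajectory) can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function name `stable_contacts`, the input layout (one list of per-frame minimum residue-to-ligand distances per residue), and the toy distances are all assumptions.

```python
def stable_contacts(min_distances, cutoff=3.0, min_occupancy=0.90):
    """Return residues whose ligand distance is below `cutoff` (angstroms)
    in more than `min_occupancy` of the frames.

    min_distances: dict mapping a residue label to a list of per-frame
    minimum distances (angstroms) between that residue and the ligand.
    """
    stable = {}
    for residue, distances in min_distances.items():
        if not distances:
            continue  # no frames recorded for this residue
        occupancy = sum(d < cutoff for d in distances) / len(distances)
        if occupancy > min_occupancy:
            stable[residue] = occupancy
    return stable

if __name__ == "__main__":
    # Toy distances (hypothetical values, for illustration only).
    toy = {
        "RES_A": [2.8] * 19 + [3.4],     # 95% of frames in contact
        "RES_B": [2.9] * 9 + [3.6],      # exactly 90%: not "more than 90%"
        "RES_C": [4.5] * 10,             # never in contact
    }
    print(stable_contacts(toy))
```

In practice the per-frame distances would be extracted from the trajectory with a trajectory-analysis tool; that step is outside the scope of this sketch.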

nDEL screening against heat shock protein 70 kDa (HSP70) and poly[ADP-ribose] polymerase 1 (PARP1)

Two target proteins with different cellular locations and biophysical properties were used in the nDEL screen: HSP70 in the cytoplasm and PARP1 in the nucleus, each of which has different affinities for small-molecule ligands. Known functional binders of these targets were included in the nDEL as internal positive controls, e.g., oridonin for HSP70 and derivatives of olaparib for PARP-1 (F001, F002, F003, and F006). To address the uneven distribution of DNA barcodes in the nDEL, a blank screen was first performed in the absence of target protein but in the presence of the immobilization matrix beads, followed by deep sequencing to establish a baseline distribution of barcodes in the library. All screening data were analyzed using the method described by Decurtins et al. (Nat Protoc. 2016; 11(4): 764-80). Each nDEL compound was quantified by counting its corresponding DNA sequences in the sequencing data. Fold enrichment was calculated as the ratio of normalized sequencing counts in the presence and absence of the target protein. The following tables show the fold enrichment.

Table 8A: Fold enrichment for HSP70 binder selection

Table 8B: Fold enrichment for PARP-1 binder selection

Compound (NP) Fold change
F001 4.38
F002 3.69
Oridonin-A1 3.28
F006 3.21
Plumbagin 2.97
Scopoletin 2.82
(-)-Epicatechin gallate 2.81
Licorice root xanthosine 2.66
Epigallocatechin 2.53
Theobromine 2.42
F003 2.34
Hyperoside A 2.25
Jatrorrhizine hydrochloride 2.23
Gentisic acid 2.21
Naringin 2.17
Luteolin 2.17
Daphnetin 2.17
Synephrine 2.17
Berbamine hydrochloride 2.15
Thiaurethane pyridazinol 2.06
Jatrorrhizine 2.04
Alizarin 2.02
Fraxinin 2.01
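The fold-enrichment calculation described above (ratio of normalized sequencing counts with and without target protein) can be sketched as follows. The normalization used here, dividing each barcode count by the total counts in its sample, is an assumption, as are the function names and the toy counts; the published analysis method may normalize differently.

```python
def normalize(counts):
    """Convert raw barcode counts into fractions of the total reads in a sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {barcode: n / total for barcode, n in counts.items()}

def fold_enrichment(target_counts, blank_counts):
    """Fold enrichment per barcode: normalized count with target protein
    divided by normalized count in the blank (beads-only) screen.

    Barcodes absent from the blank screen, or with zero baseline counts,
    are skipped because the ratio is undefined.
    """
    target = normalize(target_counts)
    blank = normalize(blank_counts)
    result = {}
    for barcode, baseline in blank.items():
        if baseline == 0:
            continue
        result[barcode] = target.get(barcode, 0.0) / baseline
    return result

if __name__ == "__main__":
    # Toy read counts (hypothetical barcodes and values).
    with_target = {"bc_A": 30, "bc_B": 10, "bc_C": 60}
    beads_only = {"bc_A": 10, "bc_B": 10, "bc_C": 80}
    ranked = sorted(fold_enrichment(with_target, beads_only).items(),
                    key=lambda item: item[1], reverse=True)
    for barcode, fold in ranked:
        print(f"{barcode}\t{fold:.2f}")
```

Dividing by the blank-screen baseline is what compensates for the uneven starting distribution of barcodes in the library.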

As shown in FIG. 20, screening fingerprints for the nDEL were plotted as fold enrichment versus normalized sequencing counts. Hits from the nDEL screen were identified by comparing their fold enrichment with that of the positive control compounds, using the known binders as internal references.

nDEL screens were performed against purified human HSP70 and PARP-1 proteins. Deep sequencing and decoding analysis were performed on the affinity-captured nDEL. The results are summarized in Table 9. All control compounds were enriched in the nDEL screens, with hit rates between 0.15% and 0.47% (FIG. 20). The high hit rate for HSP70 may be related to the stickiness of the HSP70 protein. The first two fractions were collected and encoded with two unique DNA sequences, N055 and N056. Notably, the two stereoisomers of the oridonin-labeled compounds, N055 and N056, were enriched 5.3-fold and 2.3-fold, respectively (FIG. 20(a) and Table 8A), indicating a structural preference of HSP70 for the different stereoisomers.

TABLE 9 DEL screening summary

Target Hits (number) Hit rate
HSP70 60 0.47%
PARP1 34 0.27%

In the PARP-1 screen, 34 nDEL compounds were enriched (FIG. 20(b)), 4 of which were confirmed to be positive control compounds (FIG. 21). The use of known compounds as internal controls in the nDEL appears to greatly aid the selection of true positive hits. Interestingly, flavonoids with similar structures, in particular the traditional Chinese medicine compound luteolin and its glycosylated analogues naringin and hyperin, clustered among the enriched compounds.

Biochemical characterization of nDEL hits for PARP-1 enzyme inhibition

PARP-1 is an effective target in cancer therapy. It catalyzes the rapid transfer of ADP-ribose moieties from NAD+ to acceptor proteins, including itself, resulting in the formation of protein-bound linear and branched homo-ADP-ribose polymers in response to cellular signals of DNA damage and repair. Enzyme activity was measured based on the auto-ribosylation of PARP-1 in the presence of sheared DNA.

The inhibitory activity of the enriched nDEL compounds was characterized by the PARP-1 auto-ribosylation assay. The positive control F003, a derivative of olaparib, showed potent inhibition of PARP-1, with an IC50 value of 2.5 nM (FIG. 22(b)). The nDEL hit luteolin inhibited PARP-1 enzyme activity with an IC50 value of 7.5 μM (FIG. 22(a)). To understand the interaction between PARP-1 and luteolin, a series of in silico analyses based on molecular modeling was performed. In molecular docking, luteolin occupied the catalytic domain of PARP-1 in the binding mode with the lowest free energy. To further evaluate the stability of this model and understand the molecular details of the interaction, a 100-ns molecular dynamics simulation was performed starting from the predicted docking model. The complex appeared stable over the simulated time window, with several residues of the protein interacting with luteolin for a considerable fraction of the time. In particular, D766, H862, Y896 and E988 maintained contact with luteolin in more than 90% of the simulated trajectory. Luteolin appears to be stabilized at the catalytic site of PARP-1 by hydrogen bonding with residues G863, E988 and D766, which are also key residues for NAD+ binding (FIG. 22(c)).

Discussion

DELs are synthesized using combinatorial approaches, including split-and-pool synthesis, which is fundamentally an iterative process requiring multiple complex transformations in the presence of DNA. Compounds with highly complex three-dimensional structures, such as natural products, are not typically included in DELs because their synthesis requires more complex chemical transformations. The relative advantages of natural selection over time versus DEL selection based on large library numbers have not been determined. However, the present study provides, for the first time, a method for studying the two systems simultaneously in a single tube. Performing DEL screening under identical conditions allows a deeper understanding of different DELs and their applications. Furthermore, in nDELs, known binders or inhibitors of target proteins can serve as internal controls, which greatly improves validation rates and hit selection in DEL screens. Importantly, natural products may have evolved toward one target and may not be useful against other targets. The use of nDELs overcomes this limitation by exposing the target to a large number of potential ligands.

One particular feature of the protocols described herein is the use of volatile linkers between the DNA and the organic compound. Incomplete chemical synthesis and undesirable byproducts have been shown to affect DEL screening (e.g., excess linker can react with a biological target). Because the linkers are volatile, unreacted linker molecules can be readily removed, allowing multiple reactions to be performed in a single sample without concern for linker modification, so that subsequent analysis is not disturbed. For certain compounds, particularly those containing only C-H bonds, diazomethane labeling is inefficient because the insertion of carbenes into the C-H bonds of many natural products can be problematic. New post-modification methods are needed to extend the nDEL approach. C-H insertion by the nitrene and carbon radical generated from 4-((trimethylsilyl)ethynyl)phenyl sulfamate and sodium 7-azido-1,1-difluoroheptane-1-sulfinate, respectively, shows great promise as a complement. These modifications may also serve as alternative labeling methods to produce additional geometric isomers of compounds having a single functional group.

Although nDELs are at an early stage of development, a limited number of nDELs have shown encouraging potential in hit identification for different classes of targets. The discovery of luteolin as a PARP-1 enzyme inhibitor highlights the utility of nDELs. Luteolin is a natural flavonoid found in many fruits and vegetables, such as carrot, broccoli, onion leaves, parsley, celery, sweet pepper and chrysanthemum. Luteolin is also an active ingredient of many herbs in traditional Chinese medicine, such as flos Lonicerae, flos Chrysanthemi, Herba Herba uncipe, Prunellae Spica, globe artichoke, Perillae Herba, Scutellariae radix, and Herba Violae Japonicae. Traditionally, these herbs have been used in complex formulations as anti-inflammatory agents to relieve cough, resolve phlegm and treat diseases such as cardiovascular disease and hepatitis. Luteolin has been extensively studied for its potent anticancer activity against a variety of cancer cell types. More importantly, it has shown efficacy in reversing the growth of multidrug-resistant (MDR) cancer cells. Luteolin exerts its anticancer activity through apoptosis and cell cycle regulation. A variety of molecular targets have been proposed for luteolin, such as JNK, NF-κB, and IGF-1. However, evidence of direct interaction with a defined binding pocket is still lacking for any of the proposed targets. Moreover, none of the listed targets, alone or in combination, accounts for all pharmacological behaviors of luteolin. Polypharmacology is a common obstacle in the study of natural products and greatly limits the clinical development of these active natural compounds. The identification of PARP-1 as a target of luteolin by nDEL screening demonstrates the potential of nDELs in the polypharmacology analysis of natural products.

Poly(ADP-ribose) polymerase 1 (PARP-1) binds to DNA in response to transient and localized DNA strand breaks in cells caused by a variety of biological processes, including DNA repair, replication, recombination, and gene rearrangement.

PARP-1 is a clinically proven chemotherapeutic target, and its inhibition produces a pattern of regulation similar to that of luteolin in apoptosis, cell cycle arrest, and related processes. PARP-1 may be one of the key targets of luteolin, which has various pharmacological actions. nDELs have the potential to integrate numbers, diversity and information, which may be invaluable in efforts to find treatments and solutions to biomedical problems.

* * *

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The invention illustratively described herein suitably may be practiced in the absence of any element or elements, limitation or limitations which is not specifically disclosed herein. Thus, for example, the terms "comprising," "including," and the like, are to be construed broadly and without limitation. And there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed.

Thus, it should be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification, improvement and variation of the inventions herein disclosed may be resorted to by those skilled in the art, and that such modifications, improvements and variations are considered to be within the scope of this invention. The materials, methods, and examples provided herein are representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention.

The present invention has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also forms part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.

In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.

All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety to the same extent as if each had been individually incorporated by reference. In case of conflict, the present specification, including definitions, will control.

It should be understood that while the disclosure has been described in conjunction with the above-described embodiments, the foregoing description and examples are intended to illustrate, but not limit the scope of the disclosure. Other aspects, advantages, and modifications within the scope of the disclosure will be apparent to those skilled in the art to which the disclosure pertains.
