DNA-encoded compound soluble in organic solvents and intermediate compound thereof

Document No.: 1165523 | Publication date: 2020-09-18

Reading note: this technology, "DNA-encoded compound soluble in organic solvents and intermediate compound thereof" (一种可溶于有机溶剂的DNA编码化合物及其中间体化合物), was designed by 李进 (Li Jin), 刘观赛 (Liu Guansai), 罗华东 (Luo Huadong), 谭平 (Tan Ping) and 万金桥 (Wan Jinqiao) on 2020-03-11. Its main content is as follows: The invention provides a DNA-encoded compound soluble in organic solvents and an intermediate compound thereof, having the general formula shown in formula I, wherein X is an atom or a molecular skeleton; A1 is a moiety comprising a linking group and an oligonucleotide; A2 is a moiety comprising a linking group and an oligonucleotide; M1 is a functional moiety comprising one or more structural units; and M2 comprises one or more moieties that can increase solubility in organic solvents. The DNA-encoded compound carries polyethylene glycol as a branch, so the polyethylene glycol increases the solubility of the compound in organic solvents during the earlier chemical reactions and can be cleaved off for the later biological screening, which improves the solubility of the compound in water and favors its binding to the biological target. The invention improves the solubility of DNA-encoded compounds and intermediates in organic solvents and expands the reaction types suitable for synthesizing DNA-encoded compound libraries.

1. A DNA-encoding compound or intermediate compound characterized by: it has the general formula shown in formula I:

wherein X is an atom or molecular skeleton;

A1 is a moiety comprising a linking group and an oligonucleotide;

A2 is a moiety comprising a linking group and an oligonucleotide;

M1 is a functional moiety comprising one or more structural units;

M2 comprises one or more moieties that can increase solubility in organic solvents.

2. The DNA-encoding compound or intermediate compound of claim 1, wherein: X is a carbon atom or a polyatomic molecular skeleton.

3. The DNA-encoding compound or intermediate compound of claim 2, wherein: the polyatomic skeleton has a cyclic or acyclic skeleton structure.

4. The DNA-encoding compound or intermediate compound of claim 3, wherein: the acyclic skeleton structure is [structural formula not reproduced], wherein X1, X2, X3 and X4 are each independently selected from carbon, oxygen, nitrogen and sulfur.

5. The DNA-encoding compound or intermediate compound of claim 3, wherein: the acyclic skeleton structure is [structural formula not reproduced], wherein R1 and R2 are each independently selected from trivalent atoms or groups, and T is a polyatomic linking chain.

6. The DNA-encoding compound or intermediate compound of claim 5, wherein: R1 is selected from [structural formula not reproduced]; T is selected from difunctional alkylene groups or difunctional oligoglycol chain groups; and R2 is selected from

Figure FDA0002407481100000015

7. The DNA-encoding compound or intermediate compound of claim 6, wherein: the functional group is selected from carboxyl, amino or aldehyde group.

8. The DNA-encoding compound or intermediate compound of any one of claims 4 to 7, wherein: the acyclic skeleton structure is specifically selected from

Figure FDA0002407481100000022

9. The DNA-encoding compound or intermediate compound of claim 1, wherein: the moiety of M2 that can increase solubility in organic solvents can be cleaved by physical, chemical or biological means.

10. The DNA-encoding compound or intermediate compound of claim 1, wherein: the general formula of formula I has the structure shown in formula II:

Figure FDA0002407481100000023

wherein Z1 is an oligonucleotide linked to L1 at its 3' end and Z2 is an oligonucleotide linked to L2 at its 5' end, or Z1 is an oligonucleotide linked to L1 at its 5' end and Z2 is an oligonucleotide linked to L2 at its 3' end;

L1 is a linking group comprising a functional group capable of forming a bond with the 3' end or the 5' end of Z1;

L2 is a linking group comprising a functional group capable of forming a bond with the 5' end or the 3' end of Z2;

Y1 is a functional moiety;

Y2 is a moiety that can increase solubility in organic solvents;

S1 and S2 are each independently a linking group.

11. The DNA-encoding compound or intermediate compound of claim 10, wherein: Z1 and Z2 are complementary and form a double strand.

12. The DNA-encoding compound or intermediate compound of claim 10, wherein: Z1 and Z2 may be the same or different in length.

13. The DNA-encoding compound or intermediate compound of claim 11 or 12, wherein: Z1 and Z2 each have a length of 10 or more bases and have a complementary region of 10 or more base pairs.

14. The DNA-encoding compound or intermediate compound of claim 10, wherein: Z1 and Z2 both contain PCR primer sequences.

15. The DNA-encoding compound or intermediate compound of claim 10, wherein: L1 and L2 are each independently selected from difunctional alkylene groups or difunctional oligoglycol chain groups.

16. The DNA-encoding compound or intermediate compound of claim 15, wherein: the functional groups in L1 and L2 are selected from phosphate, amino, hydroxyl and carboxyl groups.

17. The DNA-encoding compound or intermediate compound of claim 16, wherein: L1 and L2 are each independently selected from

Figure FDA0002407481100000031

18. The DNA-encoding compound or intermediate compound of claim 10, wherein: S1 and S2 are each independently selected from difunctional alkylene groups or difunctional oligoglycol chain groups.

19. The DNA-encoding compound or intermediate compound of claim 18, wherein: the functional groups in S1 and S2 are selected from carboxyl, amino, halogen or aldehyde groups.

20. The DNA-encoding compound or intermediate compound of claim 19, wherein: S1 and S2 are selected from [structural formula not reproduced], wherein m is an integer of 1 to 10.

21. The DNA-encoding compound or intermediate compound of claim 10, wherein: Y2 contains a group that can increase solubility in organic solvents.

22. The DNA-encoding compound or intermediate compound of claim 21, wherein: Y2 comprises a polyethylene glycol group.

23. The DNA-encoding compound or intermediate compound of claim 22, wherein: the degree of polymerization of the polyethylene glycol group in Y2 is 10 to 2000.

24. The DNA-encoding compound or intermediate compound of claim 10, wherein: S2 can be cleaved by physical, chemical or biological means.

25. The DNA-encoding compound or intermediate compound of claim 1, wherein: the general formula of formula I has the structure shown in formula III:

wherein

Figure FDA0002407481100000035

L1 and L2 are each independently a linking group;

R1 is a linking group;

n is 10 to 2000;

R is a functional moiety.

26. The DNA-encoding compound or intermediate compound of claim 1, wherein: the general formula of formula I has the structure shown in formula IV:

Figure FDA0002407481100000041

wherein

is double-stranded DNA of 10-200 bp;

L1 and L2 are each independently a linking group;

M is a group that can be cleaved by physical, chemical, or biological means;

R1 is a linking group;

n is 10 to 2000;

R is a functional moiety.

27. The DNA-encoding compound or intermediate compound of claim 26, wherein: when M is cleaved by physical, chemical or biological means, it does not affect other moieties in the compound.

28. A library of DNA-encoding compounds characterized by: it consists of the DNA-encoding compounds of any one of claims 1 to 27.

29. The library of DNA encoding compounds of claim 28, wherein: the library of DNA encoding compounds comprises at least 10^6 different DNA-encoding compounds.

30. The library of DNA encoding compounds of claim 28, wherein: the library of DNA encoding compounds comprises at least 10^8 different DNA-encoding compounds.

31. The library of DNA encoding compounds of claim 28, wherein: the library of DNA encoding compounds comprises at least 10^10 different DNA-encoding compounds.

Technical Field

The invention relates to a DNA coding compound soluble in organic solvent and an intermediate compound thereof.

Background

In drug development, and especially in the development of new drugs, high-throughput screening against biological targets is one of the main ways to obtain lead compounds quickly. However, traditional high-throughput screening based on individual molecules takes a long time and requires a large investment in equipment; the number of compounds in a library is limited (millions), and building a compound library takes decades of accumulation. These factors limit the efficiency and likelihood of discovering lead compounds. DNA-encoded compound library synthesis technology, which has emerged in recent years, combines combinatorial chemistry with molecular biology techniques and attaches a DNA tag to each compound at the molecular level, so that libraries of up to billions of compounds can be synthesized in a very short time. Moreover, the compounds can be identified by gene sequencing, which greatly increases the size and synthesis efficiency of compound libraries. The technology has become the trend for the next generation of compound library screening and has begun to be widely used in the pharmaceutical industry abroad, with many positive results (Accounts of Chemical Research,2014,47, 1247-.

Traditional DNA-encoded compound libraries have been used to screen for lead compounds, but they have limitations. The DNA used for encoding is generally suited to aqueous-phase reactions, and DNA-encoded compounds and their intermediates dissolve poorly in pure organic solvents, so they cannot be used in water-sensitive chemical reactions run in a pure organic phase. This limits the types of synthetic reactions available for DNA-encoded compound libraries and reduces the molecular diversity of those libraries.

Therefore, a new kind of DNA-encoded compound library is needed that improves the solubility of DNA-encoded compounds and intermediates in organic solvents and expands the reaction types applicable to library synthesis. Patent CN107916456A discloses a method for improving the solubility of DNA-encoded compounds under organic conditions by inserting polyethylene glycol between the small-molecule compound and the nucleic acid; this method can greatly improve the solubility of DNA-encoded compounds under organic conditions.

However, the ultimate purpose of a DNA-encoded compound is to bind a biological target in aqueous solution for subsequent biological screening. If the solubility of a DNA-encoded compound under organic conditions is too high, its solubility in water will be too low, or it will be insoluble, which may affect the subsequent biological screening. A DNA-encoded compound that is highly soluble in organic solvents during the earlier chemical reactions and highly soluble in water during the later biological screening would better meet practical needs. In patent CN107916456A, however, polyethylene glycol is introduced as the link between the compound and the nucleic acid before the DNA-encoded compound is constructed; cleaving the polyethylene glycol would also sever the compound from its DNA code, so the solubility of the DNA-encoded compound in water cannot be improved by cleaving the polyethylene glycol at the biological screening stage.

Research on DNA-encoded compounds that are highly soluble in organic solvents during the earlier chemical reactions and highly soluble in water during the later biological screening can broaden the use of DNA-encoded compounds and is of great significance to their development.

Disclosure of Invention

In order to solve the above problems, the present invention provides a DNA encoding compound soluble in an organic solvent and an intermediate compound thereof.

The invention provides a DNA coding compound or an intermediate compound, which has a general formula shown in a formula I:

Figure BDA0002407481110000021

wherein X is an atom or molecular skeleton;

A1 is a moiety comprising a linking group and an oligonucleotide;

A2 is a moiety comprising a linking group and an oligonucleotide;

M1 is a functional moiety comprising one or more structural units;

M2 comprises one or more moieties that can increase solubility in organic solvents.

Further, X is a carbon atom or a polyatomic molecular skeleton.

Further, the polyatomic skeleton has a cyclic or acyclic skeleton structure.

Further, the acyclic skeleton structure is [structural formula not reproduced], wherein X1, X2, X3 and X4 are each independently selected from carbon, oxygen, nitrogen and sulfur.

Further, the acyclic skeleton structure is

Figure BDA0002407481110000023

wherein R1 and R2 are each independently selected from trivalent atoms or groups, and T is a polyatomic linking chain.

Further, R1 is selected from

Figure BDA0002407481110000024

T is selected from difunctional alkylene groups or difunctional oligoglycol chain groups, and R2 is selected from

Figure BDA0002407481110000031

Further, the functional group is selected from a carboxyl group, an amino group or an aldehyde group.

Further, the acyclic skeleton structure is specifically selected from

Further, the moiety of M2 that can increase solubility in organic solvents can be cleaved by physical, chemical, or biological methods.

Further, the general formula of formula I has the structure shown in formula II:

Figure BDA0002407481110000034

wherein Z1 is an oligonucleotide linked to L1 at its 3' end and Z2 is an oligonucleotide linked to L2 at its 5' end, or Z1 is an oligonucleotide linked to L1 at its 5' end and Z2 is an oligonucleotide linked to L2 at its 3' end;

L1 is a linking group comprising a functional group capable of forming a bond with the 3' end or the 5' end of Z1;

L2 is a linking group comprising a functional group capable of forming a bond with the 5' end or the 3' end of Z2;

Y1 is a functional moiety;

Y2 is a moiety that can increase solubility in organic solvents;

S1 and S2 are each independently a linking group.

Further, Z1 and Z2 are complementary and form a double strand.

Further, Z1 and Z2 may be the same or different in length.

Further, Z1 and Z2 each have a length of 10 or more bases and have a complementary region of 10 or more base pairs.

Further, Z1 and Z2 both contain PCR primer sequences.
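The length and complementarity conditions on Z1 and Z2 can be checked mechanically. The following is a minimal Python sketch, not part of the invention: the function names and the example sequences are hypothetical, both strands are assumed to be written 5' to 3', and only perfect Watson-Crick pairing over a contiguous stretch is counted.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    comp = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def longest_complementary_region(z1: str, z2: str) -> int:
    """Length of the longest contiguous stretch of z1 that can pair
    antiparallel with a stretch of z2 (perfect matches only)."""
    z1 = z1.upper()
    target = reverse_complement(z2)
    best = 0
    # Longest-common-substring dynamic programme between z1 and the
    # reverse complement of z2.
    prev = [0] * (len(target) + 1)
    for i in range(1, len(z1) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if z1[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_length_criteria(z1: str, z2: str, min_len: int = 10) -> bool:
    """True if both strands have at least min_len bases and share a
    complementary region of at least min_len base pairs."""
    return (
        len(z1) >= min_len
        and len(z2) >= min_len
        and longest_complementary_region(z1, z2) >= min_len
    )


if __name__ == "__main__":
    z1 = "ACGTTGCAAGGCTA"                  # hypothetical example strand
    z2 = reverse_complement(z1) + "TT"     # complementary strand with an overhang
    print(longest_complementary_region(z1, z2), meets_length_criteria(z1, z2))
```

The strands need not be equal in length: an overhang on either strand does not shorten the paired region that the check measures.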

Further, L1 and L2 are each independently selected from difunctional alkylene groups or difunctional oligoglycol chain groups.

Further, the functional groups in L1 and L2 are selected from phosphate, amino, hydroxyl and carboxyl groups.

Further, L1 and L2 are each independently selected from

Figure BDA0002407481110000035

wherein n is an integer of 1 to 10.

Further, S1 and S2 are each independently selected from difunctional alkylene groups or difunctional oligoglycol chain groups.

Further, the functional groups in S1 and S2 are selected from carboxyl, amino, halogen or aldehyde groups.

Further, S1 and S2 are selected from

Figure BDA0002407481110000042

wherein m is an integer of 1 to 10.

Further, Y2 contains a group that can increase solubility in organic solvents.

Further, Y2 comprises a polyethylene glycol group.

Further, the degree of polymerization of the polyethylene glycol group in Y2 is 10 to 2000.
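To give a sense of scale for this range, the degree of polymerization can be converted to an approximate molar mass. The sketch below is illustrative only: it assumes an unmodified polyethylene glycol diol, H-(OCH2CH2)n-OH, whereas the end groups of the chain actually attached in the compound depend on the linker chemistry and will differ. The function name is hypothetical.

```python
ETHYLENE_OXIDE_UNIT = 44.053  # g/mol, C2H4O repeat unit
END_GROUPS = 18.015           # g/mol, H and OH ends of a free diol (assumption)


def peg_molar_mass(n: int) -> float:
    """Approximate molar mass (g/mol) of H-(OCH2CH2)n-OH."""
    if n < 1:
        raise ValueError("degree of polymerization must be at least 1")
    return n * ETHYLENE_OXIDE_UNIT + END_GROUPS


if __name__ == "__main__":
    for n in (10, 2000):
        print(n, round(peg_molar_mass(n), 1))
```

Under these assumptions the stated range of 10 to 2000 repeat units corresponds to roughly 0.46 kg/mol up to about 88 kg/mol.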

Further, S2 can be cleaved by physical, chemical or biological means.

Further, the general formula of formula I has the structure shown in formula III:

wherein

Figure BDA0002407481110000045

is double-stranded DNA of 10-200 bp;

L1 and L2 are each independently a linking group;

R1 is a linking group;

n is 10 to 2000;

R is a functional moiety.

Further, the general formula of formula I has the structure shown in formula IV:

wherein

is double-stranded DNA of 10-200 bp;

L1 and L2 are each independently a linking group;

M is a group that can be cleaved by physical, chemical, or biological means;

R1 is a linking group;

n is 10 to 2000;

R is a functional moiety.

Further, when M is cleaved by physical, chemical or biological means, it does not affect other moieties in the compound.

The invention also provides a library of DNA-encoding compounds, which consists of the DNA-encoding compounds described above.

Further, the library of DNA encoding compounds comprises at least 10^6 different DNA-encoding compounds.

Further, the library of DNA encoding compounds comprises at least 10^8 different DNA-encoding compounds.

Further, the library of DNA encoding compounds comprises at least 10^10 different DNA-encoding compounds.
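Library sizes of this order follow from combinatorial arithmetic. The following Python sketch assumes a combinatorial synthesis in which every building block of one cycle is combined with every product of the previous cycle; that scheme, the function names and the building-block counts are illustrative assumptions and are not taken from this document.

```python
from math import prod
from typing import Sequence


def library_size(blocks_per_cycle: Sequence[int]) -> int:
    """Number of distinct compounds when each cycle's building blocks are
    combined with every product of the previous cycle."""
    return prod(blocks_per_cycle)


def cycles_needed(blocks_per_cycle: int, target: int) -> int:
    """Smallest number of cycles, each using the same number of building
    blocks, whose library size reaches the target."""
    if blocks_per_cycle < 2:
        raise ValueError("need at least 2 building blocks per cycle")
    size, cycles = 1, 0
    while size < target:
        size *= blocks_per_cycle
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(library_size([100, 100, 100]))   # three cycles of 100 building blocks
    print(cycles_needed(100, 10**8))
```

Because the size grows as a product, adding one more cycle multiplies the library by the number of building blocks in that cycle, which is why each extra reaction type that tolerates the DNA tag matters for diversity.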

In the compounds of formula I or other compounds of the invention, the oligonucleotide may be any oligonucleotide suitable for use in a DNA encoding compound.

The DNA compound of the present invention, modified with a polymer (e.g., polyethylene glycol, polyurethane, etc.), can be dissolved in a nonaqueous solvent, and on-DNA reactions can be carried out in that organic solvent.

In the present invention, a polyethylene glycol chain is introduced when the DNA-encoded compound is synthesized, and it is introduced as a branch, which makes the polyethylene glycol chain more flexible within the compound. The chain increases the solubility of the DNA-encoded compound in organic solvents during the earlier chemical reactions and can be cleaved off for the later biological screening without affecting the rest of the compound; this improves the solubility of the DNA-encoded compound in water and favors its binding to the biological target.

The invention improves the solubility of DNA-encoded compounds and intermediates in organic solvents and expands the reaction types suitable for synthesizing DNA-encoded compound libraries.

Obviously, many modifications, substitutions and variations can be made in light of the above teachings without departing from the basic technical spirit of the invention as defined by the claims.

The present invention is described in further detail below with reference to examples. This should not be understood as limiting the scope of the above subject matter of the invention to the following examples; all techniques realized on the basis of the above content fall within the scope of the invention.

Drawings

FIG. 1 is the LC-MS spectrum of compound 1.

FIG. 2 is the MS spectrum of compound 1.

FIG. 3 is the HPLC chromatogram of compound 1.

FIG. 4 is the LC-MS spectrum of compound 2-1.

FIG. 5 is the MS spectrum of compound 2-1.

FIG. 6 is the HPLC chromatogram of compound 2-1.

FIG. 7 is the LC-MS spectrum of compound 2-2.

FIG. 8 is the MS spectrum of compound 2-2.

FIG. 9 is the HPLC chromatogram of compound 2-2.

FIG. 10 is the LC-MS spectrum of compound 3.

FIG. 11 is the MS spectrum of compound 3.

FIG. 12 is the LC-MS spectrum of compound 4-1.

FIG. 13 is the MS spectrum of compound 4-1.

FIG. 14 is the HPLC chromatogram of compound 4-1.

FIG. 15 is the LC-MS spectrum of compound 4-2.

FIG. 16 is the MS spectrum of compound 4-2.

FIG. 17 is the HPLC chromatogram of compound 4-2.

FIG. 18 is the LC-MS spectrum of compound 6.

FIG. 19 is the MS spectrum of compound 6.

FIG. 20 is the HPLC chromatogram of compound 6.

FIG. 21 is the LC-MS spectrum of compound 8.

FIG. 22 is the MS spectrum of compound 8.

FIG. 23 is the HPLC chromatogram of compound 8.

FIG. 24 is the LC-MS spectrum of compound 9.

FIG. 25 is the MS spectrum of compound 9.

FIG. 26 is the HPLC chromatogram of compound 9.

FIG. 27 is the LC-MS spectrum of compound 11.

FIG. 28 is the MS spectrum of compound 11.

FIG. 29 is the HPLC chromatogram of compound 11.

FIG. 30 is the LC-MS spectrum of compound 12.

FIG. 31 is the MS spectrum of compound 12.

FIG. 32 is the HPLC chromatogram of compound 12.

FIG. 33 shows the UV-Vis absorption spectra of compound 1 in different solvents.

FIG. 34 shows the UV-Vis absorption spectra of compound 2-1 in different solvents.

FIG. 35 shows the UV-Vis absorption spectra of compound 2-2 in different solvents.

FIG. 36 shows the UV-Vis absorption spectra of compound 3 in different solvents.

FIG. 37 shows the UV-Vis absorption spectra of compound 4-1 in different solvents.

FIG. 38 shows the UV-Vis absorption spectra of compound 4-2 in different solvents.

FIG. 39 is the LC-MS spectrum of the photocleavage reaction at 0 min.

FIG. 40 is the LC-MS spectrum of the photocleavage reaction at 40 min.

FIG. 41 is the LC-MS spectrum of the photocleavage reaction at 120 min.

Detailed Description

The raw materials and equipment used in the embodiments of the present invention are known, commercially available products.

1. Abbreviations: Fmoc represents fluorenylmethyloxycarbonyl; DMF represents N,N-dimethylformamide; DMSO represents dimethyl sulfoxide; DMT-MM represents 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride; TEAA represents triethylammonium acetate; DIPEA represents N,N-diisopropylethylamine; Boc represents tert-butyloxycarbonyl; DMA represents N,N-dimethylacetamide; HATU represents 2-(7-azabenzotriazol-1-yl)-N,N,N',N'-tetramethyluronium hexafluorophosphate.

2. The following DNA linker sequences (HUB):

is abbreviated as

Wherein A is adenine, T is thymine, C is cytosine, and G is guanine.

3. Work-up of chemical reactions or enzymatic ligation reactions carried out on DNA:

Operation mode 1: To the reaction solution of the on-DNA chemical reaction or enzymatic DNA ligation reaction, 10% by volume of a 5 M aqueous NaCl solution and 2 to 3 volumes of ethanol are added. After vortexing, the solution is frozen in a freezer at -20 °C for 1 hour. The frozen solution is centrifuged at high speed to pellet the DNA, and the supernatant is removed to obtain the DNA sample.

Operation mode 2: The reaction solution of the on-DNA chemical reaction or enzymatic DNA ligation reaction is concentrated and then purified by HPLC to obtain the DNA sample.
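The volumes for the first operation mode (ethanol precipitation) follow directly from the reaction volume. The Python sketch below is a hedged illustration: it assumes that both the 10% and the 2 to 3 volumes are taken relative to the original reaction volume and that volumes are additive, which is only approximately true for ethanol-water mixtures. The function name and the default of 2.5 volumes are hypothetical.

```python
def precipitation_volumes(reaction_ul: float, ethanol_fold: float = 2.5) -> dict:
    """Volumes (in microlitres) for the NaCl/ethanol precipitation step.

    Assumes additions are relative to the reaction volume and that
    volumes add linearly (an approximation).
    """
    if reaction_ul <= 0:
        raise ValueError("reaction volume must be positive")
    if not 2.0 <= ethanol_fold <= 3.0:
        raise ValueError("ethanol should be 2 to 3 volumes")
    nacl_ul = 0.10 * reaction_ul              # 10% by volume of 5 M NaCl
    ethanol_ul = ethanol_fold * reaction_ul   # 2 to 3 volumes of ethanol
    total_ul = reaction_ul + nacl_ul + ethanol_ul
    return {
        "nacl_ul": nacl_ul,
        "ethanol_ul": ethanol_ul,
        "total_ul": total_ul,
        # Final NaCl concentration from the 5 M stock after dilution
        "nacl_final_molar": 5.0 * nacl_ul / total_ul,
    }


if __name__ == "__main__":
    print(precipitation_volumes(100.0, ethanol_fold=3.0))
```

The returned total volume is mainly useful for choosing a tube that fits the mixture before freezing and centrifugation.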
