Method for predicting boiling point of polycyclic aromatic hydrocarbon compound based on molecular energy data

Document No.: 1891664. Published: 2021-11-26.

Reading note: this technology, "Method for predicting boiling point of polycyclic aromatic hydrocarbon compound based on molecular energy data" (一种基于分子能量数据预测多环芳烃化合物沸点的方法), was designed and created by 周晶晶, 刘太昂, 刘太行, 刘振昌, 吴治富, 周央, 朱峰, 刘远, 刘婷婷 and 朱鲁阳 on 2021-09-30. Its main content is as follows. Polycyclic aromatic hydrocarbons are compounds in which two or more benzene rings are joined in fused-ring form, and they are a class of organic pollutants widely present in the environment. The literature reports few data on the physicochemical properties of polycyclic aromatic hydrocarbons, and even the data that are reported are not sufficiently accurate. The main reason is that experiments to measure these properties are complex and difficult. To overcome the shortcomings of traditional methods, this patent uses a first-principles density functional theory method to fully optimize the molecular and electronic structures of a number of polycyclic aromatic hydrocarbons and obtain basic data in the form of quantum chemical energy descriptors. These basic data are then mapped and combined into intermediate data, from which a structure-activity relationship model between the boiling point of polycyclic aromatic hydrocarbons and the quantum chemical energy parameters is built by support vector regression. Finally, the model is used to predict the boiling points of newly collected polycyclic aromatic hydrocarbon samples.

1. A method for predicting the boiling point of a polycyclic aromatic hydrocarbon compound based on molecular energy data, comprising the following steps:

1) retrieving a plurality of polycyclic aromatic hydrocarbon compounds and their corresponding boiling point data from the literature, and obtaining 4 quantum chemical structure energy descriptors by quantum chemical calculation;

2) applying a mapping conversion to the 4 quantum chemical structure energy descriptors;

3) based on the data produced by the mapping conversion equations, establishing a quantitative boiling point prediction model for polycyclic aromatic hydrocarbon compounds by support vector machine regression;

4) collecting a plurality of new polycyclic aromatic hydrocarbon compounds, obtaining their 4 quantum chemical structure energy descriptors by quantum chemical calculation, substituting the 4 parameters into the mapping conversion equations, and substituting the converted data into the support vector machine regression model to predict the boiling points of the new polycyclic aromatic hydrocarbon compounds.

2. The method for predicting the boiling point of a polycyclic aromatic hydrocarbon compound based on molecular energy data according to claim 1, wherein the mapping conversion in step 2) uses the following equations:

Y1=+42.587[EH]-30.122[EL]+22.400[ΔE]-4.197E-3[ET]+8.091

Y2=-44.325[EH]+12.856[EL]-13.697[ΔE]-5.696E-3[ET]-14.333

Y3=+79.749[EH]+42.566[EL]-9.506[ΔE]-1.968E-3[ET]+16.500

Technical Field

The invention relates to the prediction of the boiling point of a polycyclic aromatic hydrocarbon compound, in particular to a method for predicting the boiling point of the polycyclic aromatic hydrocarbon compound based on molecular energy data.

Background

Polycyclic aromatic hydrocarbons are compounds in which two or more benzene rings are joined in fused-ring form, and they are a class of organic pollutants widely present in the environment. They are also the most numerous class of environmental carcinogens: because the hydrogen atoms on the benzene rings can be replaced by different groups to form thousands of different compounds, polycyclic aromatic hydrocarbons account for more than 1/3 of as many as 1000 carcinogens. The literature reports few data on the physicochemical properties of polycyclic aromatic hydrocarbons, and even the data that are reported are not sufficiently accurate. The main reason is that experiments to measure these properties are complex and difficult. Research on quantitative structure-property relationships for the physicochemical properties of polycyclic aromatic hydrocarbons is therefore important for understanding the toxicity of these compounds and for preventing and controlling the environmental pollution they cause.

In recent years, the structure-activity relationships of polycyclic aromatic hydrocarbons have been studied increasingly. Most of these studies build structure-activity relationship models of physicochemical properties with traditional chemometric methods, such as multiple linear regression and partial least squares regression, and use structural parameters calculated by semi-empirical methods as molecular descriptors. Because these studies rely on semi-empirical methods and on many parameters, they introduce a certain amount of noisy data, the accuracy of the results is questionable, and the results are hard to interpret.

Quantum chemical methods, by contrast, describe electrons and their interactions in more detail and more accurately, and they characterize the electronic and geometric features of molecules and intermolecular interactions on a theoretical basis. Quantum chemical parameters also have clear physical meanings, so more and more studies introduce them into structure-activity relationship modeling. The present method uses a first-principles density functional theory method to fully optimize the molecular and electronic structures of 50 polycyclic aromatic hydrocarbons and obtain basic data for 4 quantum chemical energy descriptors. These basic data are then mapped and combined into intermediate data, from which a structure-activity relationship model between the boiling point of polycyclic aromatic hydrocarbons and the quantum chemical energy parameters is built by support vector regression. Finally, the model is used to predict the boiling points of 7 newly collected polycyclic aromatic hydrocarbons.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a method for predicting the boiling point of a polycyclic aromatic hydrocarbon compound based on molecular energy data. The whole process is carried out by computer, so chemical experiments are avoided and time and cost are saved.

The purpose of the invention can be realized by the following technical scheme:

A method for predicting the boiling point of a polycyclic aromatic hydrocarbon compound based on molecular energy data comprises the following steps:

1) retrieving a plurality of polycyclic aromatic hydrocarbon compounds and their corresponding boiling point data from the literature, and obtaining 4 quantum chemical structure energy descriptors by quantum chemical calculation;

2) applying a mapping conversion to the 4 quantum chemical structure energy descriptors;

3) based on the data produced by the mapping conversion equations, establishing a quantitative boiling point prediction model for polycyclic aromatic hydrocarbon compounds by support vector machine regression;

4) collecting a plurality of new polycyclic aromatic hydrocarbon compounds, obtaining their 4 quantum chemical structure energy descriptors by quantum chemical calculation, substituting the 4 parameters into the mapping conversion equations, and substituting the converted data into the support vector machine regression model to predict the boiling points of the new polycyclic aromatic hydrocarbon compounds.

Compared with the prior art, the invention has the following advantages:

1. The whole process of the invention can be carried out by computer, which saves time, avoids experiments and greatly reduces cost.

2. The whole operating procedure of the invention is simple and can be completed by one person after brief training.

3. The whole process of the invention does not relate to experiments and chemicals, does not produce environmental pollution, and accords with the concept of green environmental protection.

Drawings

FIG. 1 is a graph of modeling results for the boiling point of polycyclic aromatic hydrocarbons.

FIG. 2 is a leave-one-out cross-validation result diagram of a polycyclic aromatic hydrocarbon boiling point model.

Detailed Description

The invention is described in detail below with reference to specific embodiments, comprising the following steps:

(1) 50 polycyclic aromatic hydrocarbon compounds and their corresponding boiling point data are retrieved from the literature, and the molecular geometry of each compound is optimized with the Gaussian 03 software package at the DFT B3LYP/6-311G level. Vibrational frequency analysis shows that the resulting stable structures have no imaginary frequencies, that is, each corresponds to a minimum on the potential energy surface. 4 quantum chemical structure energy descriptors are then obtained by quantum chemical calculation, and these descriptors together with the boiling points of the polycyclic aromatic hydrocarbon compounds form the 48 × 4 basic data. Part of the basic data is shown in Table 1.

TABLE 1 Partial basic data

EH: energy of the highest occupied molecular orbital; EL: energy of the lowest unoccupied molecular orbital; ΔE: frontier orbital energy difference; ET: total molecular energy.

Molecule                     Boiling point (K)   EH       EL       ΔE       ET
Acenaphthylene               543.15              -0.223   -0.079   -0.144   -462.19
1-Methylnaphthalene          518.15              -0.217   -0.044   -0.173   -425.31
Acenaphthene                 552.15              -0.21    -0.038   -0.172   -463.42
2-Methylnaphthalene          536.15              -0.214   -0.041   -0.173   -464.64
Anthracene                   613.15              -0.201   -0.07    -0.131   -539.66
Phenanthrene                 611.15              -0.22    -0.046   -0.174   -539.66
2,3,5-Trimethylnaphthalene   558.15              -0.211   -0.038   -0.173   -503.97
Pyrene                       666.15              -0.205   -0.064   -0.141   -615.91
1-Methylphenanthrene         632.15              -0.217   -0.047   -0.17    -578.99
3-Methylphenanthrene         625.15              -0.215   -0.043   -0.172   -578.99
Fluoranthene                 656.15              -0.221   -0.074   -0.147   -615.89
2-Methylphenanthrene         628.15              -0.218   -0.045   -0.173   -578.99
11H-benzo[a]fluorene         676.15              -0.209   -0.052   -0.157   -655.22
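The tabulated values suggest that the frontier orbital energy difference is simply EH minus EL, although the text does not state the definition explicitly. The short Python check below is only a sketch under that assumption; the function name `frontier_gap` is illustrative and the three rows are copied from Table 1.

```python
# Rows copied from Table 1: (name, EH, EL, tabulated frontier difference).
ROWS = [
    ("Acenaphthylene", -0.223, -0.079, -0.144),
    ("Anthracene", -0.201, -0.070, -0.131),
    ("Pyrene", -0.205, -0.064, -0.141),
]


def frontier_gap(e_homo, e_lumo):
    """Return EH - EL, rounded to the three decimals used in Table 1.

    Assumption: the table's frontier difference is defined as EH - EL.
    """
    return round(e_homo - e_lumo, 3)


for name, eh, el, tabulated in ROWS:
    # Print the recomputed value next to the tabulated one for comparison.
    print(name, frontier_gap(eh, el), tabulated)
```

Under this reading the difference is negative for every listed molecule, because the occupied orbital lies below the unoccupied one.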

(2) The 4 quantum chemical structure energy descriptors are mapped and converted to give 48 × 4 intermediate data. The mapping conversion equations are as follows:

Y1=+42.587[EH]-30.122[EL]+22.400[ΔE]-4.197E-3[ET]+8.091

Y2=-44.325[EH]+12.856[EL]-13.697[ΔE]-5.696E-3[ET]-14.333

Y3=+79.749[EH]+42.566[EL]-9.506[ΔE]-1.968E-3[ET]+16.500

Part of the intermediate data is shown in Table 2.

TABLE 2 partial intermediate data

Y1 Y2 Y3
-0.3117 -0.8588 -2.3684
-1.9147 -0.4875 -0.1971
-1.615 -0.5173 0.682
-1.7122 -0.3579 0.2472
0.9705 -1.4551 -0.2021
-1.5247 0.2848 -0.2869
-1.5098 -0.2283 0.6916
0.7155 -0.6294 -0.0205
-1.1121 0.3081 -0.0509
-1.1922 0.2983 0.2979
0.2008 0.0334 -1.6651
-1.2821 0.4193 -0.017
-0.0097 0.1453 0.4007
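The mapping conversion is a set of three linear combinations of the four descriptors, so it can be reproduced in a few lines. The Python sketch below copies the coefficients from the equations for Y1, Y2 and Y3; the names `COEFFS` and `map_descriptors` are illustrative and not part of the original method description.

```python
# Coefficients copied from the mapping equations, ordered as
# (EH, EL, frontier difference, ET, constant term).
COEFFS = {
    "Y1": (42.587, -30.122, 22.400, -4.197e-3, 8.091),
    "Y2": (-44.325, 12.856, -13.697, -5.696e-3, -14.333),
    "Y3": (79.749, 42.566, -9.506, -1.968e-3, 16.500),
}


def map_descriptors(eh, el, de, et):
    """Apply the three linear mapping equations to one molecule."""
    return {
        name: a * eh + b * el + c * de + d * et + k
        for name, (a, b, c, d, k) in COEFFS.items()
    }


# Acenaphthylene, first row of Table 1.
mapped = map_descriptors(-0.223, -0.079, -0.144, -462.19)
print({name: round(value, 4) for name, value in mapped.items()})
```

For acenaphthylene the result agrees with the first row of Table 2 to within about 0.001. The small residual is presumably due to the rounded descriptor values printed in Table 1.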

(3) Based on the data produced by the mapping conversion equations, a quantitative boiling point prediction model for polycyclic aromatic hydrocarbon compounds is established by support vector machine regression. A radial basis kernel function is selected in the support vector machine regression algorithm, with a penalty factor of 80 and an insensitive loss function parameter (ε) of 0.015.
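To make the three settings concrete, the Python sketch below writes out the two ingredients they control in standard support vector regression: the radial basis kernel and the ε-insensitive loss. It is not a full regression solver and not the authors' implementation. The kernel width `gamma` is not given in the text, so the value 1.0 used here is purely an assumption for illustration.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial basis kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y_true, y_pred, epsilon=0.015):
    """Zero inside the epsilon tube, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Penalty factor from the text: it weights the summed loss against
# the smoothness of the fitted function in the SVR objective.
C = 80.0

# Two rows of Table 2 as example inputs; gamma = 1.0 is an assumed value.
x1 = (-0.3117, -0.8588, -2.3684)
x2 = (-1.9147, -0.4875, -0.1971)
print(rbf_kernel(x1, x1, gamma=1.0))     # identical points give 1.0
print(rbf_kernel(x1, x2, gamma=1.0))     # distant points give a value near 0
print(eps_insensitive_loss(1.00, 1.01))  # error inside the tube is ignored
print(eps_insensitive_loss(1.00, 1.05))  # error outside the tube is penalized
```

A larger penalty factor makes the fit follow the training data more closely, while a larger ε tolerates bigger deviations before any penalty applies.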

(4) 7 new polycyclic aromatic hydrocarbon compounds are collected, their 4 quantum chemical structure energy descriptors are obtained by quantum chemical calculation, the 4 parameters are substituted into the mapping conversion equations, and the converted data are substituted into the support vector machine regression model to predict the boiling points of the new polycyclic aromatic hydrocarbon compounds.

Example 1: Based on the 4 quantum chemical structure energy descriptors of 50 polycyclic aromatic hydrocarbons, a support vector machine regression structure-activity relationship model for predicting the boiling point of polycyclic aromatic hydrocarbon compounds is established. The modeling result is shown in FIG. 1.

Regression modeling is carried out on the 50 polycyclic aromatic hydrocarbon samples with the support vector machine regression algorithm to establish a quantitative prediction model of the boiling point. The correlation coefficient between the model's predicted values and the literature values is 0.99.

Example 2: Based on the 4 quantum chemical structure energy descriptors of 50 polycyclic aromatic hydrocarbons, a support vector machine regression structure-activity relationship model for predicting the boiling point of polycyclic aromatic hydrocarbon compounds is established. The leave-one-out internal cross-validation result of the model is shown in FIG. 2.

Regression modeling is carried out on the 50 polycyclic aromatic hydrocarbon samples with the support vector machine regression algorithm to establish a quantitative prediction model of the boiling point. In the leave-one-out internal cross-validation of the model, the correlation coefficient between the predicted values and the literature values is 0.98.
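The validation procedure can be sketched in Python as follows: each sample is held out in turn, the model is built on the rest, and the held-out predictions are compared with the literature values through the Pearson correlation coefficient. This is only an illustration. A simple nearest-neighbor predictor stands in for the support vector machine regression model, and the data are the 13 rows of Tables 1 and 2, assuming the rows of the two tables correspond in order. The result is therefore not expected to reproduce the reported 0.98.

```python
import math

# (Y1, Y2, Y3) from Table 2, assumed to match the rows of Table 1 in order.
X = [
    (-0.3117, -0.8588, -2.3684), (-1.9147, -0.4875, -0.1971),
    (-1.6150, -0.5173, 0.6820), (-1.7122, -0.3579, 0.2472),
    (0.9705, -1.4551, -0.2021), (-1.5247, 0.2848, -0.2869),
    (-1.5098, -0.2283, 0.6916), (0.7155, -0.6294, -0.0205),
    (-1.1121, 0.3081, -0.0509), (-1.1922, 0.2983, 0.2979),
    (0.2008, 0.0334, -1.6651), (-1.2821, 0.4193, -0.0170),
    (-0.0097, 0.1453, 0.4007),
]
# Boiling points from Table 1.
y = [543.15, 518.15, 552.15, 536.15, 613.15, 611.15, 558.15,
     666.15, 632.15, 625.15, 656.15, 628.15, 676.15]


def nearest_neighbor_predict(train_X, train_y, query):
    """Stand-in regressor: return the target of the closest training point."""
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i]))
    return train_y[best]


def leave_one_out(X, y, predict):
    """Predict every sample from a model built on all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(train_X, train_y, X[i]))
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


loo_predictions = leave_one_out(X, y, nearest_neighbor_predict)
print(round(pearson_r(y, loo_predictions), 3))
```

Replacing `nearest_neighbor_predict` with a function that fits a support vector regression model on the training fold would give the procedure described in Example 2.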

Example 3: The 7 newly collected polycyclic aromatic hydrocarbon compounds, their 4 quantum chemical structure energy descriptors, the mapping conversion data and the boiling point prediction results are shown in Table 3.

TABLE 3 Prediction results

Molecule                       EH       EL       ΔE       ET        Y1         Y2         Y3         Boiling point (predicted, K)
2-Methylpyrene                 -0.203   -0.062   -0.141   -655.24   0.905568   -0.4682    0.301476   692.35
4-Methylpyrene                 -0.203   -0.062   -0.141   -655.24   0.905568   -0.4682    0.301476   692.55
Cyclopenta[cd]pyrene           -0.208   -0.087   -0.121   -692.12   2.048481   -0.63185   -1.27897   694.36
Indeno[1,2,3-cd]fluoranthene   -0.212   -0.092   -0.12    -845.78   2.696115   0.342725   -1.51794   764.71
Dibenzo[a,e]pyrene             -0.2     -0.07    -0.13    -923.26   2.645691   0.67195    0.62301    850.66
Dibenzo[a,i]pyrene             -0.194   -0.075   -0.119   -923.26   3.298222   0.191049   0.784106   843.40
Dibenzo[b,def]chrysene         -0.189   -0.083   -0.106   -923.26   4.043333   -0.31149   0.718747   831.34
