Method of generating chemical structure, neural network device, and non-transitory computer-readable recording medium
Reading note: This technology, "Method of generating chemical structure, neural network device, and non-transitory computer-readable recording medium", was designed and created by 金勍德, 权宁千, 金美淑, 庾志镐, and 崔伦硕 on 2019-02-26. Its main content is as follows: The present invention relates to a method of generating a chemical structure, a neural network device, and a non-transitory computer-readable recording medium. A new chemical structure is generated, by using a neural network, from an expression region that expresses a specific property in a descriptor or image of a reference chemical structure. The new chemical structure may be generated by changing a local structure in the reference chemical structure that corresponds to the expression region.
1. A method of generating a chemical structure by using a neural network device, the method comprising:
inputting a descriptor of a chemical structure to a trained neural network, the trained neural network producing property values for a property of the chemical structure, the descriptor of the chemical structure representing a structural characteristic of the chemical structure and the property of the chemical structure being a characteristic possessed by the chemical structure;
determining an expression region in the descriptor for expressing the property, the expression region comprising a bit position in the descriptor; and
generating a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
2. The method of claim 1, wherein the determining comprises:
determining the expression region in the descriptor for expressing the property by performing, through the trained neural network, an interpretation process that determines whether the property value is expressed by the local structure in the chemical structure.
3. The method of claim 2, wherein the determining comprises:
determining the expression region in the descriptor for expressing the property by applying a layer-wise relevance propagation (LRP) technique to the trained neural network,
wherein an activation function applied to a node of the trained neural network is selected as a linear function to apply the LRP technique to the trained neural network, and a Mean Square Error (MSE) is selected for optimization.
4. The method of claim 1, wherein the generating comprises:
obtaining a bit value of a bit position of the expression region in the descriptor; and
generating the new chemical structure by applying a genetic algorithm to the bit values of the bit positions and modifying the local structure corresponding to the expression region.
5. The method of claim 1, wherein the generating comprises:
generating a new first chemical structure by modifying the local structure in the chemical structure, the local structure corresponding to the expression region;
inputting descriptors for the new first chemical structure to the trained neural network to output property values for a particular property of the new first chemical structure; and
generating a new second chemical structure by changing a local structure in the new first chemical structure when a property value for a specific property of the new first chemical structure is less than a preset value, the local structure corresponding to the expression region, and storing the new first chemical structure when the property value for the specific property of the new first chemical structure is equal to or greater than the preset value.
6. A neural network device configured to generate a chemical structure, the neural network device comprising:
a memory configured to store at least one program; and
a processor configured to implement a neural network by executing the at least one program to control the neural network device, the processor being configured when the at least one program is executed to:
input a descriptor of a chemical structure to a trained neural network, the trained neural network producing property values for a property of the chemical structure, the descriptor of the chemical structure representing a structural characteristic of the chemical structure and the property of the chemical structure being a characteristic possessed by the chemical structure;
determine an expression region in the descriptor for expressing the property, the expression region comprising a bit position in the descriptor; and
generate a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
7. The neural network device of claim 6, wherein, when the at least one program is executed, the processor is further configured to determine the expression region in the descriptor for expressing the property by performing, through the trained neural network, an interpretation process that determines whether the property value is expressed by the local structure in the chemical structure.
8. The neural network device of claim 7, wherein when the at least one program is executed, the processor is further configured to:
determine the expression region in the descriptor for expressing the property by applying a layer-wise relevance propagation (LRP) technique to the trained neural network; and
select a linear function as the activation function applied to a node of the trained neural network in order to apply the LRP technique to the trained neural network, and select a mean square error (MSE) for optimization.
9. The neural network device of claim 6, wherein when the at least one program is executed, the processor is further configured to obtain bit values for bit positions of the expression region in the descriptor and to generate the new chemical structure by applying a genetic algorithm to the bit values for the bit positions and modifying the local structure corresponding to the expression region.
10. The neural network device of claim 6, wherein the processor, when the at least one program is executed, is configured to:
generate a new first chemical structure by modifying the local structure in the chemical structure, the local structure corresponding to the expression region;
input descriptors for the new first chemical structure to the trained neural network to output property values for a specific property of the new first chemical structure; and
generate a new second chemical structure by changing a local structure in the new first chemical structure when the property value for the specific property of the new first chemical structure is less than a preset value, the local structure corresponding to the expression region, and store the new first chemical structure in the memory when the property value for the specific property of the new first chemical structure is equal to or greater than the preset value.
11. A method of generating a chemical structure by using a neural network device, the method comprising:
inputting an image of a chemical structure to a trained neural network, the trained neural network producing property values for properties of the chemical structure, the image of the chemical structure representing structural characteristics of the chemical structure and the properties of the chemical structure being characteristics possessed by the chemical structure;
determining an expression region in the image for expressing the property, the expression region comprising one or more pixels in the image; and
generating a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
12. The method of claim 11, wherein the determining comprises:
determining the expression region in the image for expressing the property by performing, through the trained neural network, an interpretation process that determines whether the property value is expressed by the local structure in the chemical structure.
13. The method of claim 12, wherein the determining comprises:
determining the expression region in the image for expressing the property by applying a layer-wise relevance propagation (LRP) technique to the trained neural network,
wherein an activation function applied to a node of the trained neural network is selected as a linear function to apply the LRP technique to the trained neural network, and a Mean Square Error (MSE) is selected for optimization.
14. The method of claim 11, wherein the generating comprises:
obtaining pixel values for one or more pixels in the expression region in the image; and
generating the new chemical structure by applying Gaussian noise to pixel values of the one or more pixels and modifying the local structure corresponding to the expression region.
15. The method of claim 11, wherein the expression region comprises a plurality of expression regions expressing the property, and the generating comprises:
obtaining coordinate information corresponding to the plurality of expression regions in the image;
calculating center points of the plurality of expression regions in the image based on the coordinate information and obtaining pixel values of the center points; and
generating the new chemical structure by applying Gaussian noise to the pixel values and modifying the local structure corresponding to the center points.
16. The method of claim 11, wherein the generating comprises:
generating a new first chemical structure by modifying the local structure in the chemical structure, the local structure corresponding to the expression region;
inputting an image for the new first chemical structure to the trained neural network to output a property value for a particular property of the new first chemical structure; and
generating a new second chemical structure by changing a local structure in the new first chemical structure when a property value for a specific property of the new first chemical structure is less than a preset value, the local structure corresponding to the expression region, and storing the new first chemical structure when the property value for the specific property of the new first chemical structure is equal to or greater than the preset value.
17. A neural network device configured to generate a chemical structure, the neural network device comprising:
a memory configured to store at least one program; and
a processor configured to implement a neural network by executing the at least one program to control the neural network device, the processor being configured when the at least one program is executed to:
input an image of a chemical structure to a trained neural network, the trained neural network producing property values for properties of the chemical structure, the image of the chemical structure representing structural characteristics of the chemical structure and the properties of the chemical structure being characteristics possessed by the chemical structure;
determine an expression region in the image for expressing the property, the expression region comprising one or more pixels in the image; and
generate a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
18. The neural network device of claim 17, wherein, when the at least one program is executed, the processor is further configured to determine the expression region in the image for expressing the property by performing, through the trained neural network, an interpretation process that determines whether the property value is expressed by the local structure in the chemical structure.
19. The neural network device of claim 18, wherein when the at least one program is executed, the processor is further configured to:
determine the expression region in the image for expressing the property by applying a layer-wise relevance propagation (LRP) technique to the trained neural network; and
select a linear function as the activation function applied to a node of the trained neural network in order to apply the LRP technique to the trained neural network, and select a mean square error (MSE) for optimization.
20. The neural network device of claim 17, wherein, when the at least one program is executed, the processor is further configured to obtain pixel values of one or more pixels in the expression region in the image and to generate the new chemical structure by applying Gaussian noise to the pixel values of the one or more pixels and modifying the local structure corresponding to the expression region.
21. The neural network device of claim 17, wherein the expression region comprises a plurality of expression regions that express the property, and when the at least one program is executed, the processor is further configured to:
obtain coordinate information corresponding to the plurality of expression regions in the image;
calculate center points of the plurality of expression regions in the image based on the coordinate information and obtain pixel values of the center points; and
generate the new chemical structure by applying Gaussian noise to the pixel values and modifying the local structure corresponding to the center points.
22. The neural network device of claim 17, wherein when the at least one program is executed, the processor is further configured to:
generate a new first chemical structure by modifying the local structure in the chemical structure, the local structure corresponding to the expression region;
input an image for the new first chemical structure to the trained neural network to output a property value for a specific property of the new first chemical structure; and
generate a new second chemical structure by changing a local structure in the new first chemical structure when the property value for the specific property of the new first chemical structure is less than a preset value, the local structure corresponding to the expression region, and store the new first chemical structure in the memory when the property value for the specific property of the new first chemical structure is equal to or greater than the preset value.
23. A non-transitory computer-readable recording medium containing a program which, when executed by a computer, carries out the method according to any one of claims 1 to 5 and 11 to 16.
Technical Field
The present disclosure relates to methods and apparatus for generating chemical structures using neural networks.
Background
Neural networks refer to computational architectures that model a biological brain. With the advance of neural network technology, various types of electronic systems analyze input data and generate optimized information by using neural networks.
In recent years, a great deal of research has been conducted on methods of selecting chemical structures to be used in material development by evaluating the properties of the chemical structures using neural network techniques. In particular, there is a need to develop a method of generating a new chemical structure that satisfies various requirements by using neural network technology.
Disclosure of Invention
Embodiments of the present disclosure relate to methods and apparatus for generating chemical structures using neural networks. Further, a computer-readable recording medium including a program that, when executed by a computer, carries out the method is provided. The technical problems to be solved are not limited to those described above, and other technical problems may exist.
Additional aspects will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the embodiments provided.
According to an aspect of an embodiment, there is provided a method of generating a chemical structure by using a neural network device, comprising: inputting a descriptor of a chemical structure to a trained neural network, the trained neural network producing property values for a property of the chemical structure, the descriptor of the chemical structure representing a structural characteristic of the chemical structure and the property of the chemical structure being a characteristic possessed by the chemical structure; determining an expression region in the descriptor for expressing the property, the expression region comprising bit positions in the descriptor; and generating a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
Determining the expression region may comprise determining the expression region in the descriptor for expressing the property by performing, through the trained neural network, an interpretation process that determines whether the property value is expressed by the local structure in the chemical structure.
Determining the expression region may comprise determining the expression region in the descriptor for expressing the property by applying a layer-wise relevance propagation (LRP) technique to the trained neural network, wherein linear functions may be selected as the activation functions applied to nodes of the trained neural network in order to apply the LRP technique to the trained neural network, and a mean square error (MSE) may be selected for optimization.
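As a minimal sketch of how relevance might be propagated back to descriptor bits, the following Python snippet applies a commonly used LRP rule to a small network with linear activations and no biases. The specific propagation rule, the stabilizer `eps`, the weights, the threshold, and all function names are illustrative assumptions and are not taken from the disclosure.

```python
def forward(layers, x):
    """Run a bias-free network with linear activations; return every layer's activations."""
    activations = [list(x)]
    for W in layers:  # W[j][i] is the weight from input i to output j
        prev = activations[-1]
        activations.append([sum(w * a for w, a in zip(row, prev)) for row in W])
    return activations

def lrp(layers, activations, eps=1e-9):
    """Redistribute the output value over the input bits, one layer at a time."""
    relevance = list(activations[-1])  # start from the predicted property value
    for W, prev in zip(reversed(layers), reversed(activations[:-1])):
        new_relevance = [0.0] * len(prev)
        for j, row in enumerate(W):
            z_j = sum(w * a for w, a in zip(row, prev))
            z_j += eps if z_j >= 0 else -eps  # assumed stabilizer against division by zero
            for i, (w, a) in enumerate(zip(row, prev)):
                new_relevance[i] += a * w / z_j * relevance[j]
        relevance = new_relevance
    return relevance

# Hypothetical 4-bit descriptor and a 4-2-1 network.
x = [1, 0, 1, 1]
layers = [
    [[0.5, 0.2, -0.1, 0.3], [0.1, 0.4, 0.6, -0.2]],
    [[1.0, 0.5]],
]
activations = forward(layers, x)
relevance = lrp(layers, activations)

# Bit positions whose relevance exceeds an assumed threshold form the expression region.
expression_region = [i for i, r in enumerate(relevance) if r > 0.5]
```

Because the sketch uses linear activations without biases, the relevance values of the input bits add up to the predicted property value, which makes the per-bit contributions directly comparable.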
Creating a new chemical structure may include: obtaining a bit value for the bit position of the expression region in the descriptor; and generating the new chemical structure by applying a genetic algorithm to the bit values of the bit positions and modifying the local structure corresponding to the expression region.
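The following Python sketch illustrates one way genetic operators could be restricted to the bit positions of the expression region, so that the remainder of the descriptor is preserved. The mutation rate, the uniform crossover, the example descriptor, and the function names are hypothetical, and decoding a modified descriptor back into a chemical structure is not shown.

```python
import random

def mutate_region(descriptor, region, rate, rng):
    """Flip each bit inside the expression region with probability `rate`."""
    child = list(descriptor)
    for pos in region:
        if rng.random() < rate:
            child[pos] ^= 1
    return child

def crossover_region(parent_a, parent_b, region, rng):
    """Pick each expression-region bit from either parent; other bits come from parent_a."""
    child = list(parent_a)
    for pos in region:
        child[pos] = rng.choice((parent_a[pos], parent_b[pos]))
    return child

rng = random.Random(0)
reference = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical descriptor of the reference structure
region = [2, 3, 6]                     # hypothetical bit positions of the expression region
population = [mutate_region(reference, region, 0.5, rng) for _ in range(4)]
child = crossover_region(population[0], population[1], region, rng)
```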
Creating a new chemical structure may include: generating a new first chemical structure by modifying the local structure in the chemical structure, the local structure corresponding to the expression region; inputting descriptors for the new first chemical structure to the trained neural network to output property values for a particular property of the new first chemical structure; and generating a new second chemical structure by changing a local structure in the new first chemical structure when the property value for the specific property of the new first chemical structure is less than a preset value, the local structure corresponding to the expression region, and storing the new first chemical structure when the property value for the specific property of the new first chemical structure is equal to or greater than the preset value.
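The generate-evaluate-repeat procedure above can be sketched as the loop below. The callables `modify` and `predict` stand in for the local-structure modification and the trained neural network, and `max_rounds` is an added safeguard; all of these names and the toy stand-ins are assumptions for illustration only.

```python
def generate_until_threshold(structure, modify, predict, preset_value, max_rounds=100):
    """Keep modifying the candidate until its predicted property reaches the preset value."""
    candidate = modify(structure)          # new first chemical structure
    for _ in range(max_rounds):
        if predict(candidate) >= preset_value:
            return candidate               # this candidate would be stored
        candidate = modify(candidate)      # otherwise derive the next candidate from it
    return None                            # assumed behavior: give up after max_rounds

# Toy stand-ins: a structure is a bit list, and the "property" is the number of set bits.
def toy_modify(bits):
    out = list(bits)
    for pos in (1, 3, 4):                  # hypothetical expression region
        if out[pos] == 0:
            out[pos] = 1
            break
    return out

def toy_predict(bits):
    return float(sum(bits))

stored = generate_until_threshold([1, 0, 0, 0, 0], toy_modify, toy_predict, preset_value=3.0)
```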
According to an aspect of an embodiment, there is provided a neural network device configured to generate a chemical structure, comprising: a memory configured to store at least one program; and a processor configured to drive a neural network by executing the at least one program, wherein the processor is configured to: input a descriptor of a chemical structure to a trained neural network, the trained neural network producing property values for a property of the chemical structure, the descriptor of the chemical structure representing a structural characteristic of the chemical structure and the property of the chemical structure being a characteristic possessed by the chemical structure; determine an expression region in the descriptor for expressing the property, the expression region comprising a bit position in the descriptor; and generate a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
According to an aspect of an embodiment, there is provided a method of generating a chemical structure by using a neural network device, comprising: inputting an image of a chemical structure to a trained neural network, the trained neural network producing property values for properties of the chemical structure, the image of the chemical structure representing structural characteristics of the chemical structure and the properties of the chemical structure being characteristics possessed by the chemical structure; determining an expression region in the image for expressing the property, the expression region comprising one or more pixels in the image; and generating a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
Determining the expression region may comprise determining the expression region in the image for expressing the property by performing, through the trained neural network, an interpretation process that determines whether the property value is expressed by the local structure in the chemical structure.
Determining the expression region may comprise determining the expression region in the image for expressing the property by applying a layer-wise relevance propagation (LRP) technique to the trained neural network, wherein linear functions may be selected as the activation functions applied to nodes of the trained neural network in order to apply the LRP technique to the trained neural network, and a mean square error (MSE) may be selected for optimization.
Creating a new chemical structure may include: obtaining pixel values of the one or more pixels in the expression region in the image; and generating the new chemical structure by applying Gaussian noise to the pixel values of the one or more pixels and modifying the local structure corresponding to the expression region.
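A minimal sketch of the pixel perturbation might look as follows. The image is represented as a list of rows, and the noise scale `sigma`, the clamping to the range 0 to 255, and the function name are assumptions for illustration only.

```python
import random

def perturb_pixels(image, region, sigma, rng):
    """Add zero-mean Gaussian noise to the pixels listed in `region`."""
    noisy = [row[:] for row in image]        # copy so the reference image is kept intact
    for r, c in region:
        value = noisy[r][c] + rng.gauss(0.0, sigma)
        noisy[r][c] = min(255.0, max(0.0, value))  # assumed valid pixel range
    return noisy

rng = random.Random(0)
image = [[100.0] * 4 for _ in range(4)]      # hypothetical 4x4 grayscale image
region = [(1, 1), (1, 2)]                    # hypothetical expression-region pixels
noisy = perturb_pixels(image, region, sigma=10.0, rng=rng)
```

Only the pixels in the expression region change, so the rest of the image, and hence the rest of the chemical structure it depicts, is left as it was.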
Creating a new chemical structure may include: when a plurality of expression regions expressing the property exist in the image, obtaining coordinate information in the image corresponding to the plurality of expression regions; calculating center points of the plurality of expression regions in the image based on the coordinate information and obtaining pixel values of the center points; and generating the new chemical structure by applying Gaussian noise to the pixel values and modifying the local structure corresponding to the center points.
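The center-point variant can be sketched in the same style. Here each expression region is assumed to be given as a list of (row, column) coordinates and its center point is taken to be the rounded centroid of those coordinates; this reading of "center point", the noise scale, and the function names are assumptions.

```python
import random

def center_point(region):
    """Centroid of a region's (row, col) coordinates, rounded to the nearest pixel."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return round(sum(rows) / len(rows)), round(sum(cols) / len(cols))

def perturb_centers(image, regions, sigma, rng):
    """Add Gaussian noise only to the center pixel of each expression region."""
    noisy = [row[:] for row in image]
    centers = [center_point(region) for region in regions]
    for r, c in centers:
        noisy[r][c] += rng.gauss(0.0, sigma)
    return noisy, centers

rng = random.Random(0)
image = [[0.0] * 6 for _ in range(6)]                       # hypothetical 6x6 image
regions = [
    [(r, c) for r in range(3) for c in range(3)],           # 3x3 block at the top left
    [(4, 3), (4, 4), (4, 5)],                               # short horizontal strip
]
noisy, centers = perturb_centers(image, regions, sigma=5.0, rng=rng)
```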
Creating a new chemical structure may include: generating a new first chemical structure by modifying the local structure in the chemical structure, the local structure corresponding to the expression region; inputting an image for the new first chemical structure to the trained neural network to output a property value for a particular property of the new first chemical structure; and generating a new second chemical structure by changing a local structure in the new first chemical structure when the property value for the specific property of the new first chemical structure is less than a preset value, the local structure corresponding to the expression region, and storing the new first chemical structure when the property value for the specific property of the new first chemical structure is equal to or greater than the preset value.
According to an aspect of an embodiment, there is provided a neural network device configured to generate a chemical structure, comprising: a memory configured to store at least one program; and a processor configured to drive a neural network by executing the at least one program, wherein the processor is configured to: input an image of a chemical structure to a trained neural network, the trained neural network producing property values for properties of the chemical structure, the image of the chemical structure representing structural characteristics of the chemical structure and the properties of the chemical structure being characteristics possessed by the chemical structure; determine an expression region in the image for expressing the property, the expression region comprising one or more pixels in the image; and generate a new chemical structure by modifying a local structure in the chemical structure, the local structure corresponding to the expression region.
According to an aspect of an embodiment, there is provided a non-transitory computer-readable recording medium including a program which, when executed by a computer, carries out any of the methods.
Drawings
These and/or other aspects will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a block diagram illustrating a hardware configuration of a neural network device according to an embodiment;
FIG. 2 is a diagram illustrating calculations carried out by a deep neural network (DNN) according to an embodiment;
FIG. 3 is a diagram illustrating calculations carried out by a recurrent neural network (RNN) according to an embodiment;
FIG. 4 is a conceptual diagram illustrating a neural network system for generating chemical structures, according to an embodiment;
FIG. 5 is a diagram illustrating a method of representing a chemical structure, according to an embodiment;
FIG. 6 is a diagram illustrating a method of interpreting a neural network, according to an embodiment;
FIG. 7 is a diagram illustrating an example of changing an expression region of a descriptor to generate a new chemical structure, according to an embodiment;
FIG. 8 is a diagram illustrating an example of changing a local structure by changing a bit value of a descriptor, according to an embodiment;
FIG. 9 is a diagram illustrating an example of changing a local structure by changing a pixel value of an image, according to an embodiment;
FIG. 10 is a diagram illustrating an example of changing a pixel value when there are a plurality of expression regions on an image, according to an embodiment;
FIG. 11 is a flowchart of a method of generating a new chemical structure by changing descriptors for chemical structures in a neural network device, according to an embodiment; and
FIG. 12 is a flowchart of a method of generating a new chemical structure by changing an image for the chemical structure in a neural network device, according to an embodiment.
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In this regard, the present embodiments may have different forms and should not be construed as limited to the descriptions set forth herein. Accordingly, the embodiments are described below, by referring to the drawings, merely to illustrate aspects. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Expressions such as "at least one of," when preceding or following a list of elements, modify the entire list of elements and do not modify the individual elements of the list, such that the expression "at least one of a, b, and c" or similar expressions include: only a, only b, only c, only a and b, only b and c, only a and c, and all of a, b, and c.
The terms "according to some embodiments" or "according to an embodiment" used throughout the specification do not necessarily denote the same embodiment.
Some embodiments of the disclosure may be described in terms of functional block configurations and various processing operations. Some or all of these functional blocks may be implemented using various numbers of hardware and/or software components that perform the specified functions. For example, the functional blocks of the present disclosure may be implemented using one or more microprocessors or circuits executing instructions to perform a given function. Further, for example, the functional blocks of the present disclosure may be implemented in a variety of programming or scripting languages. The functional blocks may be implemented with algorithms that are executed by one or more processors. The present disclosure may also employ conventional techniques for electronic configuration, signal processing, and/or data processing. The terms "mechanism," "element," "unit," and "configuration" may be used in a broad sense and are not limited to mechanical and physical configurations.
Furthermore, the connecting lines or connecting means between the components shown in the figures merely illustrate functional connections and/or physical or electrical connections. In an actual device, the connections between the components may be provided by various functional connections, physical connections, or circuit connections, which may be substituted or added.
Meanwhile, in relation to the terminology used herein, a descriptor, which is data used in a neural network system, refers to an index value describing a structural feature of a chemical structure and can be obtained by performing a relatively simple calculation on a given chemical structure. According to an embodiment, the descriptor may include a molecular structure fingerprint (e.g., a Morgan fingerprint or an extended connectivity fingerprint (ECFP)) indicating whether a specific local structure is included. Further, the descriptor may be a quantitative structure-property relationship (QSPR) model configured with values that can be calculated immediately from a given chemical structure, such as a molecular weight or the number of local structures (e.g., rings) included in a molecular structure.
In addition, a property refers to a characteristic possessed by a chemical structure and may be a real-number value measured through experiments or calculated through simulation. For example, when the chemical structure is used as a display material, the property of the chemical structure may be a transmission wavelength, an emission wavelength, or the like for light. When the chemical structure is used as a battery material, the property of the chemical structure may be a voltage. Unlike descriptors, properties may require complex simulations to calculate, and thus considerably more computation than descriptors.
Further, a structure refers to the atomic-level structure of a chemical structure. In order to derive properties by performing first-principles calculations, structures need to be expressed at the atomic level. Thus, the atomic-level structure needs to be derived to generate a new chemical structure. The structure may be a structural formula based on atomic bonding relationships or a character string in a simple (one-dimensional) format. The format of the character string expressing the structure may be a simplified molecular-input line-entry system (SMILES) code, a SMILES arbitrary target specification (SMARTS) code, an International Chemical Identifier (InChI) code, or the like.
In addition, a factor refers to an element that defines a relationship among a descriptor, a property, and a structure. The factors may be determined by machine learning based on descriptor-property-structural formula data stored in a database. Thus, the relationship among factors, descriptors, properties, and structural formulae can be determined.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
FIG. 1 is a block diagram illustrating a hardware configuration of a neural network device according to an embodiment.
Referring to fig. 1, the
Although not shown in fig. 1, the
Fig. 2 is a diagram illustrating a calculation performed by DNN according to an embodiment.
Referring to FIG. 2, the DNN 20 may have a structure including an input layer, a hidden layer, and an output layer, may perform calculations based on received input data (e.g., $I_1$ and $I_2$), and may generate output data (e.g., $O_1$ and $O_2$) based on the results of the calculations.
For example, as shown in fig. 2, DNN20 may include an input layer (layer 1), two hidden layers (
The layers included in DNN20 may each have a plurality of channels. The channels may correspond to a plurality of artificial nodes, respectively, referred to as neurons, Processing Elements (PEs), cells, or similar terms. For example, as shown in fig. 2,
The channels included in each of the layers of the DNN20 may be interconnected to process data. For example, a channel may perform a computation on data received from a channel of one layer and output the computation result to a channel of another layer.
The input and output of each channel may be referred to as an input activation and an output activation, respectively. That is, an activation may be not only an output of one channel but also a parameter corresponding to an input of a channel included in the next layer. Meanwhile, each channel may determine its own activation based on the activations and the weights received from the channels included in the previous layer. The weights are parameters used to calculate the output activation of each channel and may be values assigned to the connections between the channels.
The channels may each be processed by a computing unit or processing element that receives an input and produces an output activation, and the input-output relationship of each channel may be mapped. For example, when $\sigma$ is the activation function, $w^i_{jk}$ is the weight from the kth channel included in the (i-1)th layer to the jth channel included in the ith layer, $b^i_j$ is the bias of the jth channel included in the ith layer, and $a^i_j$ is the activation of the jth channel of the ith layer, the activation $a^i_j$ can be calculated using equation 1.
[ equation 1]
$$a^i_j = \sigma\left(\sum_k \left(w^i_{jk} \times a^{i-1}_k\right) + b^i_j\right)$$
As shown in fig. 2, activation of the first channel CH1 of the second layer,
However, the
In an embodiment, the
That is, among the
The trained DNN20 may then be driven by receiving the new descriptor (or new image) as input data, and may thus output property values corresponding to the received new descriptor (or new image) as output data.
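The layer-by-layer calculation described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the layer sizes, the random weights, and the choice of a sigmoid for the activation function are assumptions made for the example and are not taken from the embodiment.

```python
import numpy as np

def sigmoid(z):
    """One possible activation function (an assumption for this sketch)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases, activation=sigmoid):
    """Return the activations of every layer for the input vector x."""
    activations = [np.asarray(x, dtype=float)]
    for W, b in zip(weights, biases):
        # Each channel j computes activation(sum_k w_jk * a_k + b_j),
        # where a_k are the activations of the previous layer.
        activations.append(activation(W @ activations[-1] + b))
    return activations

# Hypothetical layer sizes: input layer, two hidden layers, output layer.
rng = np.random.default_rng(0)
sizes = [2, 3, 3, 2]
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n_out) for n_out in sizes[1:]]

acts = forward([1.0, 0.0], weights, biases)
output = acts[-1]  # output data of the network
```

In a trained network the weights and biases would come from the learning process rather than from a random generator.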
Fig. 3 is a diagram illustrating calculations performed by the RNN according to an embodiment.
Hereinafter, for convenience of description, the description given above with reference to fig. 2 will not be repeated.
The
Referring to fig. 3, a node s constituting a hidden layer of the
[ equation 2]
Here, $s_t$ is the memory of the network and stores information about events at previous time steps. The output value $o_t$ depends only on the memory at the current time step t.
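As a rough illustration of how such a memory is carried from one time step to the next, the following sketch assumes the commonly used recurrence $s_t = \tanh(U x_t + W s_{t-1})$ and $o_t = \mathrm{softmax}(V s_t)$. The matrix names U, W, and V, the tanh and softmax functions, and all sizes are assumptions for the example; the text here does not confirm that equation 2 has exactly this form.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the maximum for numerical stability
    return e / e.sum()

def rnn_forward(xs, U, W, V):
    """Run a simple recurrent network over the input sequence xs.

    The same U, W and V are reused at every time step.
    """
    s = np.zeros(W.shape[0])            # initial memory s_0
    states, outputs = [], []
    for x in xs:
        s = np.tanh(U @ x + W @ s)      # memory depends on input and previous memory
        o = softmax(V @ s)              # output depends only on the current memory
        states.append(s)
        outputs.append(o)
    return np.array(states), np.array(outputs)

# Hypothetical sizes: 3-dimensional inputs, 5 hidden nodes, 3 output classes.
rng = np.random.default_rng(0)
U = rng.normal(size=(5, 3))
W = rng.normal(size=(5, 5))
V = rng.normal(size=(3, 5))
xs = rng.normal(size=(4, 3))            # a sequence of 4 time steps

states, outputs = rnn_forward(xs, U, W, V)
```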
Meanwhile, unlike the existing neural network structure in which the parameters are different from each other, the
In an embodiment, the
For example, when W of
The trained
Fig. 4 is a conceptual diagram illustrating a neural network system for generating a chemical structure according to an embodiment.
Referring to fig. 4, a neural network system configured to generate chemical structures by using
A descriptor, which is data used in the neural network system, may be represented by an ECFP as an index value representing a structural characteristic of a chemical structure. The property refers to a characteristic possessed by a chemical structure and may be a real-number value indicating, for example, a transmission wavelength or an emission wavelength with respect to light. The structure refers to the atomic-level structure of a chemical structure and may be represented by a SMILES code. For example, the structural formula may be expressed according to the SMILES code as shown in
[ equation 3]
OC1=C(C=C2C=CNC2=C1)C1=C(C=CC=C1)C1=CC2=C(NC=C2)C=C1
A factor is an element that defines a relationship between a descriptor, a property, and a structure. The factor may be at least one hidden layer. When the factor includes a plurality of hidden layers, a factor defining a relationship between the descriptor and the property, a factor defining a relationship between the descriptor and the structure, and the like may be determined for each hidden layer.
Fig. 5 is a diagram illustrating a method of representing a
Referring to fig. 5, a
In one embodiment,
In another embodiment, the
Hereinafter, for convenience of description, a method in which the
The atoms constituting the
Referring to fig. 5, on the
The type of color that a certain atom is displayed on the
The
Fig. 6 is a diagram illustrating a method of interpreting a neural network according to an embodiment.
The
The
In this case, the
Referring to fig. 6, in an embodiment, the
The method of calculating the correlation in the LRP technique can be expressed by equation 4. In equation 4, $a_i$ and $a_j$ respectively denote an output value determined at a specific node of the ith layer and an output value determined at a specific node of the jth layer. $w^+_{ij}$ is a weight value that connects the specific node of the ith layer to the specific node of the jth layer. $R_i$ and $R_j$ respectively denote the correlation of the specific node of the ith layer and the correlation of the specific node of the jth layer.
[ equation 4]
In an embodiment, for application of LRP techniques, the
However, the techniques that may be used in
When the input data of the neural network is a descriptor for the reference chemical structure, the plurality of nodes of the input layer may respectively correspond to bit values constituting the descriptor. The
When the input data of the neural network is an image for the reference chemical structure, a plurality of nodes of the input layer may respectively correspond to pixel values constituting the image. The
Hereinafter, the bit position of the descriptor and the pixel coordinates of the image having the greatest correlation with the expression of the specific property value of the reference chemical structure will be referred to as an expression region.
Fig. 7 is a diagram illustrating an example of changing an expression region of a descriptor to generate a new chemical structure according to an embodiment.
Referring to fig. 7,
The
The
Regarding the method of generating the
The
The
Specifically, the
When the property value for the new chemical structure generated through the above-described process is equal to or greater than the preset value, the
Fig. 8 is a diagram illustrating an example of changing a local structure by changing a bit value of a descriptor according to an embodiment.
In an embodiment, the
The
Referring to fig. 8, the
In addition, the
In addition, the
Additionally, the
However, an example of changing the local structure by changing the bit value of the descriptor is not limited to the above description.
Fig. 9 is a diagram illustrating an example of changing a local structure by changing a pixel value of an image according to an embodiment.
Referring to fig. 9, an
The
The
The
The
Regarding the method of generating the
The
The
Specifically, the
When the property value for the new chemical structure generated through the above-described process is equal to or greater than the preset value, the
Fig. 10 is a diagram illustrating an example of changing a pixel value when there are a plurality of expression regions on an image according to an embodiment.
Referring to fig. 10, an
The
In an embodiment, there may be multiple nodes of the input layer that have the greatest correlation, or a high correlation relative to other nodes, with the expression of the wavelength value of the
When there are a plurality of expression regions, i.e., the
The
The
The
Fig. 11 is a flowchart of a method of generating a new chemical structure by changing a descriptor for the chemical structure in a neural network device, according to an embodiment.
The method of generating a chemical structure in a neural network device relates to the embodiments described above with reference to the drawings, and thus, although omitted in the following description, the description given above with reference to the drawings is also applicable to the method shown in fig. 11.
Referring to fig. 11, in operation 1110, a neural network device may obtain descriptors for a reference chemical structure.
A descriptor is an index value representing a structural characteristic of a chemical structure. Descriptors can be obtained by performing relatively simple operations on a given chemical structure. In an embodiment, a descriptor may be represented by an ECFP and may include multiple bit values. However, the representation of the descriptor is not limited thereto.
Hereinafter, the descriptor for the reference chemical structure will be referred to as a reference descriptor.
In operation 1120, the neural network device may input a reference descriptor into the trained neural network and output a property value for a particular property of the reference chemical structure.
The property refers to a characteristic possessed by a chemical structure and may be a real-number value indicating, for example, a transmission wavelength or an emission wavelength with respect to light. Unlike descriptors, properties may require complex simulations to compute, which is time consuming.
The memory of the neural network device may store descriptors for a specific chemical structure and property values numerically representing properties of the specific chemical structure, which are matched with each other, as one group.
In an embodiment, the neural network device may allow a neural network (e.g., DNN) to learn by using descriptors and property values stored in memory. In a learning process using the descriptor and the property value, a factor defining a relationship between the descriptor and the property value may be determined in the neural network.
The neural network device may output a property value corresponding to the reference descriptor as output data of the neural network by: inputting the reference descriptor as input data for a trained neural network, and driving the neural network.
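The learning step on matched descriptor and property-value pairs can be sketched as follows. This is a minimal sketch under stated assumptions: the data are synthetic, the model is a single linear layer without a bias term, and plain gradient descent on the mean squared error stands in for whatever training procedure the device actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the stored (descriptor, property value) pairs:
# 200 descriptors of 16 bits each, with property values from a hidden linear rule.
n_samples, n_bits = 200, 16
X = rng.integers(0, 2, size=(n_samples, n_bits)).astype(float)
true_w = rng.normal(size=n_bits)
y = X @ true_w

def mse(w):
    """Mean squared error between predicted and stored property values."""
    return float(np.mean((X @ w - y) ** 2))

w = np.zeros(n_bits)      # weights to be learned
history = [mse(w)]
lr = 0.05                 # learning rate (hypothetical value)
for _ in range(2000):
    grad = 2.0 / n_samples * X.T @ (X @ w - y)   # gradient of the MSE
    w -= lr * grad
    history.append(mse(w))

# Driving the trained model with a new descriptor yields a predicted property value.
new_descriptor = rng.integers(0, 2, size=n_bits).astype(float)
predicted_property = float(new_descriptor @ w)
```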
In operation 1130, the neural network device may determine an expression region in the reference descriptor that expresses a particular property.
The neural network device may perform an interpretation process to determine which local structure in the reference chemical structure expresses a particular property value.
In an embodiment, the neural network device may interpret the trained neural network by using the LRP technique. The LRP technique propagates correlation through the trained neural network in the reverse direction (i.e., from the output layer toward the input layer). In the LRP technique, when correlation is propagated between layers, the node of the lower layer having the largest correlation with a node of the upper layer receives the largest share of correlation from that node of the upper layer.
For application of LRP techniques, the neural network device may select the activation function applied to the nodes of the trained neural network as a linear function, and may select the MSE for optimization.
The nodes of the input layer of the neural network may respectively correspond to the bit values constituting the descriptor. Through the interpretation process, the neural network device may obtain the node of the input layer having the greatest correlation with the expression of the specific property value of the reference chemical structure, that is, a bit position (or expression region) of the reference descriptor. Since the expression region of the reference descriptor corresponds to a specific local structure in the reference chemical structure, obtaining the expression region through the interpretation process allows the neural network device to determine the specific local structure having the greatest correlation with the expression of the specific property value.
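The interpretation process can be sketched as a backward pass over a small network. The sketch assumes the commonly used positive-weight LRP rule, $R_i = \sum_j \frac{a_i w^+_{ij}}{\sum_{i'} a_{i'} w^+_{i'j}} R_j$, which is consistent with the symbols defined for equation 4 but is not confirmed by the text here. The network, its non-negative weights, and the 6-bit descriptor are hypothetical; the weights are kept non-negative so that the linear activations stay non-negative, which this rule requires.

```python
import numpy as np

def lrp_zplus(activations, weights, relevance_out, eps=1e-12):
    """Propagate correlation (relevance) from the output layer to the input layer.

    activations[l] is the activation vector fed into weights[l];
    weights[l] has shape (n_upper, n_lower).
    """
    R = np.asarray(relevance_out, dtype=float)
    for a, W in zip(reversed(activations), reversed(weights)):
        Wp = np.maximum(W, 0.0)        # keep only the positive weights
        z = Wp @ a + eps               # total positive input to each upper node
        R = a * (Wp.T @ (R / z))       # share each upper node's relevance among lower nodes
    return R

# Hypothetical linear network: 6 descriptor bits -> 3 hidden nodes -> 1 property value.
rng = np.random.default_rng(0)
x = np.array([1, 0, 1, 1, 0, 0], dtype=float)   # reference descriptor bits
W1 = rng.uniform(0.1, 1.0, size=(3, 6))
W2 = rng.uniform(0.1, 1.0, size=(1, 3))
h = W1 @ x                                      # linear activation in the hidden layer
y = W2 @ h                                      # predicted property value

R_in = lrp_zplus([x, h], [W1, W2], relevance_out=y)
expression_bit = int(np.argmax(R_in))           # bit position with the greatest correlation
```

Bits that are 0 in the descriptor receive zero correlation under this rule, and the correlations of the input bits sum to the output value, so the bit with the largest share can be read as the candidate expression region.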
In operation 1140, the neural network device may generate a new chemical structure by altering a local structure in the reference chemical structure corresponding to the expression region.
The neural network device may receive a target property value as an input. In an embodiment, the neural network device may include a user interface, which is a tool for inputting data for controlling the neural network device. For example, the user interface may be a key pad, a touch pad, etc., but is not limited thereto.
The target property value is a numerical value of a specific property of a chemical structure to be finally generated in the neural network device. In embodiments, the target property value may be a refractive index value, an elastic modulus, a melting point, a transmission wavelength, and/or an emission wavelength. For example, the neural network device may receive "transmission wavelength: 350 nm" as the target property value. Alternatively, the target property value may be specified as an increasing (+) direction or a decreasing (-) direction rather than as a numerical value.
The neural network device may generate a new chemical structure having a property value close to the target property value by changing a local structure in the reference chemical structure.
In an embodiment, the neural network device may output a new descriptor by changing a bit value of the expression region of the reference descriptor. When the bit value of the expression region of the reference descriptor is changed, the local structure in the reference chemical structure may be changed. The bit value of the expression region may be changed by using a genetic algorithm, but the method is not limited thereto.
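Changing only the bits of the expression region can be sketched as a mutation step of the kind used in genetic algorithms. The function name, the flip rate, and the example descriptor are hypothetical, and crossover and selection are omitted, so this is only one ingredient of a genetic algorithm rather than a complete one.

```python
import numpy as np

def mutate_expression_region(descriptor, region, rate=0.5, rng=None):
    """Return a copy of the descriptor with bits flipped only inside the expression region."""
    rng = np.random.default_rng() if rng is None else rng
    new = np.array(descriptor, dtype=np.uint8, copy=True)
    region = np.asarray(region)
    flip = rng.random(region.size) < rate
    if not flip.any():
        # Guarantee that at least one bit of the region changes.
        flip[rng.integers(region.size)] = True
    new[region[flip]] ^= 1            # toggle the selected bits (0 -> 1, 1 -> 0)
    return new

reference = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
region = [2, 5]                        # hypothetical bit positions of the expression region
new_descriptor = mutate_expression_region(reference, region,
                                          rng=np.random.default_rng(0))
```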
The neural network device may output a structural feature value corresponding to the new descriptor as output data of the neural network by: inputting a new descriptor in which a bit value of an expression region of the reference descriptor is changed as input data of a trained neural network (e.g., RNN), and driving the neural network. The neural network device may generate a new chemical structure based on the outputted structural feature values. Alternatively, the neural network device may use the factors for the new descriptors output in the learning process of the DNN as input data to a trained neural network (e.g., RNN).
The neural network device may repeatedly generate chemical structures through the above-described process until a chemical structure having a property value close to a target property value (e.g., "emission wavelength: 350 nm") is generated.
In particular, the neural network device may compare the property value for the new chemical structure with a target property value, and when the property value for the new chemical structure is less than the target property value, regenerate the new chemical structure by changing the bit value of the expression region of the reference descriptor.
When the property value of the new chemical structure generated through the above-described process is equal to or greater than the target property value, the neural network device may store the generated new chemical structure in the memory.
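The repeat-until-target loop can be sketched as follows. This is a simplified sketch: the property value is predicted directly from the changed descriptor by a hypothetical linear predictor, whereas the process described above first generates a chemical structure from the new descriptor. The predictor weights, the 300 nm offset, and the iteration limit are all assumptions for the example.

```python
import numpy as np

def search(reference, region, predict, target, max_iters=1000, seed=0):
    """Change expression-region bits until the predicted property reaches the target."""
    rng = np.random.default_rng(seed)
    reference = np.asarray(reference, dtype=np.uint8)
    region = np.asarray(region)
    for _ in range(max_iters):
        candidate = reference.copy()                  # always start from the reference descriptor
        flips = region[rng.random(region.size) < 0.5]
        candidate[flips] ^= 1                         # change bits only in the expression region
        value = predict(candidate)
        if value >= target:                           # equal to or greater than the target: accept
            return candidate, value
    return None, None                                 # no acceptable candidate found

# Hypothetical predictor: wavelength in nm as a linear function of the descriptor bits.
w = np.array([10.0, 30.0, 5.0, 40.0, 2.0, 8.0])
predict = lambda bits: 300.0 + float(w @ bits)

reference = np.array([1, 0, 1, 0, 1, 0], dtype=np.uint8)   # predicted value: 317 nm
memory = []                                                # stands in for the device memory
candidate, value = search(reference, region=[1, 3], predict=predict, target=350.0)
if candidate is not None:
    memory.append((candidate, value))
```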
Fig. 12 is a flowchart of a method of generating a new chemical structure by changing an image for the chemical structure in a neural network device according to an embodiment.
Hereinafter, the same descriptions as those given with reference to fig. 11 are omitted.
Referring to fig. 12, in
In an embodiment, the image for the reference chemical structure may comprise n × m pixels (where n and m are natural numbers). For example, 8 bits, i.e., values from 0 (black) to 255 (white), may be assigned to each pixel of the image.
Hereinafter, the image for the reference chemical structure will be referred to as a reference image.
In
The memory of the neural network device may store images for a specific chemical structure and property values numerically representing properties of the specific chemical structure, which are matched with each other, as one group.
In an embodiment, the neural network device may allow a neural network (e.g., DNN) to learn by using images and property values stored in memory. In a learning process using the image and the property value, a factor defining a relationship between the image and the property value may be determined in the neural network.
The neural network device may output the property value corresponding to the reference image as output data of the neural network by: inputting the reference image as input data to a trained neural network, and driving the neural network.
In
The nodes of the input layer of the neural network may respectively correspond to the pixel values constituting the image. Through the interpretation process, the neural network device may obtain the node of the input layer having the greatest correlation with the expression of the specific property value of the reference chemical structure, that is, a pixel coordinate (or expression region) of the reference image. Since the expression region of the reference image corresponds to a specific local structure in the reference chemical structure, obtaining the expression region through the interpretation process allows the neural network device to determine the specific local structure having the greatest correlation with the expression of the specific property value.
In
In an embodiment, the neural network device may generate a new image by changing pixel values of the expression region of the reference image and/or pixel values around the expression region. When these pixel values are changed, the local structure in the reference chemical structure may be changed. In an embodiment, the pixel values of the expression region of the reference image and/or the pixel values around the expression region may be changed by using Gaussian noise, but the method of changing the pixel values is not limited thereto.
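Applying Gaussian noise to the expression region and its surroundings can be sketched as below. The function name, the square neighborhood defined by a radius, the noise standard deviation, and the uniform gray test image are assumptions for the example.

```python
import numpy as np

def perturb_region(image, center, radius=1, sigma=20.0, rng=None):
    """Add Gaussian noise to the pixels at and around an expression-region coordinate."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(image, dtype=float).copy()
    r, c = center
    # Clamp the neighborhood to the image borders.
    r0, r1 = max(r - radius, 0), min(r + radius + 1, img.shape[0])
    c0, c1 = max(c - radius, 0), min(c + radius + 1, img.shape[1])
    img[r0:r1, c0:c1] += rng.normal(0.0, sigma, size=(r1 - r0, c1 - c0))
    # Keep the result a valid 8-bit image: values from 0 (black) to 255 (white).
    return np.clip(np.rint(img), 0, 255).astype(np.uint8)

reference_image = np.full((8, 8), 128, dtype=np.uint8)   # hypothetical 8 x 8 reference image
new_image = perturb_region(reference_image, center=(3, 4),
                           rng=np.random.default_rng(0))
```

Pixels outside the chosen neighborhood are left unchanged, so only the part of the image corresponding to the local structure is altered.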
The neural network device may output a structural feature value corresponding to a new image as output data of the neural network by: inputting a new image in which pixel values of an expression region of the reference image and/or pixel values around the expression region are changed as input data of a trained neural network (e.g., RNN), and driving the neural network. The neural network device may generate a new chemical structure based on the outputted structural feature values. Alternatively, the neural network device may use factors for the new image output in the learning process of the DNN as input data to the trained neural network (e.g., RNN).
The neural network device may repeatedly generate chemical structures through the above-described process until a chemical structure having a property value close to a target property value (e.g., "emission wavelength: 350 nm") is generated.
In particular, the neural network device may compare the property value for the new chemical structure with a target property value, and when the property value for the new chemical structure is less than the target property value, regenerate the new chemical structure by changing the pixel values of the expression region of the reference image and/or the pixel values around the expression region.
When the property value of the new chemical structure generated through the above-described process is equal to or greater than the target property value, the neural network device may store the generated new chemical structure in the memory.
According to the above embodiments, the trained neural network can be interpreted to specify the local structure that expresses a property of a chemical structure. In addition, by changing the specified local structure, a new chemical structure with improved properties can be generated.
Further, the foregoing embodiments may be embodied in the form of a recording medium including computer-executable instructions, such as program modules executed by a computer. A computer-readable medium can be any recording medium that can be accessed by a computer and can include both volatile and nonvolatile media, and removable and non-removable media. Additionally, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Communication media include computer-readable instructions, data structures, program modules, or other data in a modulated data signal or other transport mechanism, and may include any information transmission media.
In addition, throughout the specification, the term "unit" may be a hardware component such as a processor or a circuit and/or a software component executed by a hardware component such as a processor.
The above description of the present disclosure is provided for the purpose of illustration, and it will be understood by those skilled in the art that various changes and modifications may be made without changing the technical concept and essential features of the present disclosure. It is therefore to be understood that the foregoing illustrative embodiments are illustrative in all respects and not restrictive of the disclosure. For example, components described as a single type may be implemented in a distributed manner. Also, components described as distributed may be implemented in a combined manner.
It is to be understood that the embodiments described herein are to be considered in a descriptive sense only and not for purposes of limitation. Descriptions of features or aspects in various embodiments should typically be considered as available for other similar features or aspects in other embodiments.
Although one or more embodiments have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope as defined by the following claims.