Polishing apparatus and recording medium

Document No.: 478268 Publication date: 2022-01-04

Abstract: Polishing apparatus and recording medium, by 中村显, 铃木佑多 and 关山俊介, dated 2021-06-17. A polishing apparatus and a recording medium are provided that can estimate a parameter at a target time during polishing even if the polishing state changes. The polishing apparatus includes: a generation unit that generates a feature amount using data relating to a frictional force between a polishing member and a target substrate at a target time during polishing, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate. The learning data set includes, as inputs, a feature amount based on data relating to a frictional force between the polishing member and a substrate at each time during polishing, or a feature amount based on measurement data of a temperature of the polishing member or the substrate, and, as outputs, the polishing amount or the residual film amount at each time during polishing, estimated using at least a film thickness measured after polishing.

1. A polishing apparatus comprising:

a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and

an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate, the learning data set including, as inputs, time series data of the feature amount up to a specific time during polishing of another substrate and, as outputs, the polishing amount or the residual film amount at the specific time, or time series data of the polishing amount or the residual film amount up to the specific time, estimated using at least a film thickness measured after polishing of the other substrate.

2. The polishing apparatus according to claim 1, comprising:

a determination unit that determines whether or not the polishing end point has been reached using the estimated value; and

a control unit that performs control to end polishing when the determination unit determines that the polishing end point has been reached.

3. The polishing apparatus according to claim 1 or 2, wherein

the inputs to the machine learning model further include a polishing method, a use time of a consumable part, the number of substrates processed with the same consumable part, and/or an initial film thickness.

4. The polishing apparatus according to claim 1, wherein

the polishing amount or the residual film amount at each time in the data set for learning is calculated using a first polishing rate until an interface between the layer to be polished and the lower layer is exposed and a second polishing rate after the interface is exposed.

5. A polishing apparatus comprising:

and a determination unit that determines whether or not the polishing end point has been reached using the estimated value.

6. The polishing apparatus according to claim 5, wherein

the polishing apparatus is provided with a control unit that controls the polishing to be terminated when the determination unit determines that the end point of polishing has been reached.

7. A polishing apparatus comprising:

an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the remaining polishing time or of the additional polishing time from an end point detection time point, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, the remaining polishing time at the specific time point or the additional polishing time from the end point detection time point, determined so that the residual film thickness or the polishing amount of the other substrate reaches a target value, or time series data of the remaining polishing time or the additional polishing time up to the specific time point; and

a determination unit that determines whether or not the polishing end point has been reached using the estimated value.

8. The polishing apparatus according to claim 7, wherein

the polishing apparatus is provided with a control unit that performs control to end polishing using the estimated value of the remaining polishing time or of the additional polishing time from the end point detection time point.

9. A recording medium that is readable by a computer and that has a program recorded thereon, the program causing the computer to function as:

10. A recording medium that is readable by a computer and that has a program recorded thereon, the program causing the computer to function as:

and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing end point probability, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, a polishing end point probability at the specific time point during polishing of the other substrate or time series data of the polishing end point probability up to the specific time point.

11. A recording medium that is readable by a computer and that has a program recorded thereon, the program causing the computer to function as:

and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the remaining polishing time or of the additional polishing time from an end point detection time point, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, the remaining polishing time at the specific time point or the additional polishing time from the end point detection time point, determined so that the residual film thickness or the polishing amount of the other substrate reaches a target value, or time series data of the remaining polishing time or the additional polishing time up to the specific time point.

Technical Field

The present invention relates to a polishing apparatus and a program.

Background

Polishing apparatuses for polishing a substrate (e.g., a wafer) are known. For example, as in patent document 1, a technique is known that detects, based on a signal related to the frictional force during polishing, that a new interface has been exposed or that initial unevenness has been flattened, and terminates polishing accordingly. This detection is also referred to as end point detection. In this technique, whether the signal waveform satisfies a predetermined condition is judged in real time to determine the end point.

Documents of the prior art

Patent document

Patent document 1: Japanese Patent Laid-Open Publication No. 2017-76779

Technical problem to be solved by the invention

However, with the conventional end point detection method, the end point detection time point differs between substrates, and the thickness of the film remaining on the substrate (also referred to as the residual film thickness) is not constant.

Conventional end point detection methods include one that checks whether a simple numerical value (for example, a slope) characterizing the waveform of a signal related to the frictional force during polishing satisfies a predetermined condition, and then performs a predetermined amount of additional polishing after the detection. In actual polishing, however, the polishing rate varies, for example due to wear of the polishing pad, and the polishing profile on the substrate is not always constant. To keep the residual film thickness constant as the polishing state (or condition) changes in this manner, a new end point detection method needs to be established. Further, when the polishing amount or the residual film amount during polishing does not satisfy a predetermined condition, it is preferable to be able to switch to a different polishing condition (for example, polishing pressure) so that, for example, the target polishing amount is reached without lengthening the polishing time. In either case, it is desirable to estimate a parameter (for example, a polishing amount or a residual film amount, a polishing end point probability, a remaining polishing time, or an additional polishing time from an end point detection time point) at a target time during polishing even if the polishing state changes.

Disclosure of Invention

The present invention has been made in view of the above problems, and an object of the present invention is to provide a polishing apparatus and a program capable of estimating a parameter at a target time during polishing even if the polishing state changes.

Means for solving the problems

A polishing apparatus according to a first aspect of the present invention includes: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate, the learning data set including, as inputs, time series data of the feature amount up to a specific time during polishing of another substrate and, as outputs, the polishing amount or the residual film amount at the specific time, or time series data of the polishing amount or the residual film amount up to the specific time, estimated using at least a film thickness measured after polishing of the other substrate.

According to this configuration, the relationship between the feature amount, which reflects the change in frictional force or temperature during polishing, and the polishing amount or the residual film amount resulting from polishing is learned, and the polishing amount or the residual film amount during polishing of a new substrate is estimated using the machine learning model for which learning has been completed. Because the trained model has learned the influence of polishing unevenness and of consumable parts such as the polishing pad, its estimate for a new substrate takes those influences into account. By using the estimated value for end point detection in polishing the target substrate, end point detection that suppresses the difference in residual film thickness between substrates can be realized even if the polishing state changes.
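As a non-limiting illustration, the learning and estimation flow described above can be sketched in Python. The feature (a running integral of a torque-current signal), the linear least-squares model, and all names and numbers are assumptions made for this sketch only; the disclosure does not limit the feature amount or the machine learning model to these.

```python
def cumulative_feature(torque_current, dt=1.0):
    """Time series of one hypothetical feature: the running integral of the friction-related signal."""
    total, series = 0.0, []
    for value in torque_current:
        total += value * dt
        series.append(total)
    return series


def fit_linear_model(features, labels):
    """Least-squares fit of label = a * feature + b (a stand-in for the trained model)."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(labels) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, labels))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def estimate_polishing_amount(model, feature_series):
    """Estimate the polishing amount at the target time (the latest sample)."""
    a, b = model
    return a * feature_series[-1] + b


# Learning step: another substrate, with labels derived after polishing (made-up values, in nm).
train_features = cumulative_feature([2.0, 2.0, 2.0, 2.0])
train_labels = [10.0, 20.0, 30.0, 40.0]
model = fit_linear_model(train_features, train_labels)

# Estimation step: the target substrate, using only data up to the target time.
target_features = cumulative_feature([2.0, 3.0, 1.0])
estimate = estimate_polishing_amount(model, target_features)
```

The point of the sketch is the data flow: labels come from substrates that have already been polished and measured, while the estimate for the target substrate uses only signals available up to the target time.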

A polishing apparatus according to a second aspect of the present invention is the polishing apparatus according to the first aspect, and includes: a determination unit that determines whether or not the polishing end point has been reached using the estimated value; and a control unit that performs control to end polishing when the determination unit determines that the polishing end point has been reached.

According to this configuration, the end of polishing is controlled using a polishing amount or residual film amount that is estimated during polishing in consideration of the influence of consumable parts such as the polishing pad and of the unevenness of the substrate, so the variation in the polishing amount or the residual film amount at the end of polishing can be reduced.
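As a non-limiting illustration, the cooperation of the determination unit and the control unit can be sketched as a loop that re-estimates the residual film amount at every sample and stops once a target is reached. The function names, the threshold comparison, and the toy estimator are assumptions for this sketch, not the determination rule of the disclosure.

```python
def polish_until_end_point(samples, estimate_residual, target_residual_nm):
    """Return (step, residual) at which the end point is judged reached, or (None, None)."""
    history = []
    for step, sample in enumerate(samples):
        history.append(sample)                  # data available up to the target time
        residual = estimate_residual(history)   # estimating unit (any trained model)
        if residual <= target_residual_nm:      # determination unit
            return step, residual               # control unit would end polishing here
    return None, None


def toy_estimator(history):
    """Hypothetical estimator: 100 nm initial film, 10 nm removed per sample."""
    return 100.0 - 10.0 * len(history)


stop_step, stop_residual = polish_until_end_point(range(10), toy_estimator, 50.0)
```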

A polishing apparatus according to a third aspect of the present invention is the polishing apparatus according to the first or second aspect, wherein the input to the machine learning model further includes a polishing method, a use time of a consumable part, the number of substrates processed with the same consumable part, and/or an initial film thickness.

With this configuration, the polishing amount or the residual film amount can be estimated from the polishing conditions and the state of the consumable part, and the estimation accuracy can be improved.
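As a non-limiting illustration, the additional inputs can be appended to the feature amounts to form a single input vector. The field names, the one-hot encoding of the polishing method, and the method labels below are hypothetical choices for this sketch.

```python
from dataclasses import dataclass

# Hypothetical set of polishing methods known to the model.
METHODS = ("method_a", "method_b")


@dataclass
class AuxiliaryInputs:
    polishing_method: str
    consumable_use_hours: float
    substrates_on_consumable: int
    initial_thickness_nm: float


def build_model_input(features, aux):
    """Concatenate feature amounts with the encoded auxiliary inputs."""
    one_hot = [1.0 if aux.polishing_method == m else 0.0 for m in METHODS]
    return list(features) + one_hot + [
        aux.consumable_use_hours,
        float(aux.substrates_on_consumable),
        aux.initial_thickness_nm,
    ]


vector = build_model_input([0.5, 0.1], AuxiliaryInputs("method_b", 12.0, 300, 500.0))
```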

A polishing apparatus according to a fourth aspect of the present invention is the polishing apparatus according to any one of the first to third aspects, wherein the polishing amount or the residual film amount at each time in the data set for learning is calculated using a first polishing rate until an interface between the layer to be polished and the lower layer is exposed and a second polishing rate after the interface is exposed.
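As a non-limiting illustration, the two-rate calculation of the polishing amount at each time can be sketched as a piecewise-linear profile. How the two rates are obtained is an assumption here: the first rate is taken as the upper-layer thickness divided by the time at which the interface is exposed, and the second rate as the remaining removal (from the film thickness measured after polishing) divided by the remaining time.

```python
def two_rate_profile(times, t_interface, rate_before, rate_after):
    """Polishing amount at each time, using one rate before interface exposure and one after."""
    amounts = []
    for t in times:
        if t <= t_interface:
            amounts.append(rate_before * t)
        else:
            amounts.append(rate_before * t_interface + rate_after * (t - t_interface))
    return amounts


# Made-up example: 100 nm upper layer exposed at t = 10 s, 120 nm removed in total by t = 20 s.
t_interface, t_end = 10.0, 20.0
upper_layer_nm, total_removed_nm = 100.0, 120.0
rate_before = upper_layer_nm / t_interface                             # 10 nm/s
rate_after = (total_removed_nm - upper_layer_nm) / (t_end - t_interface)  # 2 nm/s

labels = two_rate_profile([5.0, 10.0, 15.0, 20.0], t_interface, rate_before, rate_after)
```

Residual film amount labels follow by subtracting each polishing amount from the initial film thickness.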

A polishing apparatus according to a fifth aspect of the present invention includes: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing end point probability, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, a polishing end point probability at the specific time point during polishing of the other substrate or time series data of the polishing end point probability up to the specific time point; and a determination unit that determines whether or not the polishing end point has been reached using the estimated value.

According to this configuration, the relationship between the feature amount, which reflects the change in frictional force or temperature during polishing, and the polishing end point probability at each time point during polishing is learned, and the polishing end point probability at each time point during polishing of a new substrate is estimated using the machine learning model for which learning has been completed. Because the trained model has learned the influence of polishing unevenness and of consumable parts such as the polishing pad, its estimate for a new substrate takes those influences into account. By using the estimated value for end point detection in polishing the target substrate, end point detection that suppresses the difference in residual film thickness between substrates can be realized even if the polishing state changes.

A polishing apparatus according to a sixth aspect of the present invention is the polishing apparatus according to the fifth aspect, and includes a control unit configured to control the polishing unit to end polishing when the determination unit determines that the end point of polishing has been reached.

According to this configuration, since the influence of the unevenness of the substrate and the consumable part such as the polishing pad can be taken into consideration, the range of variation in the polishing amount or the residual film amount at the end of polishing can be narrowed.
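As a non-limiting illustration, a determination based on the estimated polishing end point probability can be sketched as a threshold test. The threshold value and the requirement of several consecutive samples above it are assumptions added here to make the sketch robust to a single noisy estimate; the disclosure does not specify them.

```python
def end_point_index(probabilities, threshold=0.5, consecutive=2):
    """Return the index at which the end point is judged reached, or None."""
    run = 0
    for i, p in enumerate(probabilities):
        run = run + 1 if p >= threshold else 0   # count consecutive samples above threshold
        if run >= consecutive:
            return i
    return None


# Made-up probability estimates at successive time points.
estimated = [0.1, 0.6, 0.3, 0.7, 0.8, 0.9]
index = end_point_index(estimated)
```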

A polishing apparatus according to a seventh aspect of the present invention includes: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the remaining polishing time or of the additional polishing time from an end point detection time point, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, the remaining polishing time at the specific time point or the additional polishing time from the end point detection time point, determined so that the residual film thickness or the polishing amount of the other substrate reaches a target value, or time series data of the remaining polishing time or the additional polishing time up to the specific time point; and a determination unit that determines whether or not the polishing end point has been reached using the estimated value.

According to this configuration, the relationship between the feature amount, which reflects the change in frictional force or temperature during polishing, and the remaining polishing time or the additional polishing time from the end point detection time point is learned, and the remaining polishing time during polishing of a new substrate, or the additional polishing time from the end point detection time point, is estimated using the machine learning model for which learning has been completed. Because the trained model has learned the influence of polishing unevenness and of consumable parts such as the polishing pad, its estimate for a new substrate takes those influences into account. By using the estimated value for end point detection in polishing the target substrate, end point detection that suppresses the difference in residual film thickness between substrates can be realized even if the polishing state changes.

A polishing apparatus according to an eighth aspect of the present invention is the polishing apparatus according to the seventh aspect, and includes a control unit that performs control to end polishing using the estimated value of the remaining polishing time or of the additional polishing time from the end point detection time point.
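As a non-limiting illustration, remaining-polishing-time labels for the learning data set, and the use of an estimated remaining time to end polishing, can be sketched as follows. The way the ideal end time is back-calculated (measured residual thickness, target thickness and a single average rate) is an assumption for this sketch, as are all names and numbers.

```python
def ideal_end_time(t_end, measured_residual_nm, target_residual_nm, rate_nm_per_s):
    """Time at which the residual film would have equalled the target value."""
    return t_end + (measured_residual_nm - target_residual_nm) / rate_nm_per_s


def remaining_time_labels(times, t_ideal):
    """Remaining polishing time at each time point (never negative)."""
    return [max(t_ideal - t, 0.0) for t in times]


def planned_stop_time(current_time, estimated_remaining):
    """Time at which a control unit would end polishing, given the current estimate."""
    return current_time + max(estimated_remaining, 0.0)


# Made-up example: polishing ended at 60 s with 40 nm left, but the target was 50 nm.
t_ideal = ideal_end_time(60.0, 40.0, 50.0, 2.0)
labels = remaining_time_labels([0.0, 30.0, 55.0, 60.0], t_ideal)
stop_at = planned_stop_time(30.0, 25.0)
```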

A recording medium according to a ninth aspect of the present invention is a computer-readable recording medium having a program recorded thereon, the program causing a computer to function as: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and

A recording medium according to a tenth aspect of the present invention is a computer-readable recording medium having a program recorded thereon, the program causing a computer to function as: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing end point probability, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, a polishing end point probability at the specific time point during polishing of the other substrate or time series data of the polishing end point probability up to the specific time point.

A recording medium according to an eleventh aspect of the present invention is a computer-readable recording medium having a program recorded thereon, the program causing a computer to function as: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the remaining polishing time or of the additional polishing time from an end point detection time point, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, the remaining polishing time at the specific time point or the additional polishing time from the end point detection time point, determined so that the residual film thickness or the polishing amount of the other substrate reaches a target value, or time series data of the remaining polishing time or the additional polishing time up to the specific time point.

An information processing system according to a twelfth aspect of the present invention includes: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate, the learning data set including, as inputs, time series data of the feature amount up to a specific time during polishing of another substrate and, as outputs, the polishing amount or the residual film amount at the specific time, or time series data of the polishing amount or the residual film amount up to the specific time, estimated using at least a film thickness measured after polishing of the other substrate.

An information processing system according to a thirteenth aspect of the present invention includes: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the polishing end point probability, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, a polishing end point probability at the specific time point during polishing of the other substrate or time series data of the polishing end point probability up to the specific time point.

An information processing system according to a fourteenth aspect of the present invention includes: a generation unit that generates time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; and an estimating unit that inputs at least the time series data of the feature amount generated by the generation unit into a machine learning model that has completed learning using a learning data set, and outputs an estimated value of the remaining polishing time or of the additional polishing time from an end point detection time point, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, the remaining polishing time at the specific time point or the additional polishing time from the end point detection time point, determined so that the residual film thickness or the polishing amount of the other substrate reaches a target value, or time series data of the remaining polishing time or the additional polishing time up to the specific time point.

A substrate polishing method according to a fifteenth aspect of the present invention includes: a generation step of generating time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; an estimation step of inputting at least the time series data of the feature amount generated in the generation step into a machine learning model that has completed learning using a learning data set, and outputting an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate, the learning data set including, as inputs, time series data of the feature amount up to a specific time during polishing of another substrate and, as outputs, the polishing amount or the residual film amount at the specific time, or time series data of the polishing amount or the residual film amount up to the specific time, estimated using at least a film thickness measured after polishing of the other substrate; a determination step of determining whether or not the polishing end point has been reached using the estimated value; and a polishing step of polishing the target substrate until it is determined that the polishing end point has been reached.

A substrate polishing method according to a sixteenth aspect of the present invention includes: a generation step of generating time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; an estimation step of inputting at least the time series data of the feature amount generated in the generation step into a machine learning model that has completed learning using a learning data set, and outputting an estimated value of the polishing end point probability, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, a polishing end point probability at the specific time point during polishing of the other substrate or time series data of the polishing end point probability up to the specific time point; a determination step of determining whether or not the polishing end point has been reached using the estimated value; and a polishing step of polishing the target substrate until it is determined that the polishing end point has been reached.

A substrate polishing method according to a seventeenth aspect of the present invention includes: a generation step of generating time series data of a feature amount up to a target time during polishing, using data relating to a frictional force between a polishing member and a target substrate up to the target time, or measurement data of a temperature of the polishing member or the target substrate; an estimation step of inputting at least the time series data of the feature amount generated in the generation step into a machine learning model that has completed learning using a learning data set, and outputting an estimated value of the remaining polishing time or of the additional polishing time from an end point detection time point, the learning data set including, as inputs, time series data of the feature amount up to a specific time point during polishing of another substrate and, as outputs, the remaining polishing time at the specific time point or the additional polishing time from the end point detection time point, determined so that the residual film thickness or the polishing amount of the other substrate reaches a target value, or time series data of the remaining polishing time or the additional polishing time up to the specific time point; a determination step of determining whether or not the polishing end point has been reached using the estimated value; and a polishing step of polishing the target substrate until it is determined that the polishing end point has been reached.

Effects of the invention

According to one aspect of the present invention, the relationship between the feature amount, which reflects the change in frictional force or temperature during polishing, and the polishing amount or the residual film amount resulting from polishing is learned, and the polishing amount or the residual film amount during polishing of a new substrate is estimated using the machine learning model for which learning has been completed. Because the trained model has learned the influence of polishing unevenness and of consumable parts such as the polishing pad, its estimate for a new substrate takes those influences into account.

According to another aspect of the present invention, the relationship between the feature amount, which reflects the change in frictional force or temperature during polishing, and the polishing end point probability at each time point during polishing is learned, and the polishing end point probability at each time point during polishing of a new substrate is estimated using the machine learning model for which learning has been completed. Because the trained model has learned the influence of polishing unevenness and of consumable parts such as the polishing pad, its estimate for a new substrate takes those influences into account.

According to another aspect of the present invention, the relationship between a feature amount relating to the change in the frictional force or the temperature during polishing and the polishing remaining time or the additional polishing time from the end point detection time point is learned, and the polishing remaining time during polishing of a new substrate or the additional polishing time from the end point detection time point is estimated using the machine learning model that has completed learning. Through this learning, the trained machine learning model can estimate the polishing remaining time during polishing of a new substrate or the additional polishing time from the end point detection time point while taking into account the influence of polishing unevenness and of consumable parts such as the polishing pad.

Drawings

Fig. 1 is a schematic diagram showing the overall configuration of a polishing apparatus according to a first embodiment.

Fig. 2 is a schematic configuration diagram of the AI unit according to the first embodiment.

Fig. 3 is a diagram illustrating a correspondence relationship between a polishing state of a wafer and a waveform of a table torque current.

Fig. 4 is a diagram for explaining the difference between the detection point of conventional end point detection and the ideal detection point.

Fig. 5A is a schematic diagram illustrating an example of the learning step and the estimation step according to the first embodiment.

Fig. 5B shows an example of a graph of the temporal change in the table torque current and a graph of the corresponding temporal change in the polishing amount and the residual film amount.

Fig. 5C is a schematic diagram showing a first example of the learning method of machine learning.

Fig. 5D is a schematic diagram showing a second example of the learning method of machine learning.

Fig. 6 is a flowchart showing a first example of the processing of the AI unit during polishing of the wafer.

Fig. 7 is a flowchart showing a second example of the processing of the AI unit during polishing of the wafer.

Fig. 8 is a flowchart showing an example of the processing of the AI unit during polishing of the wafer in the first modification of the first embodiment.

Fig. 9 is a flowchart showing an example of the processing of the AI unit during polishing of the wafer in the second modification of the first embodiment.

Fig. 10 is a flowchart showing another example of the processing of the AI unit during polishing of the wafer in the second modification of the first embodiment.

Fig. 11 is a schematic diagram showing the overall configuration of the polishing system according to the second embodiment.

Fig. 12 is a schematic diagram showing the overall configuration of the polishing system according to the third embodiment.

Description of the symbols

1 polishing head

100 polishing table

100a polishing table shaft

101 polishing pad

101a polishing surface

102 polishing table rotating motor

110 top ring head

111 top ring shaft

112 rotary cylinder

113 timing pulley

114 top ring rotating motor

115 timing belt

116 timing pulley

117 top ring head shaft

124 vertical movement mechanism

126 bearing

128 bridge member

129 support table

130 support column

132 ball screw

132a screw shaft

132b nut

138 servo motor

26 rotary joint

3 retainer ring

4 AI unit

41 storage unit

42 memory

43 input unit

44 output unit

45 processor

451 generation unit

452 estimation unit

453 determination unit

500 control unit

S1-S3 information processing system

Detailed Description

Hereinafter, embodiments will be described with reference to the drawings. However, unnecessary detailed description may be omitted. For example, detailed descriptions of well-known matters and repetitive descriptions of substantially the same structures may be omitted. This is to avoid unnecessarily lengthy descriptions that follow, and to facilitate understanding by those skilled in the art.

The inventors of the present application have found that there is a correlation between a feature amount relating to the change in friction or temperature during polishing and the polishing amount or the residual film amount resulting from polishing. The inventors have also found that there is a correlation between such a feature amount and the polishing end point probability at each time during polishing. Further, the inventors have found that there is a correlation between such a feature amount and the polishing remaining time or the additional polishing time from the end point detection time point. Accordingly, in each embodiment, a machine learning model (e.g., a recurrent neural network (RNN) or a long short-term memory (LSTM) network) is used to learn one of the above relationships. In each embodiment, a wafer is described as an example of a substrate.

< first embodiment >

First, the first embodiment will be explained. Fig. 1 is a schematic diagram showing the overall configuration of a polishing apparatus according to a first embodiment. As shown in fig. 1, the polishing apparatus 10 includes an information processing system S having an AI unit 4. The information processing system S may further include a control unit 500.

The polishing apparatus 10 includes a polishing table 100 and a polishing head 1 serving as a substrate holding device. The polishing head 1 holds a substrate (here, a wafer) as the object to be polished and presses it against the polishing surface on the polishing table 100. The polishing head 1 is also referred to as a top ring. The polishing table 100 is connected via a polishing table shaft 100a to a polishing table rotating motor 102 disposed below it. The polishing table 100 rotates about the polishing table shaft 100a when the polishing table rotating motor 102 rotates. A polishing pad 101 serving as a polishing member is attached to the upper surface of the polishing table 100. The surface of the polishing pad 101 constitutes a polishing surface 101a for polishing the semiconductor wafer W. Thus, the polishing apparatus 10 includes the polishing table 100, which carries a polishing member (here, the polishing pad 101 as an example) and is configured to be rotatable, and the polishing head 1, which faces the polishing table 100, is configured to be rotatable, and can hold a substrate (here, a wafer) on its surface facing the polishing table 100.

A polishing liquid supply nozzle 60 is provided above the polishing table 100. A polishing liquid (polishing slurry) Q is supplied from the polishing liquid supply nozzle 60 onto the polishing pad 101 on the polishing table 100.

The polishing head 1 basically comprises a top ring body 2 that presses the semiconductor wafer W against the polishing surface 101a, and a retainer ring 3 serving as a retainer member that holds the outer periphery of the semiconductor wafer W so that the semiconductor wafer W does not fly out of the polishing head 1. The polishing head 1 is connected to the top ring shaft 111. The top ring shaft 111 is moved up and down with respect to the top ring head 110 by the vertical movement mechanism 124. Moving the top ring shaft 111 up and down raises and lowers the entire polishing head 1 with respect to the top ring head 110 and thereby positions it in the vertical direction. A rotary joint 26 is attached to the upper end of the top ring shaft 111.

The vertical movement mechanism 124 for vertically moving the top ring shaft 111 and the polishing head 1 includes: a bridge member 128 that rotatably supports the top ring shaft 111 via a bearing 126; a ball screw 132 attached to the bridge member 128; a support table 129 supported by a support column 130; and a servo motor 138 provided on the support table 129. The support table 129 supporting the servo motor 138 is fixed to the top ring head 110 via the support column 130.

The ball screw 132 includes a screw shaft 132a connected to the servo motor 138 and a nut 132b into which the screw shaft 132a is screwed. When the servo motor 138 is driven, the bridge member 128 moves up and down via the ball screw 132, and the top ring shaft 111 and the polishing head 1 move up and down integrally with the bridge member 128.

As shown in fig. 1, when the top ring rotating motor 114 is driven to rotate, the rotary cylinder 112 and the top ring shaft 111 rotate integrally via the timing pulley 116, the timing belt 115, and the timing pulley 113, and the polishing head 1 rotates.

The top ring head 110 is supported by a top ring head shaft 117, and the top ring head shaft 117 is rotatably supported by a frame (not shown). The polishing apparatus 10 includes a control unit 500, and the control unit 500 is connected to and controls each device in the apparatus including the top ring rotating motor 114, the servo motor 138, and the polishing table rotating motor 102 via a control line. The control unit 500 controls the polishing head 1 and the polishing table 100 on which the substrate is mounted to be rotated while pressing the substrate against the polishing member (here, the polishing pad 101) to polish the substrate.

The feature amounts of the machine learning model described later are derived from the table rotation, the polishing head rotation, and the rotation of the motor (not shown) that swings the top ring head 110. As the input, one or more sensor detection values (for example, motor current values), or a torque value calculated from the sensor detection values, may be used.

The polishing apparatus 10 includes an AI unit 4 connected to the control unit 500 via a wire. Fig. 2 is a schematic configuration diagram of the AI unit according to the first embodiment. As shown in fig. 2, the AI unit 4 is, for example, a computer, and includes a storage unit 41, a memory 42, an input unit 43, an output unit 44, and a processor 45.

The storage unit 41 stores a machine learning model that has been trained using a data set for learning. The data set includes, as input, a feature amount based on data relating to the frictional force at each time during polishing or a feature amount based on temperature measurement data, and, as output, the polishing amount or the residual film amount at each time during polishing estimated using at least the film thickness measured after polishing. The storage unit 41 also stores a program to be read and executed by the processor 45. The storage unit 41 may be any storage device, for example a hard disk, a DVD, an external storage medium such as an SD card, or online storage.

Here, the data relating to the frictional force at each time during polishing is, for example, a current value for calculating the torque of the polishing table rotating motor 102 during polishing (hereinafter, also referred to as a table torque current). Here, the data on the frictional force at each time during polishing may be a calculated value of the torque converted from the current value of the motor. The data on the frictional force at each time during polishing may be a drive current value of the top ring rotation motor 114 for rotating the polishing head 1, or may be a drive current value of a motor (not shown) for rotating the top ring head 110 (i.e., the top ring head shaft 117).

The polishing apparatus 10 may include a load cell, and in this case, the data relating to the frictional force at each time during polishing may be a signal value of the load cell. The polishing apparatus 10 may also include a distortion sensor for measuring distortion of the substrate, and in this case, the data relating to the frictional force at each time during polishing may be a signal value of the distortion sensor.

The memory 42 is a medium that temporarily stores information.

The input unit 43 receives information from the control unit 500 and outputs the information to the processor 45.

The output unit 44 receives information from the processor 45 and outputs the information to the control unit 500.

The processor 45 reads and executes the program from the storage unit 41, thereby functioning as the generation unit 451, the estimation unit 452, and the determination unit 453.

The generating unit 451 generates the feature value using data on the frictional force between the polishing member and the target substrate at the target time during polishing, for example. Here, the polishing refers to, for example, a period of time during which the substrate is pressed against the polishing member to polish the substrate while the polishing head 1 and the polishing table 100 on which the substrate is mounted are rotated. This process is described in detail later.

The estimation unit 452 inputs at least the feature amount generated by the generation unit 451 to the machine learning model that has completed learning, and outputs an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate. This process is described in detail later. The determination unit 453 determines whether or not the polishing end point has been reached using the estimated value.

Fig. 3 is a diagram illustrating a correspondence relationship between a polishing state of a wafer and a waveform of a table torque current. In the graph shown in fig. 3, the vertical axis represents the torque current value of the polishing table rotating motor 102 during polishing, and the horizontal axis represents time, and a waveform C1 showing the time change of the table torque current is shown. Since the frictional force with the polishing pad 101 varies depending on the ratio of the exposed film species, the value of the table torque current also varies accordingly.

As shown in Fig. 3, the wafer W has a layer 51 to be polished, which is mounted so as to face the polishing pad 101, and a lower layer 52 provided on the layer 51 to be polished. The layer 51 to be polished is removed by the force generated by polishing friction. At point P1 on waveform C1, the layer 51 to be polished has been only slightly removed; at point P2, after more time has passed, the lower layer 52 is partially exposed. At point P3, after still more time has passed, the entire surface of the lower layer 52 is exposed. When the entire surface of the lower layer 52 is exposed, the polishing table rotating motor 102 is stopped and polishing is completed.

As shown in Fig. 3, arrows a12 and a13 are shorter than arrows a11 and a14; these portions are over-polished by the amount indicated by arrow a15.

The inventors of the present application have found that data relating to the frictional force between the polishing member and the substrate (for example, the table torque current signal) is related to the residual film thickness or the polishing amount. This is because the time at which the underlying film is exposed varies within the wafer surface: the polishing rate changes with the polishing position owing to wear of the polishing pad and the like, so the film is polished unevenly. Here, the residual film thickness is the thickness of the remaining layer 51 to be polished, that is, the thickness from the bottom of a recess to the lower surface of the layer 51 to be polished. When the interface is exposed, as at point P3 in Fig. 3, it is the thickness of the film remaining in the recess (for example, the length of arrows a11, a12, a13, and a14). The residual film thickness may be the residual film thickness at one predetermined position or the average of the residual film thicknesses measured at a plurality of positions. The polishing amount is, for example, the thickness of the layer 51 to be polished that has been removed by polishing. The polishing amount may be the polishing amount at one predetermined position or the average of the polishing amounts measured at a plurality of positions.

In the present embodiment, the machine learning model is learned using a learning data set that takes as input data relating to the frictional force between the polishing member and the substrate when the substrate having a certain initial film thickness is polished to a certain residual film thickness, and takes as output the residual film thickness or the polishing amount at that time. Data relating to the frictional force between the polishing member and the substrate to be newly subjected to the polishing is read into the machine learning model having completed the learning, thereby outputting an estimated value of the residual film thickness or the polishing amount, and the polishing is terminated at a time point when the residual film thickness or the polishing amount reaches a target value.

Fig. 4 is a diagram for explaining the difference between the detection point of conventional end point detection and the ideal detection point. As shown in Fig. 4, the detection point in conventional end point detection (the actual detection point) is earlier in time than the ideal detection point. As a result, even if additional polishing (also referred to as over-polishing) is performed for a predetermined period T1 after detection, the film cannot be removed down to the target residual film thickness. If, on the other hand, the detection point coincides with the ideal detection point, additional polishing for the predetermined period T1 removes the film down to the target residual film thickness. Detection is therefore preferably performed at the ideal detection point.

Fig. 5A is a schematic diagram illustrating an example of the learning step and the estimation step according to the first embodiment. As shown in Fig. 5A, data on the waveforms of various signals during polishing (also referred to as polishing waveforms), the film thickness after polishing, the use time of a consumable part (e.g., the polishing pad), and the like are stored as stock data in the storage unit 41. The use time of the consumable part is not essential, however. The polishing state may vary depending on the use time of the polishing pad. If, in the learning step, learning data for polishing pads in various states, from immediately after the start of use to the worn stage, are learned together without using the use time of the polishing pad as a parameter, and the residual film thickness or the polishing amount can still be estimated appropriately, the use time of the polishing pad need not be added to the stock data. Alternatively, the use time of the polishing pad may be added to the stock data so that the residual film thickness or the polishing amount is estimated in accordance with the use time of the polishing pad at the time the target substrate is polished.

In the learning step, the storage unit 41 is referred to extract a feature amount based on data (for example, a table torque current) relating to a frictional force between the polishing member and the substrate at each time during polishing. Then, with reference to the storage unit 41, at least the polishing amount or the residual film amount at each time during polishing estimated using the film thickness measured after polishing is extracted.

Machine learning is performed using a data set for learning that includes, as input, a feature amount based on data relating to the frictional force between the polishing member and the substrate at each time during polishing and, as output, the polishing amount or the residual film amount at each time during polishing estimated using at least the film thickness measured after polishing. The resulting trained machine learning model is stored in the storage unit 41. In addition to the above feature amount, the input of the data set for learning may include the polishing method, the use time of a consumable part, the number of substrates processed with the same consumable part, and/or the initial film thickness, as will be described later.

Here, the polishing amount or the residual film amount at each time in the data set for learning is calculated, for example, based on the results of measuring the initial film thickness and the film thickness after polishing, assuming that the polishing rate during polishing is constant. Alternatively, the change in polishing rate during polishing may be experimentally obtained, and the polishing amount or the residual film amount at each time may be calculated. In addition, a first polishing rate until an interface between the layer to be polished and the lower layer is exposed and a second polishing rate after the interface is exposed may be calculated.
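The constant-rate labeling described above can be sketched as follows. This is only an illustrative sketch: the function name, the argument names, and the units (nanometers and seconds) are assumptions made for the example and do not appear in the specification.

```python
def label_polishing_amounts(initial_thickness, final_thickness, times, end_time):
    """Return a (polishing_amount, residual_film) pair for each requested time.

    Assumes a constant polishing rate between the start of polishing (t = 0)
    and the end of polishing (t = end_time), as in the first option above.
    """
    if end_time <= 0:
        raise ValueError("end_time must be positive")
    # Constant rate inferred from the two film thickness measurements.
    rate = (initial_thickness - final_thickness) / end_time
    labels = []
    for t in times:
        # Clamp so that times outside the polishing interval stay in range.
        t_clamped = min(max(t, 0.0), end_time)
        removed = rate * t_clamped
        labels.append((removed, initial_thickness - removed))
    return labels


# Hypothetical example: 1000 nm before polishing, 200 nm after 40 s of polishing.
labels = label_polishing_amounts(1000.0, 200.0, [0.0, 20.0, 40.0], 40.0)
```

With a rate obtained experimentally, or with separate rates before and after the interface is exposed, the same structure applies; only the expression for the removed amount changes.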

Fig. 5B shows an example of a graph of the temporal change in the table torque current and a graph of the corresponding temporal change in the polishing amount and the residual film amount. As shown in Fig. 5B, a curve W1 shows the temporal change in the moving average of the table torque current, and a curve W2 shows the temporal change in the differential value of the table torque current. t4 is the polishing end time, and t5 is the ideal polishing end time. The curve W11 shows the temporal change in the polishing amount, and the curve W12 shows the temporal change in the residual film amount.

Fig. 5C is a schematic diagram showing a first example of the learning method of machine learning. In the example of Fig. 5C, a plurality of learning data are obtained from the polishing result of one substrate. That is, each learning data item takes as input the time series data of the feature amount (for example, a moving average value, a differential value, an integral value, the amount of wear of the polishing pad, or a step number) up to a certain time t, and takes as output the value of the output parameter (for example, the residual film amount, the polishing amount, the end point probability, or an estimated value of the polishing remaining time) at the same time t.

For example, learning is performed using learning data that takes as input the time series data of the feature amount from the start of polishing to time t1 and takes as output the value of the output parameter at time t1.

As further learning data, learning data is used that takes as input the time series data of the feature amount from the start of polishing to time t2 and takes as output the value of the output parameter at time t2.

Likewise, learning data is used that takes as input the time series data of the feature amount from the start of polishing to time t3 and takes as output the value of the output parameter at time t3.

In this way, the values of the output parameter at times t1, t2, and t3 are learned from the time series data of the feature amount up to times t1, t2, and t3, respectively.
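The way several learning data are cut from the polishing result of one substrate in the Fig. 5C scheme can be sketched as below. The function name, the list-based data layout, and the numeric values are assumptions made for illustration only.

```python
def make_prefix_samples(features, targets, sample_indices):
    """Build (input, output) learning data in the manner of Fig. 5C.

    features: per-time feature vectors, one entry per sampling time.
    targets:  per-time values of the output parameter (same length).
    sample_indices: indices of the times t1, t2, t3, ... to cut at.
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    samples = []
    for i in sample_indices:
        # Input: the series from the start of polishing up to time i.
        # Output: the single value of the output parameter at time i.
        samples.append((features[: i + 1], targets[i]))
    return samples


# Hypothetical series: one feature per time, polishing amount as the output.
features = [[0.50], [0.52], [0.55], [0.61]]
targets = [0.0, 100.0, 200.0, 300.0]
samples = make_prefix_samples(features, targets, [1, 2, 3])
```

Each element of `samples` pairs a variable-length input sequence with one scalar output, which is the form a sequence model such as an RNN or LSTM can be trained on.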

After completion of the learning, when the time series data of the feature amount up to a certain time point is input to the machine learning model in a new polishing, the estimated value of the output parameter (for example, unknown residual film amount) at the time point is output. The machine learning model may also use, for example, RNN or LSTM. However, machine learning models (methods) other than RNN or LSTM may be used.

Further, the machine learning model may be learned as shown in Fig. 5D. Fig. 5D is a schematic diagram showing a second example of the learning method of machine learning. In the example of Fig. 5D, one set of learning data is obtained from the polishing result of one substrate. That is, one learning data set takes as input the time series data of the feature amount from the start of polishing to the end of polishing, and takes as output the time series data of the output parameter (for example, the residual film amount, the polishing amount, the end point probability, or an estimated value of the polishing remaining time) from the start of polishing to the end of polishing. Here, the feature amount is a feature amount based on data relating to the frictional force between the polishing member and the target substrate at the target time during polishing. The feature amount is, for example, at least one of a moving average value, a differential value, and an integrated value of the table torque current. In addition to or instead of these, the feature amount may be the amount of wear of the polishing pad or a step number in the polishing method. The step number in the polishing method is treated as a "feature amount based on data relating to the frictional force between the polishing member and the substrate" because the polishing conditions (the bladder pressure, the slurry flow rate, and the like) can be changed for each polishing step, and the frictional force between the polishing member and the substrate changes accordingly. For example, the bladder pressure can be raised in the first stage to polish at a high rate and lowered in the latter stage to polish at a low rate so that the end point is detected accurately.
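As a rough illustration of the feature amounts named above, the following sketch derives a moving average, a differential value, and an integrated value from a sampled table torque current. The trailing window, the backward difference, the rectangle-rule integration, the sampling interval `dt`, and all function names are assumptions chosen for the example; the specification does not fix these details.

```python
def moving_average(x, window):
    """Trailing moving average; early samples use a shorter window."""
    out = []
    running = 0.0
    for i, v in enumerate(x):
        running += v
        if i >= window:
            running -= x[i - window]  # drop the sample that left the window
        out.append(running / min(i + 1, window))
    return out


def differential(x, dt):
    """Backward difference; the first value is set to 0.0."""
    return [0.0] + [(x[i] - x[i - 1]) / dt for i in range(1, len(x))]


def integral(x, dt):
    """Cumulative integral by the rectangle rule."""
    total = 0.0
    out = []
    for v in x:
        total += v * dt
        out.append(total)
    return out


def build_features(current, dt, window):
    """Return one [moving average, differential, integral] row per sample."""
    ma = moving_average(current, window)
    diff = differential(current, dt)
    integ = integral(current, dt)
    return [list(row) for row in zip(ma, diff, integ)]


# Hypothetical table torque current samples taken every 1.0 s.
feats = build_features([1.0, 2.0, 3.0, 4.0], dt=1.0, window=2)
```

A step number or a pad wear amount, where used, could be appended to each row as an additional column.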

That is, the time series data of the feature value from the start of polishing to the end of polishing is used as input, and the time series data of the output parameter from the start of polishing to the end of polishing is used as output learning data to learn.

After completion of the learning, when the time series data of the feature amount up to a certain point in time is input to the machine learning model in a new polishing, the estimated value of the output parameter up to that point in time (for example, the unknown residual film amount) is output. That is, when the time series data of the feature quantity up to the time t1 is input to the machine learning model, the estimated value of the output parameter up to the time t1 (for example, the unknown residual film quantity) is output. When the time series data of the feature quantity up to the time t2 is input to the machine learning model, the estimated value of the output parameter up to the time t2 (for example, the unknown residual film amount) is output. When the time series data of the feature quantity up to the time t3 is input to the machine learning model, the estimated value of the output parameter up to the time t3 (for example, the unknown residual film amount) is output. Since a plurality of estimated values of the output parameter up to the time point are output in this way, the estimating unit 452 may acquire an estimated value at the time point among the plurality of estimated values. The determination section 453 may determine whether or not the polishing end point has been reached using the estimated value at the time point.

Next, returning to Fig. 5A, the estimation step is described. If the machine learning model has been trained as described with reference to Fig. 5C, inputting the time series data of the feature amount up to the target time to the trained machine learning model yields an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate.

If the machine learning model has been trained as described with reference to Fig. 5D, inputting the time series data of the feature amount up to the target time to the trained machine learning model yields time series data, up to the target time, of the estimated value of the polishing amount or the residual film amount during polishing of the target substrate.

The inputs for machine learning may further include a polishing method, a time period for which one consumable part is used, the number of substrates processed by the same consumable part, and/or an initial film thickness. This makes it possible to estimate the polishing amount or the residual film amount in accordance with the polishing conditions and the state of the consumable part, and the estimation accuracy can be improved.

Fig. 6 is a flowchart showing a first example of the processing of the AI unit during polishing of the wafer.

(step S110) first, the processor 45 loads the machine learning model (also referred to as AI model) having completed learning from the storage section 41 to the memory 42.

(step S120) next, the processor 45 acquires table torque current data.

(step S130) next, the generation unit 451 calculates a feature amount from the table torque current data acquired in step S120.

(step S140) next, the estimating unit 452 inputs the feature amount calculated in step S130 to the machine learning model having completed learning, and outputs an estimated value of the polishing amount at the target time during polishing of the target substrate.

(step S150) next, the determination unit 453 determines whether or not the estimated value of the polishing amount output in step S140 is equal to or greater than a set threshold. If the estimated value is less than the set threshold, the process returns to step S130 and is repeated. If the estimated value is equal to or greater than the set threshold, the determination unit 453 notifies the control unit 500 that polishing is to end, and the control unit 500, on receiving this notification, controls the apparatus so that polishing ends. In this way, the determination unit 453 performs control to end polishing using the estimated value estimated by the estimation unit 452. With this configuration, the influence of substrate unevenness and of consumable parts such as the polishing pad can be taken into account, so the variation in the polishing amount or the residual film amount at the end of polishing can be narrowed.
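The flow of steps S120 to S150 can be sketched as a simple loop. Everything named here is hypothetical: `toy_features` and `toy_model` are stand-ins for the generation unit 451 and for the trained machine learning model, and the sketch assumes that each repetition also picks up the newest table torque current sample.

```python
def run_endpoint_loop(samples, compute_features, model, threshold):
    """Repeat acquisition, feature generation, and estimation until the
    estimated polishing amount reaches the threshold.

    Returns (number of samples consumed, final estimate), or (None, None)
    if the threshold is never reached.
    """
    history = []
    for count, current in enumerate(samples, start=1):
        history.append(current)                # S120: acquire current data
        features = compute_features(history)   # S130: generate feature amount
        estimate = model(features)             # S140: estimate polishing amount
        if estimate >= threshold:              # S150: compare with threshold
            return count, estimate             # here polishing would be ended
    return None, None


def toy_features(history):
    # Stand-in for the generation unit: passes the raw series through.
    return list(history)


def toy_model(features):
    # Stand-in for the trained model: 50 units removed per sample (made up).
    return 50.0 * len(features)


result = run_endpoint_loop([1.0] * 10, toy_features, toy_model, threshold=200.0)
```

In an actual apparatus the return from the loop would correspond to the notification sent to the control unit 500.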

In practice, as shown in the upper diagram of Fig. 4, a predetermined polishing amount may be detected before the estimated value of the polishing amount reaches the target polishing amount, and polishing may then be ended after additional polishing (over-polishing) is performed for a predetermined polishing amount or a predetermined polishing time. This makes it possible to change the conditions for the additional polishing while avoiding excessive polishing caused by signal processing delay.

Fig. 7 is a flowchart showing a second example of the processing of the AI unit during polishing of the wafer.

(step S210) first, the processor 45 acquires the initial film thickness of the substrate.

(step S220) next, the processor 45 loads the machine learning model (also referred to as AI model) having completed learning from the storage section 41 to the memory 42.

(step S230) next, the processor 45 acquires table torque current data.

(step S240) next, the generation unit 451 calculates a feature amount from the table torque current data acquired in step S230.

(step S250) next, the estimating unit 452 inputs the feature amount calculated in step S240 to the machine learning model having completed learning, outputs an estimated value of the polishing amount at the target time during polishing of the target substrate, and calculates an estimated value of the residual film thickness by subtracting the estimated value of the polishing amount from the initial film thickness acquired in step S210.

(Step S260) Next, the determination unit 453 determines whether or not the estimated value of the residual film thickness output in step S250 is equal to or less than a set threshold. If it is not, the process returns to step S230 and repeats. If it is, the determination unit 453 notifies the control unit 500 that polishing is complete, and the control unit 500, on receiving this notification, ends the polishing.
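The arithmetic of steps S250 and S260 is a subtraction followed by a comparison. A minimal sketch, with hypothetical function names and thickness values chosen only for illustration:

```python
def residual_thickness(initial_thickness: float, estimated_amount: float) -> float:
    """S250: residual film thickness = initial thickness - estimated polishing amount."""
    return initial_thickness - estimated_amount


def polishing_finished(
    initial_thickness: float, estimated_amount: float, threshold: float
) -> bool:
    """S260: polishing ends once the residual thickness is at or below the threshold."""
    return residual_thickness(initial_thickness, estimated_amount) <= threshold
```

For an initial thickness of 500 and an estimated polishing amount of 320 (same length unit), the residual thickness is 180, which is at or below a threshold of 200, so polishing would end.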

Alternatively, a machine learning model may be used that has completed learning with a learning data set whose input includes a feature amount based on data relating to the frictional force at each time during polishing and whose output is the residual film amount at each time during polishing, estimated at least using the film thickness measured after polishing. In that case, the estimated value of the residual film thickness may be output directly from the model in step S250, without first outputting an estimated value of the polishing amount.

As described above, the information processing system S according to the first embodiment includes the generation unit 451, which generates a feature amount based on data relating to the frictional force between the polishing member and the target substrate at the target time during polishing. The information processing system S further includes the estimation unit 452, which inputs at least the feature amount generated by the generation unit 451 to a machine learning model that has completed learning and outputs an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate. The model has completed learning using a learning data set whose input includes a feature amount based on data relating to the frictional force between the polishing member and the substrate at each time during polishing, and whose output is the polishing amount or the residual film amount at each time during polishing, estimated at least using the film thickness measured after polishing.

With this configuration, the relationship between a feature amount relating to the change in frictional force or temperature during polishing and the resulting polishing amount or residual film amount is learned, and the polishing amount or the residual film amount during polishing of a new substrate is estimated using the machine learning model that has completed learning. Because the model learns the influence of consumable parts such as the polishing pad and of polishing non-uniformity, its estimates for a new substrate take those influences into account. Using the estimated value for end point detection in polishing the target substrate therefore suppresses substrate-to-substrate differences in residual film thickness even when the polishing conditions change.

< first modification of the first embodiment >

Next, a first modification of the first embodiment will be described. In the first modification, the storage unit 41 stores a machine learning model that has completed learning using a learning data set whose input includes at least a feature amount based on data relating to the frictional force at each time during polishing, or a feature amount based on temperature measurement data, and whose output is a polishing end point probability at each time during polishing. For example, the polishing end point probability is 0 for learning data based on data up to a point partway through polishing, and 1 for learning data based on data up to the ideal polishing end point or the ideal detection point.
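The 0/1 labelling described above can be illustrated with a short sketch. It assumes a hard step at the ideal end point, which is only one possible reading of the example in the text; the function name and the way the ideal end time is obtained are hypothetical.

```python
from typing import List, Sequence


def end_point_labels(times: Sequence[float], ideal_end_time: float) -> List[float]:
    """Label each sample time for training.

    0.0 = data only up to a point partway through polishing,
    1.0 = data that reaches the ideal polishing end point.
    The hard step at ideal_end_time is an assumption of this sketch.
    """
    return [1.0 if t >= ideal_end_time else 0.0 for t in times]
```

For sample times 0, 10, 20, and 30 with an ideal end point at 20, the labels are 0, 0, 1, and 1.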

The generation unit 451 generates a feature amount using data relating to the frictional force between the polishing member and the target substrate at the target time during polishing.

The estimation unit 452 inputs at least the feature amount generated by the generation unit 451 to the machine learning model that has completed learning, which is stored in the storage unit 41, and outputs an estimated value of the polishing end point probability at the target time.

With this configuration, the machine learning model can retain and reason over not only the instantaneous value of the feature amount but also changes in its waveform, so the polishing end point probability can be estimated taking into account, for example, the influence of substrate non-uniformity or of consumable parts such as the polishing pad. Using the estimated polishing end point probability for polishing end control reduces substrate-to-substrate differences in residual film thickness after polishing.

The determination unit 453 performs control to end the polishing using the estimated value produced by the estimation unit 452.

Fig. 8 is a flowchart showing an example of the processing of the AI unit during polishing of the wafer in the first modification of the first embodiment.

(Step S310) First, the processor 45 loads the machine learning model that has completed learning (also referred to as the AI model) from the storage unit 41 into the memory 42.

(Step S320) Next, the processor 45 acquires table torque current data.

(Step S330) Next, the generation unit 451 calculates a feature amount from the table torque current data acquired in step S320.

(Step S340) Next, the estimation unit 452 inputs the feature amount calculated in step S330 to the machine learning model that has completed learning and outputs an estimated value of the polishing end point probability at the target time.

(Step S350) Next, the determination unit 453 determines whether or not the estimated value of the polishing end point probability output in step S340 is equal to or greater than a set threshold. If it is not, the process returns to step S320 and repeats. If it is, the determination unit 453 notifies the control unit 500 that polishing is complete, and the control unit 500, on receiving this notification, ends the polishing. In this way, when the determination unit 453 determines that the polishing end point has been reached, the control unit 500 performs control to end the polishing.

With this configuration, the relationship between a feature amount relating to the change in frictional force or temperature during polishing and the polishing end point probability at each time during polishing is learned, and the polishing end point probability at each time during polishing of a new substrate is estimated using the machine learning model that has completed learning. Because the model learns the influence of polishing non-uniformity and of consumable parts such as the polishing pad, its estimates for a new substrate take those influences into account. Using the estimated value for end point detection in polishing the target substrate therefore suppresses substrate-to-substrate differences in residual film thickness even when the polishing conditions change.

< second modification of the first embodiment >

Next, a second modification of the first embodiment will be described. In the second modification, the storage unit 41 stores a machine learning model that has completed learning using a learning data set whose input includes at least a feature amount based on data relating to the frictional force at each time during polishing, and whose output is either the polishing remaining time, determined so that the residual film thickness or the polishing amount reaches a target value, or the additional polishing time from the end point detection time point. Here, the additional polishing time from the end point detection time point is the time for which polishing is continued after the end point detection time point until the target residual film thickness shown in Fig. 4 is reached.
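The two kinds of training target named above can be sketched as simple time differences. The sketch assumes that the time at which the target residual film thickness is reached (`target_reached_time`) has already been determined for the training substrate; how it is determined is not shown here, and all names are hypothetical.

```python
from typing import List, Sequence


def remaining_time_labels(
    times: Sequence[float], target_reached_time: float
) -> List[float]:
    """Polishing remaining time at each sample time.

    target_reached_time is assumed to be known for the training substrate.
    """
    return [target_reached_time - t for t in times]


def additional_polishing_time(
    end_point_detection_time: float, target_reached_time: float
) -> float:
    """Additional polishing time counted from the end point detection time point."""
    return target_reached_time - end_point_detection_time
```

For sample times 0, 10, and 20 with the target reached at 25, the remaining-time labels are 25, 15, and 5. If the end point was detected at 18, the additional polishing time is 7.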

The generation unit 451 generates a feature amount using data relating to the frictional force between the polishing member and the target substrate at the target time during polishing, or measurement data of the temperature of the polishing member or the substrate.

With this configuration, the machine learning model can retain and reason over not only the instantaneous value of the feature amount but also changes in its waveform, so the polishing remaining time or the additional polishing time from the end point detection time point can be estimated taking into account, for example, the influence of substrate non-uniformity or of consumable parts such as the polishing pad. Using the estimated polishing remaining time or additional polishing time for polishing end control reduces substrate-to-substrate differences in residual film thickness after polishing.

Fig. 9 is a flowchart showing an example of the processing of the AI unit during polishing of the wafer in the second modification of the first embodiment.

(Step S410) First, the processor 45 loads the machine learning model that has completed learning (also referred to as the AI model) from the storage unit 41 into the memory 42.

(Step S420) Next, the processor 45 acquires table torque current data.

(Step S430) Next, the generation unit 451 calculates a feature amount from the table torque current data acquired in step S420.

(Step S440) Next, the estimation unit 452 inputs the feature amount calculated in step S430 to the machine learning model that has completed learning and outputs an estimated value of the polishing remaining time.

(Step S450) Next, the determination unit 453 determines whether or not the estimated value of the polishing remaining time output in step S440 is equal to or less than zero. If it is not, the process returns to step S420 and repeats. If it is, the determination unit 453 notifies the control unit 500 that polishing is complete, and the control unit 500, on receiving this notification, ends the polishing. In this way, when the determination unit 453 determines that the polishing end point has been reached, the control unit 500 performs control to end the polishing.

With this configuration, the relationship between a feature amount relating to the change in frictional force or temperature during polishing and the polishing remaining time, or the additional polishing time from the end point detection time point, is learned, and the polishing remaining time or the additional polishing time during polishing of a new substrate is estimated using the machine learning model that has completed learning. Because the model learns the influence of polishing non-uniformity and of consumable parts such as the polishing pad, its estimates for a new substrate take those influences into account. Using the estimated value for end point detection in polishing the target substrate therefore suppresses substrate-to-substrate differences in residual film thickness even when the polishing conditions change.

Fig. 10 is a flowchart showing another example of the processing of the AI unit during polishing of the wafer in the second modification of the first embodiment.

(Step S510) First, the processor 45 loads the machine learning model that has completed learning (also referred to as the AI model) from the storage unit 41 into the memory 42.

(Step S520) Next, the processor 45 acquires table torque current data.

(Step S530) Next, the generation unit 451 calculates a feature amount from the table torque current data acquired in step S520.

(Step S540) Next, the estimation unit 452 inputs the feature amount calculated in step S530 to the machine learning model that has completed learning and outputs an estimated value of the polishing remaining time.

(Step S550) In parallel with steps S530 and S540, the processor 45 executes conventional end point detection processing. For example, the processor 45 detects the polishing end point when the time derivative of the table torque current falls below a preset threshold.

(Step S560) The processor 45 determines whether or not the polishing end point was detected in step S550. If it was not (step S560: NO), the process returns to step S520 and repeats.

(Step S570) If the polishing end point was detected (step S560: YES), the estimated value of the polishing remaining time output by the estimation unit 452 at that point in time is set as the additional polishing time (also referred to as the overpolish time).

(Step S580) The determination unit 453 determines whether or not the additional polishing time (overpolish time) has elapsed since the polishing end point was detected. When it has elapsed, the determination unit 453 notifies the control unit 500 that polishing is complete, and the control unit 500, on receiving this notification, ends the polishing.
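The combination of conventional end point detection (step S550) with a model-estimated overpolish time (steps S570 and S580) can be sketched as follows. The finite-difference derivative, the fixed sampling interval `dt`, and the `estimate_remaining` callable standing in for the estimation unit 452 are all assumptions of this illustration.

```python
from typing import Callable, Optional, Sequence


def detect_end_point(
    current: Sequence[float], dt: float, threshold: float
) -> Optional[int]:
    """S550: index of the first sample whose time derivative of the table
    torque current (backward difference) falls below the threshold."""
    for i in range(1, len(current)):
        derivative = (current[i] - current[i - 1]) / dt
        if derivative < threshold:
            return i
    return None  # S560: NO, end point not detected in this record


def total_polishing_time(
    current: Sequence[float],
    dt: float,
    threshold: float,
    estimate_remaining: Callable[[Sequence[float]], float],
) -> Optional[float]:
    """S570/S580: end point time plus the overpolish time estimated at that time.

    Assumes the first sample is taken at time zero.
    """
    idx = detect_end_point(current, dt, threshold)
    if idx is None:
        return None
    overpolish = estimate_remaining(current[: idx + 1])  # S570
    return idx * dt + overpolish                          # S580: stop after this time
```

For a current record of 5, 5, 5, 4, 3 sampled every second with a derivative threshold of -0.5, the end point is detected at the fourth sample (index 3). If the stand-in estimator returns 2.5 seconds, polishing ends 5.5 seconds after the first sample.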

The AI unit 4 may be installed in a gateway connected to the polishing apparatus through a network line in a factory. The gateway is preferably located near the polishing apparatus. When high-speed processing is required (for example, when the sampling period is 100 ms or less), the AI unit 4 in the polishing apparatus or the AI unit 4 mounted on the gateway can be run as edge computing. The AI unit 4 in the polishing apparatus may be mounted on a PC or on a controller of the apparatus.

< second embodiment >

Next, a second embodiment will be described. While the polishing apparatus 10 of the first embodiment includes the information processing system having the AI unit 4, the second embodiment differs in that the information processing system S2 having the AI unit 4 is provided not in the polishing apparatus but in, for example, a factory management room or a clean room in the factory.

Fig. 11 is a schematic diagram showing the overall configuration of the polishing system according to the second embodiment. As shown in Fig. 11, the polishing system according to the second embodiment includes polishing apparatuses 10-1 to 10-N installed in a factory and an information processing system S2 installed in the same factory or in its factory management room. The information processing system S2 includes an AI unit 4, and the AI unit 4 can communicate with the polishing apparatuses 10-1 to 10-N via a local network NW1. The AI unit 4 is mounted on a computer (e.g., a server or a fog computer), for example.

When the AI unit 4 is provided in the polishing apparatus or in the gateway, the machine learning model that has completed learning is executed by edge computing, which enables high-speed processing, for example processing in real time.

When the AI unit 4 is installed in a server or a fog computer in a factory, the machine learning model can be updated by aggregating data from a plurality of polishing apparatuses in the factory. The data from the plurality of polishing apparatuses in the factory can also be analyzed collectively, and the analysis result can be reflected in the polishing parameter settings.

< third embodiment >

Next, a third embodiment will be explained. While the polishing apparatus 10 includes the AI unit 4 in the first embodiment, the AI unit 4 is not provided in the polishing apparatus but in the analysis center in the third embodiment.

Fig. 12 is a schematic diagram showing the overall configuration of the polishing system according to the third embodiment. As shown in Fig. 12, the polishing system according to the third embodiment includes polishing apparatuses 10-1 to 10-N installed in a plurality of factories and an information processing system S3 installed in an analysis center. The information processing system S3 includes an AI unit 4, and the AI unit 4 can communicate with the polishing apparatuses 10-1 to 10-N via a global network NW2 and a local network NW1. The AI unit 4 is mounted on a computer (e.g., a server), for example.

By providing the AI unit 4 in the analysis center physically separated from the polishing apparatus, the AI unit 4 can be made common among a plurality of factories, and the maintainability of the AI unit 4 can be improved. Further, by using data during polishing in a plurality of factories and re-learning the machine learning model with a large amount of data, the estimation accuracy can be improved more quickly.

Further, the machine learning model may be updated by collecting data (for example, a large amount of data) on a plurality of polishing apparatuses in a plurality of factories. The data on the plurality of polishing apparatuses in the plurality of factories may also be analyzed collectively, and the analysis result may be reflected in the polishing parameter settings.

The AI unit 4 may be installed not in an analysis center that performs analysis in a centralized manner but in a cloud.

The AI unit 4 may be installed in (1) the polishing apparatus, and/or (2) a gateway near the polishing apparatus, and/or (3) a computer (PC, server, fog computer, etc.) in the factory (e.g., in a factory management room).

The AI unit 4 may be installed in (1) the polishing apparatus, and/or (2) a gateway near the polishing apparatus, and/or (4) a computer of an analysis center (or a cloud).

The AI unit 4 may be installed in (1) the polishing apparatus, and/or (2) a gateway near the polishing apparatus, and/or (3) a computer in the factory (e.g., in a factory management room), and/or (4) a computer of the analysis center (or a cloud).

The AI unit 4 may also be distributed across (1) the polishing apparatus, and/or (2) a gateway near the polishing apparatus, and/or (3) a computer (PC, server, fog computer, etc.) in the factory (e.g., in a factory management room), and/or (4) a computer of the analysis center (or a cloud).

In each embodiment, the input of the machine learning model is a feature amount based on data on the frictional force between the polishing member and the substrate at each time during polishing, but is not limited thereto. The input of the machine learning model may be a feature amount based on measurement data of the temperature of the polishing member (here, the polishing pad 101) or the substrate at each time during polishing. This is because, when the frictional force between the polishing member and the substrate during polishing increases, the amount of heat generation of the polishing member or the substrate increases in accordance with the increase, and the temperature of the polishing member or the substrate rises, so that the temperature of the polishing member or the substrate and the frictional force between the polishing member and the substrate during polishing have a positive correlation.

For example, in the case of the first embodiment, the storage unit 41 may store a machine learning model that has completed learning using a learning data set whose input includes at least a feature amount based on measurement data of the temperature of the polishing member or the substrate at each time during polishing, and whose output is the polishing amount or the residual film amount at each time during polishing, estimated at least using the film thickness measured after polishing.

In this case, the generation unit 451 may generate the feature amount using measurement data of the temperature of the polishing member or the target substrate at the target time during polishing. The estimation unit 452 may input at least the feature amount generated by the generation unit 451 to the machine learning model that has completed learning, and output an estimated value of the polishing amount or the residual film amount at the target time during polishing of the target substrate.

For example, in the case of the first modification of the first embodiment, the storage unit 41 may store a machine learning model that has completed learning using a learning data set whose input includes at least a feature amount based on measurement data of the temperature of the polishing member or the substrate at each time during polishing, and whose output is the polishing end point probability at each time during polishing.

For example, in the case of the second modification of the first embodiment, the storage unit 41 may store a machine learning model that has completed learning using a learning data set whose input includes at least a feature amount based on measurement data of the temperature of the polishing member or the substrate at each time during polishing, and whose output is either the polishing remaining time, determined so that the residual film thickness or the polishing amount reaches a target value, or the additional polishing time from the end point detection time point.

In this case, the generation unit 451 may generate the feature amount using measurement data of the temperature of the polishing member or the target substrate at the target time during polishing. The estimation unit 452 inputs at least the feature amount generated by the generation unit 451 to the machine learning model that has completed learning, and outputs an estimated value of the polishing remaining time or of the additional polishing time from the end point detection time point.

At least a part of the AI unit 4 described in the above embodiment may be configured by hardware or software. In the case of being constituted by software, a program that realizes at least a part of the functions of the AI section 4 may be stored in a recording medium such as a flexible disk or a CD-ROM, and read and executed by a computer. The recording medium is not limited to a removable recording medium such as a magnetic disk or an optical disk, and may be a fixed recording medium such as a hard disk device or a memory.

Further, the program that realizes at least a part of the functions of the AI unit 4 may be distributed via a communication line (including wireless communication) such as the internet. The program may also be encrypted, modulated, or compressed and then distributed via a wired or wireless line such as the internet, or stored in a recording medium and distributed.

Further, the AI unit 4 may be caused to function by one or more information processing devices. In the case of using a plurality of information processing apparatuses, at least one of the information processing apparatuses is a computer that realizes a function as at least one means of the AI section 4 by executing a predetermined program.

In the method of the present invention, all the steps may be automatically controlled by a computer. Alternatively, each step may be performed by a computer while a person manually controls the progress between steps. Further, at least a part of all the steps may be performed manually by a person.

In the above-described embodiment, as shown in Fig. 3, the process of polishing the layer to be polished 51 until the lower layer 52 is exposed was described as an example, but the present invention can also be applied to a process that ends polishing without exposing the lower layer, leaving a layer to be polished of a predetermined thickness. Compared with the process of exposing the lower layer, the signal relating to the frictional force or the temperature changes less in this case. Even so, by learning at what value the signal stays unchanged and for how long, polishing can be ended so that the residual film amount comes closer to the target value.

The present invention is not limited to determining the end of polishing. For example, when the polishing amount or the residual film amount estimated during polishing does not satisfy a predetermined condition, the polishing conditions (e.g., the polishing pressure) may be changed so that the target polishing amount is achieved without extending the polishing time.

As described above, the present invention is not limited to the above embodiments as they are, and the constituent elements can be modified and embodied at the implementation stage without departing from the scope of the invention. Further, various inventions can be formed by appropriately combining a plurality of the constituent elements disclosed in the above embodiments. For example, some constituent elements may be deleted from all the constituent elements shown in the embodiments. Further, constituent elements of different embodiments may be combined as appropriate.
