Method and apparatus for optimizing etch profile via reflected light matching and surface dynamics modeling

Document No.: 876598  Publication date: 2021-03-19

Reading note: This technology, "Method and apparatus for optimizing etch profile via reflected light matching and surface dynamics modeling," was created by Mehmet Derya Tetiker, Saravanapriyan Sriraman, Andrew D. Bailey III, and Alex Paterson on 2017-02-08. Abstract: The invention relates to a method and apparatus for optimizing an etch profile via reflected light matching and surface dynamics modeling. A method of optimizing a computer model is disclosed, the model relating an etch profile of a feature on a semiconductor substrate to a set of independent input parameters (A) by using a plurality of model parameters (B). In some embodiments, the method may comprise modifying one or more values of B so as to reduce, with respect to one or more sets of values of A, a metric indicative of the difference between a calculated reflectance spectrum generated from the model and a corresponding experimental reflectance spectrum. In some embodiments, calculating the metric may include projecting the calculated reflectance spectrum and the corresponding experimental reflectance spectrum onto a reduced-dimension subspace and calculating the difference between the reflectance spectra as projected onto the subspace. An etching system implementing such an optimized computer model is also disclosed.

1. A method of optimizing a computer model relating etch profiles of features on a semiconductor substrate to sets of independent input parameters, the method comprising:

(a) determining values for one or more model parameters to be optimized, wherein the model parameters are used to execute the computer model;

(b) receiving an experimental reflectance spectrum generated from optical measurements of one or more semiconductor substrates etched using an experimental etch process performed using the set of values of the independent input parameters;

(c) generating, using a computer processor, a calculated reflectance spectrum by executing the computer model using the values of the set of input parameters specified in (b) and the values of the model parameters determined in (a); and

(d) modifying, with a computer processor, the values of the one or more model parameters determined in (a), and repeating (c) with the modified values of the one or more model parameters so as to reduce a metric indicative of a difference between the reflectance spectrum received in (b) and the corresponding calculated reflectance spectrum generated in (c) with respect to the values of the set of independent input parameters, thereby producing modified values of the one or more model parameters for the computer model that relates an etch profile of a feature on a semiconductor substrate to the set of independent input parameters.

2. The method of claim 1, wherein at least some of the calculated reflectance spectra are produced by a process comprising:

(i) generating a calculated etch profile represented by a series of etch profile coordinates using the model; and

(ii) producing, from the calculated etch profile generated in (i), a calculated reflectance spectrum by simulating reflection of electromagnetic radiation from the calculated etch profile.

3. The method of claim 1, wherein:

the experimental reflectance spectra received in (b) comprise reflectance spectra corresponding to a sequence of etch times representing different durations of an etch process; and

the calculated reflectance spectra produced in (c) include reflectance spectra calculated from the model so as to correspond to the sequence of etch times in (b).

4. The method of claim 3, wherein the experimental reflectance spectra are generated in (b) from optical measurements made during an ongoing etch process at the etch times of the sequence.

5. The method of claim 4, wherein successive etch times in at least a portion of the sequence of etch times are separated by 0.01-1 second.

6. The method of claim 4, wherein at least some of the experimental reflectance spectra received in (b) have been adjusted by reference to, and comparison with, optical measurements made after the end of substrate etch processes of various durations.

7. The method of claim 6, wherein the optical measurements corresponding to the end of etch processes of various durations are taken after the corresponding etched substrates have been removed from the processing chamber in which the substrates were etched.

8. The method of claim 1, further comprising repeating (d).

9. The method of claim 8, further comprising further repeating (d) until a substantially local minimum of error is obtained with respect to the one or more parameters.

10. The method of claim 1, wherein the computer model calculates a local etch rate at grid points representing the etch profile of the feature on the semiconductor substrate as a function of time.

11. The method of claim 10, wherein the one or more model parameters include a reaction rate constant, reactant and product sticking coefficients, and reactant and product diffusion constants.

12. The method of claim 1, further comprising determining the set of independent input parameters by performing PCA.

13. The method of claim 12, wherein the PCA is performed on concatenated vectors of the independent input parameters and the corresponding measured etch profiles.

14. The method of claim 1, further comprising etching a semiconductor substrate using or adjusting a set of etch conditions determined using the computer model with the modified values of the one or more model parameters.

15. An optimized computer model comprising a non-transitory computer readable medium having provided thereon computer executable instructions encoding a computer model that associates an etch profile of a feature on a semiconductor substrate with a set of independent input parameters, the optimized computer model configured to have modified values for one or more model parameters, wherein the computer model has been optimized by the method of (a) - (d) of claim 1.

16. A computer-implemented method of approximating a profile of a feature on a semiconductor substrate after the feature has been etched by an etching process, the method comprising:

specifying test values for a set of independent input parameters corresponding to the etch process; and

generating an etch profile using the optimized computer model of claim 15 using the test values specified for the independent input parameters.

17. A method of determining a set of values for a set of independent input parameters for an etching process that will approximately produce a desired etch profile for a feature on a semiconductor substrate after the feature is etched by the etching process, the method comprising:

(a) specifying test values for the set of independent input parameters corresponding to the etch process;

(b) generating a calculated etch profile using the optimized computer model according to claim 15 using the test values specified for the independent input parameters;

(d) modifying one or more values specified in (a) for the set of independent input parameters to reduce a difference between the desired etch profile and the calculated etch profile determined by repeating (b) - (c).

18. The method of claim 17, further comprising:

(e) repeating (d) until a substantially local minimum of error is obtained with respect to the values of the set of independent input parameters selected in (d).

19. A system for processing a semiconductor substrate, the system comprising:

an etcher apparatus for etching a semiconductor substrate, the operation of which is adjusted by a set of independent input parameters; and

a controller for controlling operation of the etcher apparatus, the controller comprising a processor and a memory;

wherein:

the memory stores an etched feature profile model optimized by operations (a) - (d) of claim 1; and

the processor is configured to calculate an etch feature profile from the set of values for the set of independent input parameters using the etch feature profile model stored in the memory.

20. The system of claim 19, wherein the controller adjusts operation of the etcher apparatus by changing one or more values of the set of independent input parameters in response to the calculated etch feature profile.

21. The system of claim 20, wherein the set of independent input parameters whose values vary in response to the calculated etch feature profile comprises one or more parameters selected from RF plasma frequency and RF plasma power level.

22. The system of claim 19, wherein the etcher apparatus comprises:

a processing chamber;

a substrate holder for holding a substrate within the process chamber;

a plasma generator for generating a plasma within the processing chamber, the plasma generator comprising an RF power source;

one or more valve-controlled process gas inlets for flowing one or more process gases into the processing chamber; and

one or more gas outlets fluidly connected to one or more vacuum pumps to exhaust gases from the process chamber.

23. The system of claim 22, wherein the controller is configured to adjust a frequency and/or power level of the RF power source to modify a characteristic of the plasma in the processing chamber in response to the calculated etch feature profile.

24. The system of claim 22, wherein the controller is configured to operate the one or more valve-controlled process gas inlets to adjust a flow rate of one or more process gases into the processing chamber in response to the calculated etch feature profile.

25. The system of claim 22, wherein the controller is configured to adjust a temperature and/or pressure within the processing chamber in response to the calculated etch feature profile.

26. The method of claim 1, wherein the computer model enables development of etch process parameters, wherein the computer model is optimized with the modified values for the one or more model parameters generated in (d).

27. The method of claim 1, wherein the computer model enables development of a lithography mask for an etching process, wherein the computer model is optimized using the modified values for the one or more model parameters generated in (d).

28. The optimized computer model of claim 15, wherein the computer-executable instructions further comprise instructions for developing a lithography mask for an etch process.

29. The optimized computer model of claim 15, wherein the computer-executable instructions further comprise instructions for developing etch process parameters.

Technical Field

The present invention relates generally to the field of semiconductor processing, and more particularly to a method and apparatus for optimizing etch profiles via reflectance spectrum matching and surface dynamics model optimization.

Background

The performance of plasma-assisted etch processes is often critical to the success of semiconductor processing workflows. However, optimizing an etch process can be difficult and time consuming, typically involving a process engineer manually adjusting etch process parameters in an ad hoc manner in an attempt to produce the desired target feature profile. There is currently no automated procedure of sufficient accuracy on which a process engineer can rely to determine the values of the process parameters that will result in a given desired etch profile.

Some models attempt to simulate the physicochemical processes that occur on the surface of a semiconductor substrate during an etching process. Examples include the etch profile models of M. Kushner and coworkers and the etch profile models of Cooperberg and coworkers. The former are described in Y. Zhang, "Low Temperature Plasma Etching Control through Ion Energy Distribution and 3-Dimensional Profile Simulation," Chapter 3, Dissertation, University of Michigan (2015); the latter are described in Cooperberg, Vahedi, and Gottscho, "Semiempirical Profile Simulation of Aluminum Etching in a Cl₂/BCl₃ Plasma," J. Vac. Sci. Technol. A 20(5), 1536 (2002), each of which is incorporated herein by reference in its entirety for all purposes. Additional descriptions of the etch profile models of M. Kushner and coworkers may be found in J. Vac. Sci. Technol. A 15(4), 1913 (1997); J. Vac. Sci. Technol. B 16(4), 2102 (1998); J. Vac. Sci. Technol. A 16(6), 3274 (1998); J. Vac. Sci. Technol. A 19(2), 524 (2001); J. Vac. Sci. Technol. A 22(4), 1242 (2004); and J. Appl. Phys. 97, 023307 (2005), each of which is also incorporated herein by reference in its entirety for all purposes. Despite the considerable effort invested in developing these models, they do not yet offer the accuracy and reliability needed to be of substantial use in the semiconductor processing industry.

Disclosure of Invention

A method of optimizing a computer model is disclosed, the model relating an etch profile of a feature on a semiconductor substrate to a set of independent input parameters (A) by using a plurality of model parameters (B). The method may comprise determining a set of values for the selected set of model parameters (B) to be optimized, and determining multiple sets of values for the selected set of independent input parameters (A) to optimize over. Then, for each set of values of A, the method may further comprise: receiving an experimental reflectance spectrum generated from optical measurements of an experimental etch process performed using that set of values of A; and generating, from the model, a calculated reflectance spectrum using the sets of values of A and B. In some such embodiments, the method may further comprise modifying one or more values of B and repeating the generation of the calculated reflectance spectrum from the model using the modified set of values of B, so as to reduce, with respect to one or more sets of values of A, a metric indicative of the difference between the experimental reflectance spectrum and the corresponding calculated reflectance spectrum.
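The loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: `toy_model` is a hypothetical stand-in for an etch profile model combined with an optical simulation (its depth law is invented, not physical), the "experimental" spectra are synthetic, and the pattern search is one simple way of modifying B that was chosen for brevity; the disclosure does not prescribe it.

```python
import math

WAVELENGTHS_NM = [400.0 + 50.0 * j for j in range(8)]  # illustrative wavelength grid


def toy_model(a, b):
    """Hypothetical stand-in for the etch model plus optical simulation.

    a = (power, pressure): independent input parameters A (arbitrary units).
    b = (b0, b1): model parameters B being optimized.
    """
    depth_nm = b[0] * a[0] + b[1] * a[1]  # made-up depth law, not physical
    return [0.5 + 0.5 * math.cos(4.0 * math.pi * depth_nm / lam)
            for lam in WAVELENGTHS_NM]


def metric(b, a_sets, experimental):
    """Sum of squared spectral differences over all sets of A values."""
    total = 0.0
    for a, measured in zip(a_sets, experimental):
        calculated = toy_model(a, b)
        total += sum((c - m) ** 2 for c, m in zip(calculated, measured))
    return total


def optimize_b(b_start, a_sets, experimental,
               step=1.0, min_step=1e-6, max_sweeps=2000):
    """Pattern search: perturb one B value at a time, keep any improvement."""
    b = list(b_start)
    best = metric(b, a_sets, experimental)
    for _ in range(max_sweeps):
        if step < min_step:
            break
        improved = False
        for i in range(len(b)):
            for delta in (step, -step):
                trial = list(b)
                trial[i] += delta
                value = metric(trial, a_sets, experimental)
                if value < best:
                    b, best, improved = trial, value, True
        if not improved:
            step *= 0.5  # refine the search once no neighbor is better
    return b, best


a_sets = [(1.0, 1.0), (2.0, 0.5), (1.5, 2.0)]   # several sets of A values
b_reference = (40.0, 10.0)                      # used only to synthesize data
experimental = [toy_model(a, b_reference) for a in a_sets]

b_start = (35.0, 8.0)
initial = metric(b_start, a_sets, experimental)
b_fit, final = optimize_b(b_start, a_sets, experimental)
print("metric before:", initial, "after:", final, "B:", b_fit)
```

Because every accepted step strictly lowers the metric, the search can only stop at a point where no single-parameter perturbation of the current step size improves the fit, which corresponds loosely to the "substantially local minimum" language used elsewhere in the text.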

In some embodiments, calculating the metric may include the following operations: calculating a difference between the calculated reflectance spectrum and the corresponding experimental reflectance spectrum and projecting the difference onto a reduced-dimension subspace; and/or projecting the calculated reflectance spectrum and a corresponding experimental reflectance spectrum onto a reduced-dimension subspace, and calculating a difference between the reflectance spectra projected onto the subspace.
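To make the two orderings concrete, the following sketch projects spectra onto a one-dimensional subspace. It assumes, purely for illustration, that the subspace is the leading principal direction of the experimental spectra, estimated by power iteration; the spectra are made-up numbers and all function names are hypothetical. For a single fixed linear subspace the two orderings give the same result, so they differ in practice only when the subspace itself is determined differently (for example, from the differences rather than from the spectra).

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def leading_component(rows, iterations=200):
    """Mean and leading principal direction of `rows` by power iteration.

    Converges to the first principal direction unless the starting vector
    happens to be orthogonal to it.
    """
    n = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(n)]
    centered = [[r[j] - mean[j] for j in range(n)] for r in rows]
    v = [1.0 / math.sqrt(n)] * n  # arbitrary unit-length starting vector
    for _ in range(iterations):
        scores = [dot(c, v) for c in centered]
        # w = X^T (X v): apply the (unnormalized) covariance without forming it
        w = [sum(s * c[j] for s, c in zip(scores, centered)) for j in range(n)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break  # degenerate data: keep the current direction
        v = [x / norm for x in w]
    return mean, v


def project(spectrum, mean, v):
    """Coordinate of a spectrum along the 1-D subspace spanned by v."""
    return dot([s - m for s, m in zip(spectrum, mean)], v)


# Made-up reflectance values at four wavelengths for three etch experiments.
experimental = [[0.30, 0.42, 0.55, 0.47],
                [0.28, 0.45, 0.60, 0.44],
                [0.33, 0.40, 0.50, 0.49]]
calculated = [[0.31, 0.40, 0.57, 0.46],
              [0.27, 0.47, 0.58, 0.45],
              [0.35, 0.41, 0.49, 0.47]]

mean, v = leading_component(experimental)

# Ordering (2): project both spectra, then take the difference.
diff_of_projections = [project(c, mean, v) - project(e, mean, v)
                       for c, e in zip(calculated, experimental)]

# Ordering (1): take the difference, then project it (the mean cancels).
projection_of_diffs = [dot([ci - ei for ci, ei in zip(c, e)], v)
                       for c, e in zip(calculated, experimental)]

subspace_metric = sum(d * d for d in diff_of_projections)
print("metric in the reduced subspace:", subspace_metric)
```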

In some embodiments, at least some of the calculated reflectance spectra are produced by a process comprising: generating a calculated etch profile represented by a series of etch profile coordinates using the model; and generating, from the calculated etch profile, a calculated reflection spectrum by simulating reflection of electromagnetic radiation from the calculated etch profile.
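As a rough illustration of the second step, the sketch below turns a list of profile coordinates into a reflectance spectrum using a two-beam scalar interference approximation: light reflected from the unetched top surface interferes with light reflected from the trench bottom. This approximation, the function names, and the values of `open_fraction`, `r_top`, and `r_bottom` are all assumptions made for the example; a practical implementation would be expected to use a rigorous electromagnetic solver instead.

```python
import cmath
import math


def trench_depth(profile):
    """profile: list of (x, z) coordinates in nm; depth is the z extent."""
    zs = [z for _, z in profile]
    return max(zs) - min(zs)


def reflectance_spectrum(profile, wavelengths_nm,
                         open_fraction=0.5, r_top=0.6, r_bottom=0.6):
    """Two-beam scalar estimate of reflectance versus wavelength.

    open_fraction: assumed fraction of the illuminated area that is trench.
    r_top, r_bottom: assumed amplitude reflection coefficients.
    """
    depth = trench_depth(profile)
    spectrum = []
    for lam in wavelengths_nm:
        # Round-trip phase delay of the beam that travels to the trench bottom.
        phase = 4.0 * math.pi * depth / lam
        amplitude = ((1.0 - open_fraction) * r_top
                     + open_fraction * r_bottom * cmath.exp(1j * phase))
        spectrum.append(abs(amplitude) ** 2)
    return spectrum


wavelengths = [400.0, 500.0, 600.0]
flat = [(-50.0, 0.0), (0.0, 0.0), (50.0, 0.0)]
trench = [(-50.0, 0.0), (-25.0, -100.0), (25.0, -100.0), (50.0, 0.0)]

flat_spectrum = reflectance_spectrum(flat, wavelengths)
trench_spectrum = reflectance_spectrum(trench, wavelengths)
print(flat_spectrum, trench_spectrum)
```

With these assumed coefficients, a 100 nm deep trench is a quarter-wave step at 400 nm, so the two beams cancel there; this sensitivity of the spectrum to depth is what makes reflectance a usable proxy for the profile.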

In some embodiments, the experimental reflectance spectra include reflectance spectra corresponding to a sequence of etch times representing different durations of the etch process, and the calculated reflectance spectra include reflectance spectra calculated from the model so as to correspond to the same sequence of etch times. In some such embodiments, the experimental reflectance spectra are generated from optical measurements taken during an ongoing etch process at the etch times of the sequence. In some cases, successive etch times in at least a portion of the sequence are separated by 0.01-1 second.

Also disclosed herein is a computer model for generating a calculated etch profile, the computer model having been optimized by the foregoing method. Also disclosed herein are methods of approximating the profile of a feature on a semiconductor substrate after the feature has been etched by an etching process. These methods may include: specifying a set of values for a set of independent input parameters corresponding to the etch process; and generating an etch profile using the optimized computer model using the set of values specified for the independent input parameters. Also disclosed is a method of determining a set of values for a set of independent input parameters for an etching process that will approximately produce a desired etch profile for a feature on a semiconductor substrate after the feature is etched by the etching process.
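The last of these methods, finding input parameter values that approximately produce a desired profile, can be sketched as a one-dimensional search. Everything specific here is an assumption for illustration: `optimized_model` is a hypothetical stand-in that maps a single input (RF power) to profile depths at five lateral positions, the desired profile is synthetic, and golden-section search is simply one convenient minimizer for a single-parameter, single-minimum problem.

```python
import math

# Hypothetical sensitivity of depth (nm per watt) at five lateral positions.
LATERAL_GAIN = [0.10, 0.25, 0.30, 0.25, 0.10]


def optimized_model(rf_power_w):
    """Hypothetical stand-in for the optimized model: profile depths in nm."""
    return [-g * rf_power_w for g in LATERAL_GAIN]


def profile_difference(desired, calculated):
    """Sum of squared differences between corresponding profile coordinates."""
    return sum((d - c) ** 2 for d, c in zip(desired, calculated))


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a single-minimum function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


desired_profile = optimized_model(300.0)  # synthetic target profile

best_power = golden_section_minimize(
    lambda p: profile_difference(desired_profile, optimized_model(p)),
    100.0, 500.0)
print("RF power that best reproduces the target profile:", best_power)
```

With several independent input parameters, the same idea carries over to a multi-dimensional minimizer; the scalar search is used here only to keep the example short.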

Also disclosed herein is a system for processing a semiconductor substrate. These systems may include: an etcher apparatus for etching a semiconductor substrate, the operation of which is adjusted by a set of independent input parameters; and a controller for controlling operation of the etcher apparatus. The controller typically includes a processor and a memory. The memory may store an etch feature profile model optimized by any of the aforementioned model optimization methods. The processor may be configured to calculate an etch feature profile from the set of values for the set of independent input parameters using the optimized etch feature profile model stored in the memory.

In particular, some aspects of the invention may be set forth as follows:

1. A method of optimizing a computer model relating etch profiles of features on a semiconductor substrate to sets of independent input parameters by using a plurality of model parameters, the method comprising:

(a) determining a set of values for the selected set of model parameters to be optimized;

(b) determining a plurality of sets of values for the selected set of independent input parameters for optimization;

(c) for each set of values specified in (b), receiving an experimental reflectance spectrum resulting from optical measurements of an experimental etch process performed using the set of values specified in (b);

(d) for each set of values specified in (b), generating, by the model, a calculated reflectance spectrum using the set of values specified in (a) and (b); and

(e) modifying one or more values specified in (a) for the selected set of the model parameters, and repeating (d) with the modified set of values to reduce a metric indicative of a difference between the reflectance spectrum received in (c) and a corresponding calculated reflectance spectrum generated in (d) with respect to one or more sets of values for the selected set of independent input parameters specified in (b);

wherein calculating the metric in (e) comprises:

(1) calculating a difference between the calculated reflectance spectrum and a corresponding experimental reflectance spectrum and projecting the difference onto a reduced-dimension subspace; and/or

(2) projecting the calculated reflectance spectrum and a corresponding experimental reflectance spectrum onto a reduced-dimension subspace, and calculating a difference between the reflectance spectra projected onto the subspace.

2. The method of clause 1, wherein at least some of the calculated reflectance spectra are produced by a process comprising:

(i) generating a calculated etch profile represented by a series of etch profile coordinates using the model; and

(ii) producing, from the calculated etch profile generated in (i), a calculated reflectance spectrum by simulating reflection of electromagnetic radiation from the calculated etch profile.

3. The method of clause 1, wherein:

the experimental reflectance spectra generated in (c) comprise reflectance spectra corresponding to a sequence of etch times representing different durations of an etch process; and

the calculated reflectance spectra produced in (d) include reflectance spectra calculated from the model so as to correspond to the sequence of etch times in (c).

4. The method of clause 3, wherein the experimental reflectance spectra are generated in (c) from optical measurements made during an ongoing etch process at the etch times of the sequence.

5. The method of clause 4, wherein successive etch times in at least a portion of the sequence of etch times are separated by 0.01-1 second.

6. The method of clause 4, wherein successive etch times in at least a portion of the sequence of etch times are separated by 0.05-0.5 seconds.

7. The method of clause 4, wherein at least some of the experimental reflectance spectra generated in (c) have been adjusted by reference to, and comparison with, optical measurements made after the end of substrate etch processes of various durations.

8. The method of clause 7, wherein the optical measurements corresponding to the end of etch processes of various durations are taken after the corresponding etched substrates have been removed from the processing chamber in which the substrates were etched.

9. The method of clause 1, wherein determining the reduced-dimension subspace in (1) comprises Principal Component Analysis (PCA) of the difference between the experimental reflectance spectrum and the calculated reflectance spectrum.

10. The method of clause 1, wherein determining the reduced-dimension subspace in (2) comprises a PCA of the experimental reflectance spectrum, a PCA of the calculated reflectance spectrum, or a PCA of a combination of both experimental and calculated reflectance spectra.

11. The method of clause 1, wherein the reduced-dimension subspace corresponds to a selection of particular spectral wavelengths at particular etch times, and calculating the metric comprises calculating a weighted sum of quantities monotonically related to the magnitude of the difference between the corresponding experimental and calculated reflectance spectra as projected onto the subspace.

12. The method of clause 11, wherein the weights used to calculate the weighted sum have equal values, the quantity monotonically related to the difference is the square of the difference, and the metric is monotonically related to the mean square difference between the corresponding experimental and calculated reflectance spectra projected onto the subspace.

13. The method of clause 11, wherein some of the weights corresponding to a particular wavelength at a particular etch time are greater than some of the weights corresponding to the same wavelength at other etch times.

14. The method of clause 13, wherein some of the weights corresponding to a particular wavelength at a particular etch time are greater than some of the weights corresponding to other wavelengths at the same etch time.

15. The method of clause 13, wherein the weight corresponding to a particular wavelength at a particular etch time is determined via a procedure comprising:

a partial least squares (PLS) analysis;

wherein the PLS analysis correlates geometric coordinates of an etch profile of a feature on a semiconductor substrate at the end of an etch process with a reflectance value corresponding to a particular wavelength at a particular etch time earlier in the etch process.

16. The method of clause 15, wherein the etch profile and the reflectance value are determined experimentally.

17. The method of clause 15, wherein the etch profile and reflectance values are determined from a model optimized according to clause 1.

18. The method of clause 1, further comprising repeating (e).

19. The method of clause 18, further comprising further repeating (e) until a substantially local minimum of error is obtained for the model parameters selected in (a).

20. The method of clause 1, wherein the computer model calculates a local etch rate at grid points of the etch profile of the feature on the semiconductor substrate as a function of time.

21. The method of clause 20, wherein the model parameters include a reaction rate constant, reactant and product sticking coefficients, and reactant and product diffusion constants.

22. The method of clause 1, wherein determining the plurality of sets of values for the selected set of independent input parameters in (b) comprises PCA.

23. The method of clause 22, wherein the PCA is performed on concatenated vectors of the independent input parameters and the corresponding measured etch profiles.

24. An optimized computer model that generates a calculated etch profile of a feature on the semiconductor substrate from a set of values of a set of independent input parameters, the computer model having been optimized by the method of clause 1 above.

25. A method of approximating a profile of a feature on a semiconductor substrate after the feature has been etched by an etching process, the method comprising:

specifying a set of values for a set of independent input parameters corresponding to the etch process; and

generating an etch profile using the optimized computer model of clause 24 using the set of values specified for the independent input parameters.

26. A method of determining a set of values for a set of independent input parameters for an etching process that will approximately produce a desired etch profile for a feature on a semiconductor substrate after the feature is etched by the etching process, the method comprising:

(a) assigning a set of values to a set of independent input parameters corresponding to an etch process;

(b) generating a calculated etch profile using the optimized computer model according to clause 24 using the set of values specified for the independent input parameters;

27. The method of clause 26, further comprising:

(e) repeating (d) until a substantially local minimum of error is obtained with respect to the model parameters selected in (d).

28. A system for processing a semiconductor substrate, the system comprising:

an etcher apparatus for etching a semiconductor substrate, the operation of which is adjusted by a set of independent input parameters; and

a controller for controlling operation of the etcher apparatus, the controller comprising a processor and a memory;

wherein:

the memory stores an etched feature profile model optimized by the method of clause 1; and

the processor is configured to calculate an etch feature profile from the set of values for the set of independent input parameters using the etch feature profile model stored in the memory.

29. The system of clause 28, wherein the controller adjusts the operation of the etcher apparatus by changing one or more values of the set of independent input parameters in response to the calculated etch feature profile.

30. The system of clause 29, wherein the set of independent input parameters whose values vary in response to the calculated etch feature profile comprises one or more parameters selected from RF plasma frequency and RF plasma power level.

31. The system of clause 28, wherein the etcher apparatus comprises:

a processing chamber;

a substrate holder for holding a substrate within the process chamber;

a plasma generator for generating a plasma within the processing chamber, the plasma generator comprising an RF power source;

one or more valve-controlled process gas inlets for flowing one or more process gases into the processing chamber; and

one or more gas outlets fluidly connected to one or more vacuum pumps to exhaust gases from the process chamber.

32. The system of clause 31, wherein the controller adjusts a frequency and/or power level of the RF power source to modify a characteristic of the plasma in the processing chamber in response to the calculated etch feature profile.

33. The system of clause 31, wherein the controller operates the one or more valve-controlled process gas inlets to adjust a flow rate of one or more process gases into the processing chamber in response to the calculated etch feature profile.

34. The system of clause 31, wherein the controller adjusts the temperature and/or pressure within the processing chamber in response to the calculated etch feature profile.
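By way of illustration of clauses 11 to 14, the weighted-sum form of the metric can be sketched as follows. The layout of the spectra (indexed first by etch time, then by wavelength), the function name, and the numerical values are assumptions made for the example; how the weights are chosen (for instance via the PLS analysis of clause 15) is outside the scope of this sketch.

```python
def weighted_metric(experimental, calculated, selection, weights=None):
    """Weighted sum of squared reflectance differences.

    experimental, calculated: spectra[k][j], with k the etch time index and
        j the wavelength index (an assumed layout).
    selection: list of (k, j) pairs, i.e. the chosen wavelengths at the
        chosen etch times that define the reduced subspace.
    weights: one weight per selected pair; equal weights if omitted, in
        which case the result is the mean square difference.
    """
    if weights is None:
        weights = [1.0 / len(selection)] * len(selection)
    return sum(w * (experimental[k][j] - calculated[k][j]) ** 2
               for w, (k, j) in zip(weights, selection))


# Made-up spectra: 2 etch times x 3 wavelengths.
experimental = [[0.5, 0.6, 0.7],
                [0.4, 0.5, 0.6]]
calculated = [[0.5, 0.5, 0.7],
              [0.4, 0.5, 0.8]]

selection = [(0, 1), (1, 2)]  # wavelength 1 at time 0, wavelength 2 at time 1

equal_weight_value = weighted_metric(experimental, calculated, selection)
# Emphasize the later etch time more heavily than the earlier one.
unequal_weight_value = weighted_metric(experimental, calculated, selection,
                                       weights=[1.0, 3.0])
print(equal_weight_value, unequal_weight_value)
```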

These and other features of the present disclosure will be presented below with reference to the associated drawings.

Drawings

FIG. 1 shows an example of an etch profile generated by calculation from a surface dynamics model of an etch process.

FIG. 2 shows an example of an etch profile similar to that shown in FIG. 1, but in which the etch profile is calculated from experimental measurements taken with one or more optical metrology tools.

FIG. 3 is a process flow diagram representing a procedure for optimizing an etch profile model with respect to an etch profile coordinate space.

FIG. 4A is a process flow diagram representing a procedure for optimizing etch profile models, particularly certain model parameters used in these models.

FIG. 4B is a process flow diagram representing a procedure for optimizing etch profile models, particularly certain model parameters used in these models.

FIG. 4B-1 is a cross-sectional view of an idealized feature having parallel lines and showing relevant dimensions including critical dimension and pitch.

FIG. 5 depicts a set of exemplary canonical etch profiles that may be identified using a model optimized according to the present disclosure.

FIG. 6 is a process flow diagram representing a routine for optimizing an etch profile model with respect to a reflectance spectrum space.

FIG. 7A is a graphical representation of the reflectance spectrum history as the etch profile evolves through the etch process.

FIG. 7B schematically presents a set of spectral reflectance data collected over many wafers in the form of a 3-D data block, the three indices of which correspond to wafer number (i), spectral wavelength (j), and etch process time (k). The figure also shows the 3-D data block unfolded into a 2-D data block that can serve as the independent data for a PLS analysis of the spectral history, the dependent data being the etch profile coordinates also indicated in the figure.

FIG. 8 is a process flow diagram showing an iterative procedure for optimizing a PLS model that correlates etch spectral reflectance history with etch profiles throughout the etch process, while optimizing the EPM used to generate the calculated reflectance spectra to be used in the optimization of the PLS model.

FIGS. 9A-9C illustrate embodiments of adjustable-gap capacitively coupled plasma (CCP) reactors.

FIG. 10 shows an embodiment of an inductively coupled plasma (ICP) reactor.
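The unfolding mentioned in the description of FIG. 7B can be illustrated with a short sketch. It assumes a nested-list data block indexed as `block[i][j][k]` (wafer, wavelength, etch time) and one particular column ordering for the 2-D result; the ordering and the function name are choices made for the example. The resulting rows would then play the role of the independent data in a PLS regression against etch profile coordinates, which is not implemented here.

```python
def unfold(block):
    """Unfold block[i][j][k] into rows[i][j * n_times + k].

    Each wafer i becomes one row whose columns run over every
    (wavelength j, etch time k) pair, with k varying fastest.
    """
    return [[value for per_wavelength in wafer for value in per_wavelength]
            for wafer in block]


n_wafers, n_wavelengths, n_times = 2, 3, 4

# Synthetic block whose entries encode their own indices as 100*i + 10*j + k.
block = [[[100 * i + 10 * j + k for k in range(n_times)]
          for j in range(n_wavelengths)]
         for i in range(n_wafers)]

rows = unfold(block)
print(len(rows), "rows x", len(rows[0]), "columns")
```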

Detailed Description

Introduction

Procedures are disclosed herein for improving the practical utility of the above-described Etch Profile Models (EPMs) (and other similar models) so that they can generate representations of the etch profiles of semiconductor features that are accurate enough to be relied upon in the semiconductor processing industry. In general, the disclosed procedures improve the predictive capabilities of these models.

In general, EPMs and similar models attempt to simulate the evolution of the etch profile of a substrate feature over time (i.e., the time-dependent change in the shape of the feature at various spatial locations on its surface) by calculating, throughout the simulated etch process, the reaction rates associated with the etch process at each of these spatial locations. These rates arise from the incident fluxes of etchant and deposition species characteristic of the plasma conditions established in the reaction chamber. The output is a simulated etch profile represented by a discrete set of data points (i.e., profile coordinates) that spatially map the shape of the profile. An example of such a simulated etch profile is shown in FIG. 1; the simulated profile may correspond to an actual measured etch profile such as that shown in FIG. 2. The evolution of the simulated etch profile over time depends on the theoretically modeled, spatially resolved local etch reaction rates, which in turn depend on the underlying chemistry and physics of the etch process. The etch profile simulation therefore depends on two kinds of parameters: the various physical and chemical parameters associated with the chemical reaction mechanisms underlying the etch process, and the physical and chemical parameters that characterize the chamber environment (temperature, pressure, plasma power, reactant flow rates, etc.), which are typically under the control of a process engineer.
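The idea of evolving a profile from spatially resolved local etch rates can be sketched with a simple explicit time-stepping loop. This is a toy under stated assumptions: `local_etch_rate` is a hypothetical, time-independent Gaussian rate that does not come from any flux or surface chemistry calculation, and the surface is only moved vertically, whereas an actual model would derive the rates from the mechanisms discussed above and would typically advance the surface along its local normal.

```python
import math


def local_etch_rate(x_nm, t_s):
    """Hypothetical local etch rate in nm/s, fastest at the center x = 0.

    The time argument is accepted but unused in this toy rate law.
    """
    return 2.0 * math.exp(-(x_nm / 20.0) ** 2)


def evolve_profile(xs_nm, duration_s, dt_s=0.1):
    """Advance profile coordinates (x, z) from a flat surface at z = 0."""
    zs = [0.0] * len(xs_nm)
    steps = int(round(duration_s / dt_s))
    for n in range(steps):
        t = n * dt_s
        # Explicit Euler step: each point recedes at its local etch rate.
        zs = [z - local_etch_rate(x, t) * dt_s for x, z in zip(xs_nm, zs)]
    return list(zip(xs_nm, zs))


xs = [-40.0, -20.0, 0.0, 20.0, 40.0]
profile = evolve_profile(xs, duration_s=10.0)
for x, z in profile:
    print(x, z)
```

The returned list of (x, z) pairs is the kind of discrete profile coordinate set the text describes as the model output.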

With respect to the former, the etch profile model requires a set of "basic" chemical and physical input parameters, examples being reaction probabilities, sticking coefficients, ion and neutral fluxes, etc., which are typically not independently controllable by, and/or even directly known to, the process engineer, but which must nevertheless be specified as inputs to the simulation. Thus, these sets of "basic" or "mechanistic" input parameters are assumed to have certain values, often taken from the literature, and their use implies some simplification (and approximation) of the underlying physical and chemical mechanisms behind the etch process being modeled.

The present disclosure presents procedures that combine experimental techniques and data analysis methods to improve the industrial applicability of EPMs for these substrate etching processes. Note that the phrase "substrate etching process" includes a process that etches a mask layer or, more generally, any material layer that has been deposited on and/or resides on the surface of the substrate. These techniques focus on the "basic" chemical and physical input parameters employed by these models, and improve the models by using a procedure to determine what can be considered a more effective set of values for these parameters (effective in the sense that they improve the accuracy of the etch model), even though the optimal values determined for these "basic" parameters may differ from what the literature (or other experiments) might determine as the "true" physical/chemical values of these parameters.

Figs. 3 and 4, discussed more fully below, provide flow charts illustrating exemplary processes for generating an improved etch profile model. In fig. 3, for example, the process flow depicted has two input branches, one from experimental measurements and the other from the current version of the model, which has not yet been optimized. Both the experimental branch and the predictive model branch produce etch profile results. The results are compared, and the comparison is used to refine the model such that the deviation between the results is reduced.

Detailed characterization of the 2- or 3-dimensional etch profile data output by an EPM presents particular challenges for optimizing the model. In various embodiments disclosed herein, the profile data is represented as a series of height slices, each height slice having a thickness. In other embodiments, the profile is represented as a series of vectors from a common origin or as a series of geometric shapes, such as trapezoids. When many of these height slices or other profile components are used, the optimization problem of minimizing the error between the experimental profile and the EPM profile may be computationally demanding. To reduce the required computation, dimension reduction techniques (e.g., principal component analysis) are used to identify the relevant contributions of the various profile components to the overall physical profile used in the optimization. Representing the etch profile in a reduced-dimensional space with a few principal components or other vectors can greatly simplify the process of improving the predictive power of the etch profile model. Furthermore, such principal components are orthogonal to each other, thereby ensuring that independent profile contributions can be individually optimized.
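The dimension reduction described above can be illustrated with a minimal sketch, assuming each profile is stored as a vector of feature widths at a fixed set of height slices. The function names `fit_profile_pca` and `project`, the synthetic width data, and the choice of two components are all illustrative assumptions and are not taken from the disclosure.

```python
import numpy as np

def fit_profile_pca(profiles, n_components):
    """Fit PCA to profiles given as rows of widths at fixed height slices."""
    profiles = np.asarray(profiles, dtype=float)
    mean = profiles.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(profiles - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(profile, mean, components):
    """Coordinates of one profile in the reduced-dimensional space."""
    return components @ (np.asarray(profile, dtype=float) - mean)

# Synthetic (hypothetical) data: 6 profiles, widths in nm at 5 height slices.
rng = np.random.default_rng(0)
heights = np.linspace(0.0, 1.0, 5)
profiles = np.array([50.0 + a * heights + b * heights**2
                     for a, b in rng.normal(0.0, 3.0, size=(6, 2))])

mean, components = fit_profile_pca(profiles, n_components=2)
scores = np.array([project(p, mean, components) for p in profiles])
# Map the reduced coordinates back to widths at the 5 height slices.
reconstructed = mean + scores @ components
```

In this synthetic case the profiles vary along only two independent shapes, so two principal components reproduce all five widths; with measured profiles the number of components retained would be a judgment call.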

The following terms are used in this specification.

Independent variable: as is generally understood, an independent variable is any variable that causes a response. The etch profile model may include various types of independent variables, such as reactor process conditions (e.g., temperature, pressure, gas composition, flow rate, plasma power, etc.), local plasma conditions, and local reaction conditions.

Result variable: as is commonly understood, a result variable is a variable that is caused by an independent variable. Typically, the result variable is output by the model. In some contexts, the term result variable is synonymous with the term dependent variable. In the present disclosure, the etch profile is one type of result variable.

Input variables: input variables are similar to independent variables, but may be more specific because some independent variables may be fixed for many runs and therefore are not technically "input" variables for such runs. The input variables serve as inputs to the operation under consideration.

Mechanistic parameter: a mechanistic parameter is an independent variable that represents a physical and/or chemical condition at one or more specific locations in a reactor or on a substrate undergoing etching.

Plasma parameter: a plasma parameter is a mechanistic parameter that describes the local plasma conditions (e.g., plasma density and plasma temperature at a particular location on a substrate).

Reaction parameter: a reaction parameter is a mechanistic parameter that describes the local chemical or physicochemical conditions.

Process parameters: process parameters are reactor parameters (e.g., chamber pressure and susceptor temperature) controlled by the process engineer. The process parameters, along with the substrate characteristics, may control the values of mechanistic parameters in the etch reactor.

Model parameters: model parameters are one type of independent variable that is optimized. A model parameter is typically a mechanistic parameter, such as a chemical reaction parameter. The initial values of the model parameters are not optimized.

Etching profile

Before delving into the details of the etch profile model and the procedures for improving it, it is useful to describe the concept of the etch profile of a feature. In general, an etch profile (EP) refers to any set of values for a set of one or more geometric coordinates that may be used to characterize the shape of an etched feature on a semiconductor substrate. In a simple case, the etch profile may be approximated as the width of the feature determined at the halfway point up from the base of the feature (the midpoint between the base (or bottom) of the feature and its top opening at the surface of the substrate), as viewed in a 2-dimensional vertical cross-sectional slice through the feature. In a more complex example, the etch profile may be a series of feature widths determined at different heights above the base of the feature, as viewed in the same 2-dimensional vertical cross-sectional slice. Fig. 2 provides an illustration of this. Note that, depending on the implementation, the width may be the distance between one sidewall of a recessed feature and the other sidewall, i.e., the width of the region that has been etched away, or the width may refer to the width of a pillar that has been etched on either side. The latter is schematically shown in fig. 2. Note that in some cases this width is referred to as the "critical dimension" (labeled "CD" in fig. 2), and the height above the base of the feature may be referred to as the height or z-coordinate of the critical dimension in question (labeled as a percentage in fig. 2). As described above, the etch profile may be represented on other geometric bases, such as by a set of vectors from a common origin, by a stack of shapes such as trapezoids or triangles, or by a set of characteristic shape parameters (e.g., bowed, straight, or tapered sidewalls, rounded bottoms, facets, etc.) that define a typical etch profile.

In this way, a series of geometric coordinates (e.g., feature widths at different heights) maps out a discrete depiction of the feature profile. Note that there are many ways to express a series of coordinates representing feature widths at different heights. For example, each coordinate may have a value representing a fractional deviation from some baseline feature width (e.g., an average feature width or a vertically averaged feature width), or each coordinate may represent the change from the vertically adjacent coordinate, or the like. In any case, what is meant by "width", and the scheme used for the set of coordinates that represents the etch profile, will be clear from context and usage. The idea is to use a set of coordinates to represent the shape of the etched profile of a feature. It should also be noted that a series of geometric coordinates may be used to describe the complete 3-dimensional shape of the etched profile of a feature or other geometric characteristic, such as the shape of an etched cylinder or trench on the surface of a substrate. Thus, in some embodiments, the etch profile model may provide a complete 3-D etch shape of the modeled feature.
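As a small illustration of the alternative coordinate schemes just mentioned, the sketch below converts one list of widths into fractional deviations from the average width and into changes between vertically adjacent coordinates. The function names and the width values are hypothetical.

```python
def fractional_deviation(widths):
    """Express each width as a fractional deviation from the average width."""
    mean_width = sum(widths) / len(widths)
    return [(w - mean_width) / mean_width for w in widths]

def adjacent_differences(widths):
    """Express the profile as changes between vertically adjacent widths."""
    return [upper - lower for lower, upper in zip(widths, widths[1:])]

# Hypothetical widths (nm) listed from the base of the feature to its top.
widths_nm = [40.0, 44.0, 48.0, 44.0, 24.0]

deviations = fractional_deviation(widths_nm)
differences = adjacent_differences(widths_nm)
```

Both encodings carry the same shape information as the raw widths; the second one needs a single reference width to recover absolute values.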

Etch profile model

An etch profile model (EPM) calculates a theoretically determined etch profile from a set of input etch reaction parameters (independent variables) that characterize the underlying physical and chemical etch processes and reaction mechanisms. These processes are modeled as a function of time and position in a grid representing the feature being etched and its surroundings. Examples of input parameters include plasma parameters, such as ion flux, and chemical reaction parameters, such as the probability that a particular chemical reaction will occur. These parameters (and in particular, in some embodiments, the plasma parameters) may be obtained from various sources, including other models that calculate them from the general reactor configuration and process conditions such as pressure, substrate temperature, plasma source parameters (e.g., power supplied to the plasma source, frequency, duty cycle), reactants and their flow rates, etc. In some embodiments, such a model may be part of the EPM.

As explained, an EPM takes reaction parameters as independent variables and, through a functional relationship, produces an etch profile as a response variable. In other words, the set of independent variables are the physical/chemical process parameters used as inputs to the model, and the response variables are the etch profile characteristics calculated by the model. An EPM employs one or more relationships between the reaction parameters and the etch profile. The relationships may include, for example, coefficients, weights, and/or other model parameters (as well as linear functions, second and higher order polynomial functions, etc. of reaction parameters and/or other model parameters) that are applied to the independent variables in a defined manner to generate the response variables related to the etch profile. Such weights, coefficients, etc. may represent one or more of the reaction parameters described above. These model parameters are adjusted or tuned during the optimization techniques described herein. In some embodiments, some of the reaction parameters are model parameters to be optimized, while other reaction parameters are used as independent input variables. For example, the chemical reaction parameters may be optimizable model parameters, and the plasma parameters may be independent variables.

In general, a "response variable" represents an output and/or an effect, and/or is tested to see if it is an effect. An "independent variable" represents an input and/or a cause, and/or is tested to see if it is a cause. Thus, the response variable can be studied to see whether, and to what extent, it changes as the independent variable changes. Independent variables may also be referred to as "predictor variables", "regressors", "controlled variables", "manipulated variables", "explanatory variables", or "input variables".

As explained, some EPMs employ input variables (a type of independent variable) that can be characterized as fundamental reaction mechanistic parameters and that can be regarded as underlying the basic chemistry and physics of the process; experimental process engineers therefore typically do not control these quantities. In the etch profile model, these variables are applied at each location of the grid and at multiple times separated by defined time steps. In some embodiments, the grid resolution may vary between about a few angstroms and about the micron scale. In some embodiments, the time step may vary between about 1e-15 and 1e-10 seconds. In certain embodiments, the optimization uses two types of mechanistic independent variables: (1) local plasma parameters, and (2) local chemical reaction parameters. These parameters are "local" in the sense that they may vary with position (in some cases down to the resolution of the grid). Examples of plasma parameters include local plasma properties such as the fluxes and energies of particles (e.g., ions, radicals, photons, electrons, excited species, deposition species, and the like) and their energy and angular distributions. Examples of chemical and physicochemical reaction parameters include rate constants (e.g., the probability that a particular chemical reaction will occur at a particular time), sticking coefficients, energy thresholds for etching, base energies, exponents of energy that define sputter yields, angular yield functions and their parameters, among others. Further, the parameterized chemical reactions include reactions in which the reactants include the material being etched and the etchant. It should be understood that the chemical reaction parameters may cover various types of reactions in addition to those that directly etch the substrate. Examples of such reactions include side reactions, including parasitic reactions, deposition reactions, reactions of by-products, and the like. Any of these reactions may affect the overall etch rate. It should also be understood that the model may require other input parameters in addition to the plasma and chemical reaction input parameters described above. Examples of such other parameters include the temperature at the reaction sites, the partial pressures of reactants, etc. In some cases, these and/or other non-mechanistic parameters may be input into a module that outputs some of the mechanistic parameters.

In some embodiments, the initial (non-optimized) values of the EPM model parameters, as well as the independent variables (e.g., plasma parameters in some embodiments) that are held fixed during optimization, may be obtained from various sources, such as the literature, calculations by other computational modules or models, and the like. In some embodiments, independent input variables, such as plasma parameters, may be determined by using a model, such as an etch chamber plasma model in the case of plasma parameters. Such a model may calculate the applicable EPM input parameters from the various process parameters that the process engineer controls (e.g., by turning a knob), such as chamber environment parameters, e.g., pressure, flow rate, plasma power, wafer temperature, ICP coil current, bias voltage/power, pulse frequency, pulse duty cycle, etc.

When running the EPM, some independent variables are set to the known or expected parameter values used in performing the experiments. For example, the plasma parameters may be fixed at known or expected values at locations in the modeled domain. The independent variables described herein as model parameters are those selected to be adjusted by the optimization procedure described below. For example, the chemical reaction parameters may be the adjusted model parameters. Thus, in a series of runs corresponding to a given measured experimental etch profile, the model parameters are varied in order to determine which values of these parameters best optimize the model.

The EPM can take any of a number of different forms. Ultimately, each form provides a relationship between the independent variables and the response variables. The relationship may be linear or non-linear. Generally, an EPM is what is referred to in the art as a cell-based Monte Carlo surface reaction model. These models, in their various forms, simulate the topological evolution of wafer features over time in the context of semiconductor wafer fabrication. The model launches pseudo-particles with energy and angular distributions generated by a plasma model or by experimental diagnostics for arbitrary radial positions on the wafer. The pseudo-particles are statistically weighted to represent the fluxes of radicals and ions toward the surface. The model addresses the various surface reaction mechanisms that result in etching, sputtering, mixing, and deposition on the surface in order to predict profile evolution. During the Monte Carlo integration, the trajectories of the various ion and neutral pseudo-particles are tracked within the wafer feature until they react or leave the computational domain. An EPM may have advanced capabilities for predicting etching, stripping, atomic layer etching, ionized metal physical vapor deposition, and plasma enhanced chemical vapor deposition on various materials. In some embodiments, an EPM utilizes a two- or three-dimensional rectilinear grid with a resolution fine enough to adequately resolve the dimensions of wafer features (although in principle the grid, whether 2D or 3D, may also utilize non-rectilinear coordinates). The grid may be viewed as a two- or three-dimensional array of grid points. It may also be viewed as an array of cells, each representing the local 2D area or 3D volume associated with (centered on) a grid point. Each cell within the grid may represent a different solid material or mixture of materials.
The choice of a 2D or 3D grid as the basis for modeling may depend on the type of wafer feature being modeled. For example, a 2D grid may be used to model a long trench feature (e.g., in a polysilicon substrate), with the 2D grid depicting the cross-sectional shape of the trench, on the assumption that the geometry of the ends of the trench is not very relevant to the reaction processes occurring along most of the trench's length, away from its ends (i.e., for the purposes of this cross-sectional 2D model the trench is assumed to be infinite, which is a reasonable assumption for a trench feature away from its ends). On the other hand, it may be appropriate to model a circular via feature (e.g., a through-silicon via (TSV)) using a 3D grid (since the x, y horizontal dimensions of the feature are comparable to each other).

The grid spacing may range from sub-nanometer (e.g., 1 angstrom) up to several micrometers (e.g., 10 micrometers). Typically, each grid cell is assigned a material identity, for example photoresist, polysilicon, or plasma (e.g., in a spatial region not occupied by the feature), which may change as the profile evolves. Solid-phase species are represented by the identities of the computational cells; gas-phase species are represented by computational pseudo-particles. In this manner, the grid provides a reasonably detailed representation (e.g., for computational purposes) of the wafer feature and the surrounding gas environment (e.g., plasma) as the geometry/topology of the wafer feature evolves over time in the reactive etch process.
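To make the cell-and-pseudo-particle picture concrete, here is a deliberately oversimplified sketch: a 2D grid whose cells carry a material identity, and pseudo-particles that travel straight down through gas cells until they reach a solid cell, where they etch it with a fixed probability. Real EPMs track energy and angular distributions and many reaction types; the vertical-only trajectories, the material names as strings, and the probability values used here are assumptions made purely for illustration.

```python
import random

PLASMA, PHOTORESIST, POLYSILICON = "plasma", "photoresist", "polysilicon"

def make_masked_grid(n_rows, n_cols, mask_rows, opening):
    """Build a 2D grid, row 0 at the top; opening = (first_col, end_col)."""
    grid = []
    for row in range(n_rows):
        material = PHOTORESIST if row < mask_rows else POLYSILICON
        cells = [material] * n_cols
        if row < mask_rows:
            for col in range(opening[0], opening[1]):
                cells[col] = PLASMA          # gap in the mask, occupied by gas
        grid.append(cells)
    return grid

def launch_particles(grid, opening, n_particles, etch_probability, seed=0):
    """Track pseudo-particles straight down until they react or exit."""
    rng = random.Random(seed)
    removed = 0
    for _ in range(n_particles):
        col = rng.randrange(opening[0], opening[1])
        for row in range(len(grid)):
            material = grid[row][col]
            if material == PLASMA:
                continue                     # gas cell: the particle keeps moving
            if rng.random() < etch_probability[material]:
                grid[row][col] = PLASMA      # cell identity changes as the profile evolves
                removed += 1
            break                            # particle reacted or was lost at the surface
        # If the inner loop never hits a solid cell, the particle left the domain.
    return removed

grid = make_masked_grid(n_rows=40, n_cols=7, mask_rows=2, opening=(2, 5))
removed = launch_particles(grid, (2, 5), n_particles=30,
                           etch_probability={PHOTORESIST: 0.0, POLYSILICON: 1.0})
```

With an etch probability of 1.0 for polysilicon, every particle removes exactly one cell, so the trench deepens by one cell per particle in whichever column the particle lands.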

Etch experiments and profile measurements

In order to train and optimize the EPM presented in the previous section, various experiments may be performed to determine, as accurately as the experiments allow, the actual etch profiles produced by actual etch processes performed under the various process conditions specified by various sets of etch process parameters. Thus, for example, a first set of values is specified for a set of etch process parameters (e.g., etchant flow rate, plasma power, temperature, pressure, etc.), the etch chamber apparatus is set up accordingly, etchant is flowed into the chamber, the plasma is struck, etc., and a first semiconductor substrate is etched to produce a first etch profile. Then, a second set of values is specified for the same set of etch process parameters, a second substrate is etched to produce a second etch profile, and so on.

Various combinations of process parameters may be used to represent a broad or focused process space, as appropriate, to train the EPM. The same combinations of process parameters are then used to calculate the (independent) input parameters (e.g., mechanistic parameters) to the EPM, which provides etch profile outputs (response variables) that can be compared to the experimental results. Because experiments can be expensive and time consuming, techniques can be employed to design the experiments so as to reduce the number that need to be performed to provide a robust training set for optimizing the EPM. Techniques such as design of experiments (DOE) may be employed for this purpose. Generally, such techniques determine which sets of process parameters to use in the various experiments. They select combinations of process parameters by considering statistical interactions between process parameters, randomization, and the like. For example, DOE may identify a small number of experiments covering a limited range of parameters around the center point of a process that has already been finalized.
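One simple way to generate a small set of experiments around a process center point is a two-level full factorial design plus the center point itself. This is only one of many DOE designs, and the disclosure does not say which one is used; the parameter names, center values, and ranges below are hypothetical.

```python
from itertools import product

def two_level_design(center, half_range):
    """Center point plus every low/high combination of the parameters."""
    names = list(center)
    runs = [dict(center)]                    # run 0: the process center point
    for signs in product((-1, 1), repeat=len(names)):
        runs.append({name: center[name] + sign * half_range[name]
                     for name, sign in zip(names, signs)})
    return runs

# Hypothetical center point and excursion sizes for three process parameters.
center = {"pressure_mtorr": 20.0, "power_w": 600.0, "temperature_c": 60.0}
half_range = {"pressure_mtorr": 5.0, "power_w": 100.0, "temperature_c": 10.0}

runs = two_level_design(center, half_range)
```

Three parameters give 2^3 = 8 corner runs plus the center, nine experiments in total; fractional designs would cut this further at the cost of confounding some interactions.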

Typically, a researcher will perform all experiments early in the model optimization procedure, and use only these experiments in the optimization routine iterations until convergence. Alternatively, the experiment designer may perform some experiments for early optimization iterations and then perform additional experiments as the optimization progresses. The optimization program may inform the experiment designer of the specific parameters to evaluate and thus run the specific experiment for later iterations.

One or more in-situ or off-line metrology tools may be used to measure the experimental etch profiles generated by these experimental etch process operations. Measurements may be made at the end of the etch process, during the etch process, or at one or more times during the etch process. The measurement method may be destructive when the measurement is made at the end of the etch process, and is typically non-destructive (so as not to disrupt the etch) when measurements are made at intervals during the etch process. Examples of suitable metrology techniques include, but are not limited to, LSR, OCD, and cross-sectional SEM. Note that the metrology tool may measure the profile of the feature directly, as in the case of SEM, where the experiment essentially images the etch profile of the feature, or it may determine the etch profile of the feature indirectly, as in the case of OCD measurements, where some post-processing is performed to back out the etch profile of the feature from the data actually measured.

In any case, the result of the etch experiment and metrology procedure is a set of measured etch profiles, each profile typically comprising a series of values at a series of coordinates or a set of grid values representing the shape of the feature profile as described above. One example is shown in figure 2. The etch profile may then be used as an input to train, optimize and refine a computerized etch profile model as described below.

Model parameter adjustment/optimization

Each measured experimental etch profile provides a benchmark for adjusting the computerized etch profile model. Thus, a series of calculations is performed with the etch profile model and compared against the experimental etch profiles to see how far the model's predictions deviate from the measured profiles. With this information, the model can be improved.

Fig. 3 provides a flow chart illustrating a set of operations 300 for adjusting and/or optimizing an etch profile model, such as those described above. In some embodiments, such an adjusted and/or optimized model reduces (and in some cases substantially minimizes) a metric related to (indicative of, quantifying, etc.) the combined differences between the etch profiles measured as a result of performing the etch experiments and the corresponding calculated etch profiles generated by the model. In other words, the improved model may reduce the combined error over the different experimental process conditions (set by the different specified sets of values of the selected process parameters, which are used to calculate the independent input parameters to the EPM).

As shown in FIG. 3, the optimization procedure 300 begins at operation 310, where a set of model parameters to be optimized is selected. Again, these model parameters may be chosen to be parameters that characterize the underlying chemical and physical processes over which the process engineer has no control. Some or all of these model parameters will be adjusted based on experimental data to improve the model. In some embodiments, these model parameters may be reaction parameters and include reaction probabilities and/or (thermal) rate constants, reactant sticking coefficients, etch threshold energies for physical or chemical sputtering, exponents of the energy dependence, etch angular yield dependences, and parameters related to the angular yield curve, among others. Note that, in general, the optimization is performed with respect to a given/specified mixture of chemical species flowed into the etch chamber (although it should be understood that the chemical composition in the etch chamber will vary as the etch process progresses). In some embodiments, the reaction parameters are fed into the EPM in an input file separate from the other input parameters (e.g., plasma parameters).

In some embodiments, the model parameters may include a specification of which particular chemical reactions are to be simulated by the etch profile model. One of ordinary skill in the art will appreciate that, for a given etch process, many reactions will be taking place in the etch chamber at any given time. These include the main etch reaction itself, but may also include side reactions of the main etch process, reactions involving byproducts of the main etch reaction, reactions between byproducts, reactions involving byproducts of byproducts, and the like. Thus, in some embodiments, the selection of model parameters involves choosing which reactions to include in the model. Presumably, the more reactions that are included, the more accurate the model and, correspondingly, the more accurate the calculated etch profile. However, including more reactions increases the complexity of the model and therefore the computational cost of the simulation. It also results in more reaction parameters to optimize. This may be worthwhile if the particular reaction added is important to the overall etch kinetics. However, if the additional reactions are not critical, adding further sets of reaction parameters may make it more difficult for the optimization procedure to converge. Again, the choice of which reactions to include, and the rate constants or reaction probabilities associated with those reactions, may be fed into the EPM in their own input file (e.g., separate from the plasma parameters). In certain embodiments, for a given set of reactant species, the probabilities of the alternative/competing reaction pathways for each species should sum to 1. And, again, it should be understood that the specification of which reactions to include, their reaction probabilities, etc. (e.g., in an input file) will typically be made for a given/specified mixture of chemical species flowed into the etch chamber to perform the etch process/reaction, and the optimization will typically be with respect to that given mixture (although in some embodiments what is learned for one chemical mixture may be applicable to similar/related chemical mixtures).
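The sum-to-1 constraint on competing pathways can be enforced by normalizing unscaled weights before they are handed to the model. The sketch below assumes the pathways for one species are held in a dictionary; the function name and the pathway labels are hypothetical and do not reflect any actual EPM input file format.

```python
def normalize_pathways(weights):
    """Scale non-negative pathway weights so the probabilities sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("pathway weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one pathway needs a positive weight")
    return {pathway: w / total for pathway, w in weights.items()}

# Hypothetical competing pathways for one incident species at the surface.
raw_weights = {"etch": 3.0, "deposit": 1.0, "reflect": 4.0}
probabilities = normalize_pathways(raw_weights)
```

An optimizer that adjusts one pathway's probability can re-run this normalization after every update so the constraint continues to hold.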

In any case, in order to begin the optimization procedure shown in the flow chart of FIG. 3, initial values must typically be selected for the various model parameters (e.g., reaction probabilities, sticking coefficients, etc.) being optimized. This is done in operation 310. The initial values may be those present in the literature, those calculated based on other simulations, those determined experimentally or known from previous optimization procedures, and so forth.

The model parameters selected and initialized in operation 310 are optimized over a set of independent input parameters, which are assigned multiple sets of values in operation 320. Such independent input parameters may include parameters characterizing the plasma in the reaction chamber. In some embodiments, these plasma parameters are fed into the EPM via an input file separate from the input file for the reaction parameters (just described). The multiple sets of values for the independent input parameters (e.g., plasma parameters) thus specify different points in the space of the selected independent input parameters. For example, if the selected input parameters are temperature, etchant flux, and plasma density, and 5 sets of values are chosen for these parameters, then 5 unique points have been identified in the 3-dimensional input parameter space of temperature, etchant flux, and plasma density, with each of the 5 points corresponding to a different combination of temperature, etchant flux, and plasma density. As described above, a design of experiments (DOE) procedure may be employed to select the sets of input parameters.

Once selected, for each combination of input parameters, an etch experiment is performed in operation 330 to measure an experimental etch profile. (In some embodiments, multiple etch experiments are performed for the same combination of values of the input parameters, and the resulting etch profile measurements are averaged together, possibly after discarding outliers, etc.) This set of benchmarks is then used to adjust and optimize the model as follows: in operation 335, an etch profile is calculated for each combination of values of the input parameters, and in operation 340, an error metric is calculated that is indicative of (related to, quantifies, etc.) the difference between the experimental etch profiles and the calculated etch profiles over all the different sets of values of the input parameters.
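A minimal sketch of operations 330 to 340, assuming each profile is a list of widths at the same heights: repeated measurements are averaged point by point, and the error metric is taken to be the sum of squared width differences over all parameter sets. The disclosure does not fix the form of the metric, so the squared-difference choice is an assumption, and outlier rejection is left out for brevity.

```python
def average_profiles(profiles):
    """Average repeated measurements of one profile, coordinate by coordinate."""
    count = len(profiles)
    return [sum(values) / count for values in zip(*profiles)]

def error_metric(experimental, calculated):
    """Sum of squared width differences over all input parameter sets."""
    total = 0.0
    for measured, modeled in zip(experimental, calculated):
        total += sum((m - c) ** 2 for m, c in zip(measured, modeled))
    return total

# Hypothetical widths (nm): two repeats for one parameter set, one for another.
set_1 = average_profiles([[40.0, 44.0, 46.0], [42.0, 44.0, 50.0]])
set_2 = average_profiles([[38.0, 41.0, 45.0]])
experimental = [set_1, set_2]
calculated = [[41.0, 45.0, 47.0], [38.0, 40.0, 45.0]]

combined_error = error_metric(experimental, calculated)
```

A single scalar that pools every parameter set is what lets the later optimization step judge one set of model parameters against all experiments at once.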

Note that the set of calculated etch profiles (from which the error metric was calculated) corresponds to the previously selected set of model parameters specified in operation 310. The goal of the optimization procedure is to determine more effective choices for these model parameters. Thus, in operation 350, it is determined whether the currently specified model parameters cause the error metric calculated in operation 340 to be locally minimized (in terms of the space of model parameters). If not, one or more values of the set of model parameters are modified in operation 360 and then used to generate a new set of etch profiles (operation 335 is repeated, as schematically shown in the flow chart of FIG. 3), after which a new error metric is calculated in a repetition of operation 340. The process then proceeds again to operation 350, where it is determined whether the new combination of model parameters represents a local minimum of the error metric evaluated over all sets of input parameters. If so, the optimization procedure ends, as shown. If not, the model parameters are again modified in operation 360 and the loop is repeated.
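The loop through operations 335, 340, 350, and 360 can be sketched with a toy stand-in for the EPM and a simple coordinate search. Everything here is an assumption made for illustration: `toy_epm` is an arbitrary linear formula rather than a real etch model, the "measured" profiles are synthetic, and the disclosure does not specify which search strategy is used to modify the model parameters.

```python
def toy_epm(model_params, flux, heights):
    """Hypothetical stand-in for an EPM: widths at the given heights."""
    rate, taper = model_params
    return [50.0 - flux * (rate + taper * z) for z in heights]

def error_metric(model_params, fluxes, heights, measured):
    total = 0.0
    for flux, profile in zip(fluxes, measured):
        calculated = toy_epm(model_params, flux, heights)      # operation 335
        total += sum((c - m) ** 2 for c, m in zip(calculated, profile))
    return total                                               # operation 340

def optimize(initial, error_fn, step=1.0, tol=1e-6):
    """Coordinate search: accept any single-parameter change that lowers the error."""
    params, best = list(initial), error_fn(initial)
    while step > tol:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta                              # operation 360
                err = error_fn(trial)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2.0   # operation 350: no better neighbor at this step size
    return params, best

heights = [0.0, 0.5, 1.0]
fluxes = [1.0, 2.0, 3.0]             # three sets of input parameter values
true_params = (2.0, 1.0)             # used only to synthesize "measurements"
measured = [toy_epm(true_params, f, heights) for f in fluxes]

fitted, final_error = optimize(
    (1.3, 0.4), lambda p: error_metric(p, fluxes, heights, measured))
```

The search stops when no single-parameter change at the smallest step size lowers the error, which corresponds to the local-minimum test of operation 350; a gradient-based or global optimizer could be substituted without changing the loop structure.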

FIG. 4A presents a flow chart of a method 470 for refining model parameters in an etch profile model. As shown, method 470 begins by collecting experimental etch profiles generated for a series of controlled etch chamber parameter sets. At a later stage, the method compares these experimentally generated etch profiles to theoretically generated etch profiles generated using an etch profile model. By comparing the experimentally generated etch profile to the theoretically generated etch profile, the set of model parameters used by the etch profile model may be refined to improve the ability of the model to predict the etch profile.

In the depicted method, the process begins at operation 472, where a set of process parameters is selected for the calculation phase and the experimental phase. These process parameters define a series of conditions under which the comparison is made. Each set of process parameters represents a set of settings for operating the etch chamber. As described above, examples of process parameters include chamber pressure, susceptor temperature, and other parameters that can be selected and/or measured within the etch chamber. Alternatively or additionally, each set of process parameters represents a condition of the workpiece being etched (e.g., line width and line spacing formed by etching).

After selecting the sets of process parameters for the experimental runs (note that a set of independent input parameters for EPM optimization will correspond to, and/or be calculated from, each set of process parameters), the experiment begins, as depicted by a loop over multiple parameter sets that includes operations 474, 476, 478, and 480. Operation 474 simply represents incrementing to the next process parameter set (parameter set (i)) for running a new experiment. For each parameter set, an etch experiment is run and the resulting experimental etch profile is generated and saved, for example in a reduced-dimension form such as a principal component representation of the etch profile.

Each time a new set of process parameters has been used in an experiment, the method determines whether there are more parameter sets to consider, as shown in decision block 480. If additional parameter sets exist, the next parameter set is initialized, as shown in block 474. Eventually, after all initially defined sets of process parameters have been considered, decision block 480 determines that there are no more parameter sets to consider. At this point, the process switches to the model optimization portion of the process flow.

Initially, in the model optimization portion of the process, a set of model parameters (model parameters (j)) is initialized, as shown at block 482. As explained, these model parameters are the parameters that the model uses to predict the etch profile. In the context of this process flow, these model parameters are modified to improve the predictive capabilities of the EPM. In some embodiments, the model parameters are reaction parameters that represent one or more reactions occurring in the etch chamber. In one example, a model parameter is a reaction rate constant or the probability that a particular reaction occurs. Further, as described elsewhere herein, the etch profile model may employ other parameters that remain fixed during the optimization routine. Examples of such parameters include physical parameters, such as plasma conditions.

After initializing the model parameters at operation 482, the method enters an optimization loop in which it generates theoretical etch profiles corresponding to each set of process parameters used to generate the experimental etch profiles in the experimental loop. In other words, the method uses the EPM to predict an etch profile corresponding to each set of process parameters (i.e., for all of the different parameter sets (i)). Note, however, that for each of these sets of process parameters, what is actually input into the EPM (to run it) is a set of independent input parameters that corresponds to the given set of process parameters. For some parameters, the independent input parameters may be the same as the process parameters; for others, the independent input parameters (actually fed into the EPM) may be derived or calculated from the physical process parameters. The two thus correspond to each other, but they may not be the same. It should therefore be understood that, in the context of the optimization loop (operations 482 to 496) in FIG. 4A, the EPM is, strictly speaking, run with the set of independent input parameters corresponding to "parameter set (i)", whereas in the experimental loop (operations 472 to 480) the experiment is run using the process parameters corresponding to "parameter set (i)".

In any case, initially in the loop, the method increments to the next one of the parameter sets initially set in operation 472. See block 484. Using the selected parameter set, the method runs the etch profile model using the current set of model parameters. See block 486. The method then generates and saves a theoretical etch profile for the current combination of parameter set and model parameters (parameter set (i) and model parameters (j)). See block 488. The "generate and save etch profile" operation provides an etch profile in a reduced-dimension space, e.g., a principal component representation of the etch profile.

Eventually, all parameter sets are considered in this loop. Before that point, decision block 490 determines that additional parameter sets remain and passes control back to block 484, where the parameter set is incremented to the next parameter set. The process of running the model and generating and storing the theoretical etch profile is repeated for each parameter set (parameter set (i)).

When there are no remaining parameter sets to consider for the set of model parameters currently being considered (model parameters (j)), the process exits the loop and calculates the error between the theoretical etch profiles and the experimental etch profiles. See block 492. In some embodiments, the error is determined over all of the process parameter sets (i), not just one of them.

The method uses the error determined in block 492 to determine whether the optimization routine for the model parameters has converged. See block 494. Various convergence criteria may be used, as described below. Assuming the optimization routine has not converged, process control is directed to block 496, where the method generates a new set of model parameters (model parameters (j)) that may improve the predictive capabilities of the model. With this new set of model parameters, process control returns to the loop defined by blocks 484, 486, 488, and 490. As before, in this loop the parameter set (i) is repeatedly incremented and the model is run each time to generate a new theoretical etch profile. After all parameter sets are considered, the error between the theoretical etch profiles and the experimental etch profiles is again determined at block 492, and the convergence criterion is again applied at block 494. Assuming the convergence criterion has not been met, the method generates another set of model parameters for testing in the manner just described. Eventually, a set of model parameters is selected that meets the convergence criterion. The process is then complete. In other words, the method illustrated in FIG. 4A has produced a set of model parameters that improves the predictive power of the etch profile model.
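
The flow of FIG. 4A can be sketched in a few lines of Python. This is a minimal illustration only: `toy_epm` is a hypothetical stand-in for an etch profile model, the "experimental" profiles are simulated from known parameters, and the random-perturbation search used to propose new model parameters is an assumption made for brevity, not the optimizer described in this document.

```python
import random

def toy_epm(process_params, model_params):
    """Hypothetical stand-in for an etch profile model: returns a short
    'profile' (list of widths) from process and model parameters."""
    pressure, temperature = process_params
    rate, sticking = model_params
    return [rate * pressure + sticking * temperature * depth for depth in range(5)]

def total_error(experimental, model_params, parameter_sets):
    """Blocks 484-492: run the model for every parameter set (i) and
    accumulate the squared error against the experimental profiles."""
    err = 0.0
    for params, measured in zip(parameter_sets, experimental):
        predicted = toy_epm(params, model_params)
        err += sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return err

def refine_model_parameters(parameter_sets, experimental, initial,
                            tol=1e-9, max_iter=5000, seed=0):
    """Blocks 482-496: propose new model parameters (j) until converged."""
    rng = random.Random(seed)
    best = list(initial)
    best_err = total_error(experimental, best, parameter_sets)
    step = 0.5
    for _ in range(max_iter):
        if best_err < tol or step < 1e-12:      # block 494: convergence test
            break
        candidate = [b + rng.uniform(-step, step) for b in best]  # block 496
        err = total_error(experimental, candidate, parameter_sets)
        if err < best_err:
            best, best_err = candidate, err
        else:
            step *= 0.99  # shrink the search radius after a failed proposal
    return best, best_err

# "Experimental loop" (blocks 472-480), simulated here with known parameters.
parameter_sets = [(1.0, 0.5), (2.0, 0.5), (1.0, 1.5), (3.0, 2.0)]
true_params = (0.8, 0.3)
experimental = [toy_epm(p, true_params) for p in parameter_sets]

fitted, final_err = refine_model_parameters(parameter_sets, experimental, [0.1, 0.1])
```

Note that the error is accumulated over all parameter sets before the convergence test is applied, which mirrors the ordering of blocks 490, 492, and 494.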

A related procedure is depicted in FIG. 4B. As shown, the experimental etch profiles and the theoretical etch profiles are generated for different substrate features rather than for different process conditions. Otherwise, the basic process flow is the same. In some implementations, both the feature structure and the process conditions vary across the experimental and theoretical operations.

Different features may include different "line" and "pitch" geometries. See FIG. 4B-1. Pitch refers to the minimum unit cell width that is repeated multiple times across the features being etched. Line refers to the total thickness between two adjacent sidewalls, assuming symmetry. As an example, the method may be run on repeating geometries of L50P100, L100P200, L100P300, L75P150, etc., where the numbers represent the line width (L) and the pitch (P) in nanometers.
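
As a small illustration of the naming convention above, the following sketch parses geometry labels such as L50P100 into numeric values. The function name, the dictionary keys, and the derived "space" value (pitch minus line width, which follows from treating the pitch as the repeated unit cell width) are assumptions introduced for this example.

```python
import re

def parse_geometry(label):
    """Split a label such as 'L50P100' into line width and pitch (nm)."""
    match = re.fullmatch(r"L(\d+)P(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized geometry label: {label!r}")
    line_nm, pitch_nm = int(match.group(1)), int(match.group(2))
    return {
        "line_nm": line_nm,
        "pitch_nm": pitch_nm,
        "space_nm": pitch_nm - line_nm,      # derived: gap between lines
        "line_to_pitch": line_nm / pitch_nm,
    }

geometries = [parse_geometry(s)
              for s in ["L50P100", "L100P200", "L100P300", "L75P150"]]
```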

In the illustrated embodiment, the process 471 begins by selecting fixed and varying parameters (model parameters) of the etch profile model. In some embodiments, these may be physical and chemical reaction parameters. Additionally, a substrate feature is selected. See operation 473.

For each feature geometry (feature set (k), incremented as shown in operations 475 and 481), the method runs an etch process using the current feature geometry, generates an experimental etch profile (experimental etch profile (k)), and saves the etch profile. See operations 477 and 479. As previously described, each experimental etch profile is saved in a reduced-dimension representation.

Thereafter, the method initializes the model parameters for adjustment (model parameters (j)). See operation 483. For each feature geometry (feature set (k), incremented in operations 485 and 491), the method runs the etch profile model, generates a theoretical etch profile (theoretical etch profile (k)), and saves the etch profile. See operations 487 and 489. As previously described, each theoretical etch profile is saved in a reduced-dimension representation.

For each set of model parameters (j) considered in the loop comprising operations 487 and 489, the method compares the theoretical etch profile to the experimental etch profile to determine the error between the etch profiles for all sets of substrate features. See operation 493. If the process has converged as determined at operation 495, the process is complete and the current model parameters are selected. If the process has not converged, the method generates a new set of model parameters (j) and returns again to the loop defined by operations 485, 487, 489, and 491.

In some embodiments, a separate set of model parameters is selected for each feature set. In this case, the method may map or otherwise determine a relationship between the line/pitch ratio (or another characteristic of the feature) and the final converged model parameters. If the converged model parameter values are fairly constant, possibly with some noise, the method uses the average model parameter values for the improved etch profile model. If the converged model parameter values exhibit a trend, the method may use polynomial fitting to derive a function that may be used to select the model parameter values for each feature set (e.g., line and pitch geometries).
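
A minimal sketch of this choice between averaging and polynomial fitting is given below. The trend test (a correlation threshold), the polynomial degree, and all names and sample values are assumptions for illustration; the document does not specify how a trend is detected.

```python
import numpy as np

def model_parameter_rule(ratios, converged_values, trend_threshold=0.5, degree=2):
    """Return a function mapping line/pitch ratio to a model parameter value.

    If the converged values show little trend with the ratio (|correlation|
    below an assumed threshold), use their mean; otherwise fit a polynomial.
    Assumes the ratios are not all identical.
    """
    ratios = np.asarray(ratios, dtype=float)
    values = np.asarray(converged_values, dtype=float)
    if np.std(values) == 0.0:
        corr = 0.0
    else:
        corr = np.corrcoef(ratios, values)[0, 1]
    if abs(corr) < trend_threshold:
        mean_value = float(values.mean())
        return lambda ratio: mean_value
    coeffs = np.polyfit(ratios, values, degree)
    return lambda ratio: float(np.polyval(coeffs, ratio))

ratios = [0.25, 0.33, 0.5, 0.6, 0.75]
flat = [0.30, 0.31, 0.29, 0.30, 0.31]     # roughly constant, with noise
trend = [0.10, 0.16, 0.30, 0.38, 0.50]    # clear dependence on the ratio
rule_flat = model_parameter_rule(ratios, flat)
rule_trend = model_parameter_rule(ratios, trend)
```

Here `rule_flat` returns the same averaged value for any ratio, while `rule_trend` evaluates the fitted polynomial at the requested ratio.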

It will be apparent that the feature sets, process parameter sets, or other variables are used to conduct a plurality of experiments and thus produce a plurality of experimentally determined etch profiles. In some implementations, half or some other proportion of these etch profiles (and associated parameter sets) is used for training, as shown in the above-described flow charts, and the remaining etch profiles are used for validation. The training etch profiles are used to generate adjusted model parameters, which are used in the etch profile model and validated by applying the adjusted model to predict the etch profiles of the validation set. If the error between the experimental etch profiles and the theoretical etch profiles of the validation set is statistically higher than the error found at convergence with the training set, then a different training set is used to adjust the model as previously described.
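
The training/validation split can be sketched as follows. The function names, the random shuffle, and the simple "factor" comparison of mean errors are assumptions; in particular, the factor check is only a crude placeholder for the statistical comparison mentioned above.

```python
import random

def split_profiles(records, train_fraction=0.5, seed=0):
    """Shuffle the (parameter set, etch profile) records and split them
    into a training part and a validation part."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]

def needs_retraining(train_errors, validation_errors, factor=2.0):
    """Crude check: flag the fit if the mean validation error exceeds the
    mean training error by an assumed factor."""
    mean_train = sum(train_errors) / len(train_errors)
    mean_val = sum(validation_errors) / len(validation_errors)
    return mean_val > factor * mean_train

train, validation = split_profiles(range(10))
```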

Details about iterative non-linear optimization procedures

The model parameter optimization procedure just described in the context of FIG. 3 is typically an iterative nonlinear optimization procedure (e.g., it optimizes an error metric that is typically a nonlinear function of the input parameters) and, therefore, various techniques known in the art for nonlinear optimization may be used. See, for example: Biggs, M.C., "Constrained Minimization Using Recursive Quadratic Programming," Towards Global Optimization (L.C.W. Dixon and G.P. Szergo, eds.), North-Holland, pp 341-349 (1975); Conn, N.R., N.I.M. Gould, and Ph.L. Toint, "Trust-Region Methods," MPS/SIAM Series on Optimization, SIAM and MPS (2000); Moré, J.J. and D.C. Sorensen, "Computing a Trust Region Step," SIAM Journal on Scientific and Statistical Computing, Vol. 3, pp 553-572 (1983); Byrd, R.H., R.B. Schnabel, and G.A. Shultz, "Approximate Solution of the Trust Region Problem by Minimization over Two-Dimensional Subspaces," Mathematical Programming, Vol. 40, pp 247-; Dennis, J.E., Jr., "Nonlinear least-squares," State of the Art in Numerical Analysis, ed. D. Jacobs, Academic Press, pp 269-312 (1977); Moré, J.J., "The Levenberg-Marquardt Algorithm: Implementation and Theory," Numerical Analysis, ed. G.A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, pp 105-; Powell, M.J.D., "A Fast Algorithm for Nonlinearly Constrained Optimization Calculations," Numerical Analysis, G.A. Watson ed., Lecture Notes in Mathematics, Springer Verlag, Vol. 630 (1978); each of these documents is incorporated by reference herein in its entirety for all purposes. In some embodiments, these techniques optimize an objective function (here, the error function/metric) subject to certain constraints that may act on the input parameters and/or the error metric. In some such embodiments, the constraint function itself may be nonlinear.
For example, in embodiments where the calculated etch profile is represented by a set of stacked trapezoids output by the EPM, the error metric may be defined as the difference between the area represented by the boundaries of these stacked trapezoids and the area of the measured experimental etch profile. In this case, the error metric is a non-linear function of the response variable output by the EPM, so the constrained optimization technique is selected from the techniques just described (and/or from the incorporated references) that enable specification of the non-linear constraint. Note that in the context of the flow chart shown in fig. 3, these different procedures correspond to how one or more model parameters are modified in operation 360, and how one or more potential local minima in error are detected and processed in operation 350.
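
The area-based error metric for stacked trapezoids can be written out explicitly. The sketch below assumes each trapezoid is described by the feature widths at its lower and upper interfaces and by its thickness; the function names and the sample numbers are hypothetical.

```python
def trapezoid_stack_area(widths, heights):
    """Area of a stack of trapezoids.

    widths[k] is the feature width at the k-th interface (bottom to top);
    heights[k] is the thickness of the trapezoid between interfaces k and k+1.
    """
    if len(widths) != len(heights) + 1:
        raise ValueError("need exactly one more width than heights")
    return sum(0.5 * (widths[k] + widths[k + 1]) * heights[k]
               for k in range(len(heights)))

def area_error(calc_widths, calc_heights, measured_area):
    """Absolute difference between calculated and measured profile areas."""
    return abs(trapezoid_stack_area(calc_widths, calc_heights) - measured_area)

# Two stacked trapezoids: widths 10 -> 8 -> 8 (nm), each 5 nm thick.
calculated_area = trapezoid_stack_area([10.0, 8.0, 8.0], [5.0, 5.0])
error = area_error([10.0, 8.0, 8.0], [5.0, 5.0], measured_area=80.0)
```

Because the area involves products of widths and heights, it is a nonlinear function of the quantities output by the model, which is the point made in the preceding paragraph.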

In some embodiments, the iterative nonlinear optimization procedure for determining improved/adjusted model parameters as shown in fig. 3 may be divided into multiple stages, and in some such embodiments, different optimization stages may correspond to different layers of material on the surface of the semiconductor substrate being etched. The method may also reduce the computational burden by reducing the number of input parameters that are changed and simplify the calculation of the error metric. For example, if the substrate to be etched comprises a plurality of stacked layers of different sequentially deposited materials, then because the different layers typically have different material compositions, typically different chemical characteristics characterize the local etch process occurring in each layer (e.g., different etch reaction(s), different side reactions, different reactions between byproducts), or even if the same (or similar) chemical reactions occur, they may typically occur at different rates, at different stoichiometry, etc. Thus, in order to build an Etch Profile Model (EPM) corresponding to the etching of the entire multilayer stack, the input parameters fed into the model typically include different sets of parameters corresponding to the different stack layers. As mentioned above, these groups may include parameters indicating which chemical reactions are to be included in the modeling of the etch process, as well as parameters characterizing the reactions themselves (reaction probability, sticking coefficient, etc.).

However, it should be appreciated that the optimization procedure does not necessarily need to optimize every parameter simultaneously; e.g., some parameters may remain fixed in operation 360 of FIG. 3 while others are allowed to "float" and are modified in one or more particular cycles/rounds of the optimization schematically illustrated in the figure. Thus, based on the observation that the chemistry occurring in each layer is to some extent local to that layer, in some embodiments the optimization can be accelerated as follows: the model parameters associated with one layer are adjusted individually while the parameters associated with the other layers are kept fixed; thereafter another layer is selected so that its parameters can "float" while the parameters of the other layers are kept fixed; and so on, until all layers have been adjusted individually. The layer-by-layer tuning process may then be repeated multiple times, each time cycling through all layers, until a certain degree of optimization is obtained. At that point, a complete optimization over all layers may be performed, i.e., with the model parameters of all layers allowed to vary ("float"), based on the recognition that, with the parameters associated with each layer already optimized separately, the full optimization will converge more efficiently (or to a better local minimum of the error metric). Still further, the entire layer-by-layer procedure may be repeated to refine the results further, i.e., layer-specific optimization is performed by cycling through the layers one or more times, and then a global optimization is performed in which the model parameters of all layers are allowed to float. Note that, in the context of FIG. 3, selecting certain model parameters to "float" (and thus be optimized individually for a particular layer) while keeping other model parameters fixed would be done as part of the parameter modification operation 360 of FIG. 3 (in these and similar classes of implementations).

As a specific example to illustrate the foregoing single layer-by-layer optimization procedure, consider the case of modeling the etching of a layer below an etch mask, where the etch mask layer and the layer below it are etched to some extent. This therefore constitutes a 2-layer etch model in which the parameters of each of the two layers can be optimized separately before the model parameters corresponding to the two layers are optimized completely simultaneously.

Thus, starting with specifying the values of all model parameters, the model is run to generate a calculated etch profile for all sets of values of the input parameters (representing different experimental etch conditions), and for all profiles corresponding to sets of values of the independent input parameters, an error metric is calculated that represents the difference between the experimental etch profile and the calculated etch profile. The process can then proceed by selecting a layer (e.g., dielectric layer) under the etch mask for individual layer-specific optimization, modifying one or more model parameters associated with the (dielectric) layer for optimization, re-running the model for all sets of values of independent input parameters, calculating a new error metric, again modifying one or more model parameters associated with the dielectric layer, re-running the model, re-calculating the error, and so forth, until a local minimum of error with respect to the dielectric layer is obtained.

The model parameters of the dielectric layer are then kept fixed at these values, the model parameters of the etch mask layer are selected for individual optimization, one or more of their values (of the etch mask layer) are modified, the model is re-run, the errors are re-calculated, etc., until a local minimum of errors with respect to the etch mask layer is obtained. At this point, a full optimization of the model parameters for both layers may be performed, or in some embodiments, one or more additional cycles of separate dielectric and mask layer optimizations may be performed before so performing, such that the full optimization is more efficient (e.g., converges faster, or converges to a better final local minimum of the overall error metric).
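
The two-layer procedure just described can be sketched as follows. The error function `toy_error` is a hypothetical stand-in for the EPM error metric (two parameters per layer, weakly coupled), and the simple pattern search in `minimize_floating` is an assumed local optimizer chosen for brevity; any of the nonlinear optimization techniques referenced above could take its place.

```python
def minimize_floating(error_fn, params, floating, step=0.5, tol=1e-6):
    """Pattern search that only changes the parameters whose indices are in
    `floating`; all other parameters stay fixed."""
    params = list(params)
    best = error_fn(params)
    while step > tol:
        improved = False
        for i in floating:
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                err = error_fn(trial)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step *= 0.5   # refine the search once no move helps
    return params, best

def toy_error(p):
    """Hypothetical error metric: p[0:2] belong to the mask layer and p[2:4]
    to the dielectric layer, with weak coupling between the layers."""
    mask = (p[0] - 1.0) ** 2 + (p[1] + 0.5) ** 2
    dielectric = (p[2] - 2.0) ** 2 + (p[3] - 0.25) ** 2
    coupling = 0.1 * (p[0] - 1.0) * (p[2] - 2.0)
    return mask + dielectric + coupling

layers = {"dielectric": [2, 3], "mask": [0, 1]}
params = [0.0, 0.0, 0.0, 0.0]
for _ in range(2):                         # cycle through the layers twice
    for name in ("dielectric", "mask"):    # float one layer at a time
        params, err = minimize_floating(toy_error, params, layers[name])
# Full optimization: the parameters of both layers float simultaneously.
params, err = minimize_floating(toy_error, params, [0, 1, 2, 3])
```

The layer-specific passes operate on two parameters at a time, so each pass is cheaper than a full four-parameter search, and the final full optimization starts from a point that is already close to a minimum.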

It should also be understood that, in some cases, the aforementioned layer-by-layer optimization procedure need not be limited to adjusting only a single layer at a time. For example, if the etching of a 6-layer stack is being modeled, one variation of the layer-by-layer optimization procedure would be to select pairs of layers to adjust simultaneously (i.e., to float the model parameters corresponding to a pair of adjacent layers simultaneously) and to proceed sequentially through the 3 pairs, possibly repeating the 3-step cycle multiple times, and then to perform a fully simultaneous optimization of the model parameters for all layers. As previously mentioned, the entire layer-by-layer procedure (or in this case, pair-by-pair procedure) is optionally repeated until a local minimum of the error for the entire stack is determined.

It is also possible that the numerical optimization procedure (whether performed layer-by-layer prior to a full optimization, or directly as a full optimization over all layers) may result in multiple local minima of the etch profile error metric, depending on the starting point of the optimization (i.e., on the initial values selected for the model parameters) and other factors. There may therefore be many local minima that the optimization procedure could potentially identify as representing an improved (and/or optimal) model. Where there are many local minima of the error, many potential sets of model parameters may be eliminated from consideration by defining physically realistic upper and lower bounds for these model parameters. In some embodiments, the aforementioned numerical optimization may be performed for multiple choices of starting point (initial values of the model parameters) to potentially determine multiple local minima, and thus multiple candidate sets of model parameters, from which the most preferred may be selected (possibly, in some embodiments, because it has the lowest calculated error metric of all candidates that satisfy the aforementioned physically realistic upper and lower bounds).
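
A multi-start search with bounds can be sketched in one dimension. The error function, the local search, the starting points, and the bounds below are all assumptions chosen only so that several local minima exist.

```python
import math

def local_minimize(f, x, step=0.1, tol=1e-8):
    """Simple 1-D descent: move while a step in either direction lowers f,
    otherwise halve the step."""
    fx = f(x)
    while step > tol:
        for trial in (x + step, x - step):
            ft = f(trial)
            if ft < fx:
                x, fx = trial, ft
                break
        else:
            step *= 0.5
    return x, fx

def toy_error(x):
    """Hypothetical error metric with several local minima."""
    return 0.05 * (x - 2.0) ** 2 + math.sin(3.0 * x) + 1.0

def multistart(f, starts, lower, upper):
    """Run the local search from each start, keep the minima that lie inside
    the physically realistic bounds, and return the lowest-error candidate."""
    candidates = []
    for x0 in starts:
        x, fx = local_minimize(f, x0)
        if lower <= x <= upper:
            candidates.append((fx, x))
    return min(candidates) if candidates else None

best = multistart(toy_error, starts=[-3.0, -1.0, 0.0, 1.0, 3.0, 5.0],
                  lower=0.0, upper=4.0)
```

Here `best` is an (error, parameter) pair; local minima that fall outside the assumed bounds are discarded before the comparison, as described above.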

Dimension reduction and principal component analysis

In some embodiments, the etch profile model outputs values at a large number of grid points (cells) at each time step during the calculated evolution of the etch profile. The values corresponding to each cell or grid point map out the shape of the calculated etch profile. An example of such grid points representing a calculated etch profile is shown in FIG. 1, where each grid point has a value that indicates whether or not that region of space is occupied by the feature at that time during the etch process. In some embodiments, the vertical dimension of the grid representing the etch profile is at least about 5, or at least about 10, or at least about 20. Depending on the implementation, the minimum vertical distance between vertically adjacent grid points may be chosen to be 1 angstrom, and it may be as large as a few angstroms, such as 5 angstroms, or 10 angstroms, or even 20 angstroms.

In practice, it is desirable to select the distance between adjacent grid points to be small enough to provide a reasonably accurate representation of the feature shape as it evolves over time (which may depend on the complexity of the profile), but not much smaller (or no smaller) than is necessary to achieve such a reasonable representation, since more grid points require more computation time. The horizontal spacing (in the wafer plane) between adjacent grid points is selected based on the same considerations, but typically the horizontal and vertical spacings are selected to be the same (i.e., a uniform grid) or approximately the same. However, this does not mean that the vertical and horizontal grid dimensions are necessarily the same, because the width of the modeled feature is not necessarily the same as its height. Thus, the horizontal dimension (the number of horizontal points spanning a given direction: the x-dimension in 2D, the x- and y-dimensions in 3D) may depend on whether only a sidewall of a feature is modeled, whether the entire feature is modeled (spanning from one profile edge to the other), whether multiple neighboring features are modeled, etc.

As described above, the grid of values output by the etch profile model provides estimates of the location of the edges of the feature profile at different vertical heights in physical space. From this information (from the values at the grid points), the feature width at each height can be calculated or, viewed another way, the horizontal coordinate of the edge (relative to some baseline) can be calculated for each height. This is shown in FIG. 2. The set of coordinates can then be considered as a point in a multi-dimensional space representing a particular feature profile. This vector space may be an orthogonal space, or it may be a non-orthogonal space; in the latter case, the representation may be linearly transformed into an orthogonal space. If so, the coordinates of the transformed point are distances along a set of orthogonal axes in that space. In any case, when "profile coordinates" are referred to in this document, this generally refers to any suitable (approximate) mathematical representation of the profile shape.
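
The step from a grid of values to profile coordinates can be illustrated with a small occupancy grid. The sketch below assumes a 2D grid in which 1 marks material and 0 marks etched-away space, a single contiguous opening per row, and a uniform horizontal spacing; the function name and the sample grid are hypothetical.

```python
import numpy as np

def profile_coordinates(grid, spacing=1.0):
    """For each row (height) of a 2D occupancy grid (1 = material present,
    0 = etched away), return the left edge, right edge, and width of the
    etched opening, in the same units as `spacing`."""
    grid = np.asarray(grid)
    coords = []
    for row in grid:
        empty = np.flatnonzero(row == 0)
        if empty.size == 0:               # nothing etched at this height
            coords.append((np.nan, np.nan, 0.0))
            continue
        left = empty[0] * spacing
        right = (empty[-1] + 1) * spacing
        coords.append((left, right, right - left))
    return np.array(coords)

grid = [
    [1, 1, 0, 0, 0, 1, 1],   # top of the feature: widest opening
    [1, 1, 1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],   # bottom: narrowest opening
]
coords = profile_coordinates(grid)
```

Each row of `coords` is one set of profile coordinates for a given height; stacking the widths (or the edge positions) over all heights gives the vector that represents the profile as a point in a multi-dimensional space.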

In any case, because the etch profile model may output a large number of "profile coordinates" (hereinafter taken to include a grid of points as just described), and the goal is for these to match the measured experimental etch profiles accurately, reducing the error in the etch profile model (iteratively reducing the combined error over different experimental process conditions, as described above with reference to FIG. 3) may be a computationally demanding task. For example, if a set of m measured experimental etch profiles is to be matched point by point to calculated etch profiles each consisting of n profile coordinates, this is equivalent to optimizing the model to fit a data set of m × n data points.

However, it turns out that there are underlying statistical correlations in etch profiles (whether measured or calculated), and these correlations can be exploited to recast the optimization problem in a form that is numerically far easier to handle. For example, while a fine grid of profile coordinates may consist of many data points, the values of certain combinations of these coordinates are statistically correlated with one another. To give a simple but illustrative example, vertically adjacent coordinates will tend to be correlated with each other simply because the width of an etched feature does not change very much over the short length scale associated with adjacent grid points as one moves up or down the profile. More complex examples of correlations between profile coordinates relate to the types of profile shapes that can typically be achieved by changing certain combinations of process parameters. FIG. 5 shows several examples. For example, certain process parameters may be adjusted, either alone or in combination with one another, as shown in FIG. 5, to cause the etch profile to bow inward or outward, and the profile coordinates (or grid points) that map out such bowing of the profile are thus statistically correlated with one another. Likewise, as also shown in FIG. 5, an etch profile obtained by adjusting various process parameters, either individually or in combination, may exhibit a downward or upward taper, and thus the profile coordinates may be correlated to the extent that changing one or more process parameters tends to cause such a taper effect. Two other examples of basic profile correlation structures are a top taper and a bottom taper, as also shown in FIG. 5. Again, these basic profile structures are a manifestation of the fact that variations in process parameters tend to cause variations in the overall shape of the profile, rather than having a local effect at some points on the profile without affecting other points. This is of course a consequence of the underlying physics and chemistry associated with the etch process.

As described above, because of these underlying statistical correlations, the optimization problem presented above (described with respect to the flow chart in FIG. 3) can be recast in a form that is more suitable for iterative optimization techniques. One way to do this is to identify several types of canonical profile shapes and to express the measured and/or calculated etch profiles in terms of these canonical shapes, for example by writing the total profile (at each profile coordinate) as a weighted average of the set of canonical profile shapes (at each profile coordinate). That is, a set of vectors represents the canonical profile shapes, and the overall profile may be approximately represented as a linear combination of these vectors. In this way, rather than modeling the changes in all of the individual profile coordinates, one can exploit the underlying statistical correlations and model the changes in the coefficients/weights of the linear combination that represents the profile. For example, if bow and taper (see FIG. 5) are chosen as the canonical shapes, the problem of modeling, e.g., 100 profile coordinates reduces to modeling the changes in the 2 coefficients for bow and taper in the linear combination, i.e., a dimensionality reduction from 100 to 2. Which canonical shapes are useful may depend on the process/layer type. The described method provides a numerical way to extract these shapes from experimental data or from simulations performed with the EPM.
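
The reduction from 100 profile coordinates to 2 coefficients can be illustrated with a least-squares fit. The analytic forms chosen here for the "bow" and "taper" shapes, the nominal edge position, and all names are assumptions for illustration only.

```python
import numpy as np

heights = np.linspace(0.0, 1.0, 100)       # 100 profile coordinates
bow = np.sin(np.pi * heights)              # assumed "bow" shape
taper = heights - 0.5                      # assumed "taper" shape
basis = np.column_stack([bow, taper])      # 100 x 2 matrix of canonical shapes

def shape_coefficients(profile, mean_profile):
    """Least-squares weights of the canonical shapes for one profile."""
    weights, *_ = np.linalg.lstsq(basis, profile - mean_profile, rcond=None)
    return weights

mean_profile = np.full(100, 50.0)          # nominal edge position (nm)
profile = mean_profile + 3.0 * bow - 1.5 * taper
weights = shape_coefficients(profile, mean_profile)

# The profile is rebuilt from just two numbers.
reconstructed = mean_profile + basis @ weights
```

In this constructed case the profile lies exactly in the span of the two shapes, so the two weights reproduce it; for real profiles the reconstruction would only be approximate, which is why the choice and number of canonical shapes matters.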

For this strategy to work, the canonical shapes must provide a good (although not exact) representation of the different profile shapes involved in the analysis. The more independent canonical shapes are included in the representation, the more accurate the representation will be (in the vector space of the canonical shapes). Thus, the question becomes which canonical shapes to use, and how many to include, recognizing that including more canonical shapes makes the analysis more accurate but also more computationally expensive and, in the context of iterative optimization, may affect the ability of the optimization to converge, or to converge to a desired local minimum.

One way to do this is for process engineers to identify, based on their past experience, several types of canonical profile shapes that they have often observed in their etch experiments. The advantage of this approach is its simplicity. A potential drawback is that it is ad hoc (based solely on the experience and intuition of the process engineer) and does not provide any way of determining when a sufficient number of profile shapes has been included in the analysis. In practice, any canonical profile shape identified by the process engineer will be included, but this may of course be insufficient to provide an accurate representation. More importantly, this type of approach will not identify previously unrecognized correlations in the profile data, whether because the shape was not apparent in previous work or because it results from a new etch process governed by different underlying physical and chemical processes.

Another approach is to base the dimension reduction procedure on a statistical method that can automatically identify the important canonical profile shapes and provide an estimate of how many shapes need to be included in order to provide a sufficiently accurate representation. One data analysis technique for achieving this is principal component analysis (PCA), which uses singular value decomposition (SVD), a matrix decomposition technique from numerical linear algebra. A description of PCA techniques and various applications can be found, for example, in the following documents: Jackson, J.E., "A User's Guide to Principal Components," John Wiley and Sons, p. 592 (1991); Jolliffe, I.T., "Principal Component Analysis," 2nd edition, Springer (2002); Krzanowski, W.J., "Principles of Multivariate Analysis: A User's Perspective," New York: Oxford University Press (1988); each of which is incorporated by reference herein in its entirety for all purposes.

As described in the aforementioned references, PCA takes as its input a set of vectors (in this case, each vector is a series of etch profile coordinates representing a single profile) and outputs a new set of n orthogonal vectors, called principal components (PCs), that can be sorted such that PCs 1 through i (where i ≤ n) constitute the "best" i-dimensional subspace for representing the input profile vectors. Here "best" means statistically optimal in the least-squares sense, i.e., the i-dimensional subspace of PCs determined by PCA minimizes the combined RMS error between each input vector and its linear representation in the selected subspace of PCs. Of course, the more PCs that are included, the larger the dimension of the subspace and the better the representation of the input profile data. However, because the subspace constructed by PCA is optimal, it is expected that not many PCs will be required, and the amount of statistical variation in the underlying data captured by adding an additional PC can be evaluated from the singular values of the underlying SVD. Thus, by using PCA to identify the canonical profile shapes that underlie a data set of etch profiles, a reduced-dimension linear model for representing the etch profiles can be constructed in a manner that is automated (independent of the process engineer's expertise), that is able to identify new correlations in the profile data, and that provides a statistical estimate of how many shapes/dimensions are needed to provide a good representation.
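
A minimal PCA via SVD, together with a singular-value-based estimate of how many PCs to keep, might look like the sketch below. The synthetic profiles (two underlying shapes plus small noise), the 99% variance target, and the function names are assumptions for illustration.

```python
import numpy as np

def pca(profiles):
    """profiles: (m, n) array, one row per etch profile of n coordinates.
    Returns the mean profile, the principal components (rows, sorted by
    decreasing singular value), and the fraction of variance each captures."""
    profiles = np.asarray(profiles, dtype=float)
    mean = profiles.mean(axis=0)
    _, s, vt = np.linalg.svd(profiles - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return mean, vt, explained

def components_needed(explained, target=0.99):
    """Smallest i such that PCs 1..i capture at least `target` of the variance."""
    i = int(np.searchsorted(np.cumsum(explained), target)) + 1
    return min(i, len(explained))

# Synthetic data: 30 profiles of 100 coordinates built from two shapes.
rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 100)
shapes = np.vstack([np.sin(np.pi * z), z - 0.5])
coeffs = rng.normal(size=(30, 2)) * [3.0, 2.0]
profiles = 50.0 + coeffs @ shapes + 0.01 * rng.normal(size=(30, 100))

mean, pcs, explained = pca(profiles)
k = components_needed(explained)
reduced = (profiles - mean) @ pcs[:k].T    # each profile as k coefficients
```

The squared singular values give the variance captured by each PC, so their cumulative sum indicates when adding further PCs no longer improves the representation appreciably.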

The result of the foregoing method is that significant dimensionality reduction can be achieved without significantly compromising statistical error, and the number of data points that must be fit in the numerical optimization procedure described above can be significantly reduced. It is also noted that there are different possible strategies for implementing the PCA dimension reduction procedure in the optimization procedure shown in fig. 3. For example, in the context of the manner in which the error metric is calculated in operation 340 of fig. 3, one way to employ a dimension reduction procedure is to separately project the calculated etch profile and the corresponding experimental etch profile onto a reduced-dimension subspace (which may be constructed by PCA), and then calculate the difference between the profiles as projected onto that subspace. Another way is to take the difference between the calculated etch profile and the corresponding experimental etch profile, project the difference onto a reduced-dimension subspace representing the potential differences between the experimental and calculated etch profiles, and take the total error metric to be the combined length of these difference vectors in this difference-subspace.
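The two strategies just described can be sketched as follows. The function names and the random orthonormal basis are illustrative assumptions, and "combined length" is taken here to mean the square root of the summed squared components over all profiles, which is only one possible reading.

```python
import numpy as np

def metric_project_then_diff(calc, meas, mean, basis):
    """Project each profile onto the subspace, then take the difference."""
    c = (calc - mean) @ basis.T
    m = (meas - mean) @ basis.T
    return float(np.sqrt(np.sum((c - m) ** 2)))

def metric_diff_then_project(calc, meas, diff_basis):
    """Take the difference first, then project it onto a difference-subspace."""
    d = (calc - meas) @ diff_basis.T
    return float(np.sqrt(np.sum(d ** 2)))

rng = np.random.default_rng(1)
# Three orthonormal basis vectors of length 30 (stand-in for PCs).
basis = np.linalg.qr(rng.normal(size=(30, 3)))[0].T
mean = rng.normal(size=30)
calc = rng.normal(size=(5, 30))                  # 5 calculated profiles
meas = calc + 0.1 * rng.normal(size=(5, 30))     # 5 synthetic "measured" profiles

e1 = metric_project_then_diff(calc, meas, mean, basis)
e2 = metric_diff_then_project(calc, meas, basis)
```

Because projection is linear, the two metrics coincide when the same orthonormal basis is used for both, as in this example. They differ only when the difference-subspace is constructed separately, for instance by performing PCA on the differences themselves.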

It is further noted that PCA can also be used to reduce the dimensionality of the space of independent input parameters, providing benefits similar to those just described. In some embodiments, the dimension reduction procedure may be applied to both the profile coordinate space and the input parameter space simultaneously, for example, by performing PCA on concatenated vectors of the input parameters and the corresponding measured etch profiles.

Application of optimized computerized etch model

The optimized computerized etch model disclosed herein may be useful wherever detailed evaluation and characterization of an etch process is required in a semiconductor processing workflow. For example, if a new etch process is being developed, the model may be used to determine etch profile characteristics for many combinations of process parameters without having to enter a laboratory and perform each experiment separately. In this manner, the optimized etch profile model may enable faster process development cycles and, in some embodiments, may significantly reduce the amount of work required to fine tune the target profile.

Lithography operations and mask development can also benefit greatly from accurate etch profile modeling, since estimating edge placement error is often of considerable importance in lithography work, and an accurate calculation of the profile shape provides this information.

The optimized models disclosed herein can also be used to solve the inverse problem: that is, where a specific target etch profile is desired and one wishes to find one or more specific combinations of process parameters (or EPM input parameters) for achieving it. Again, this could be done by experimental trial and error, but accurate modeling of the etch profile resulting from a given set of process parameters (or EPM input parameters) and conditions can replace the required experiments, at least in the initial phases of exploring the process/input parameter space, until good candidates can be identified for complete experimental study. In some embodiments, it may be feasible to actually invert the model numerically in a fully automated manner, i.e., to iteratively locate a set of parameters that generates a given etch profile. Again, dimensionality reduction of the etch profile coordinate space (via PCA) and projection of the desired etch profile onto that space can make the numerical inversion more feasible.
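As a rough sketch of what such a numerical inversion might look like, the example below searches for the input parameters whose calculated profile matches a target profile. The forward model `toy_forward_model` (a Gaussian trench with a depth and a width) is a hypothetical stand-in for an EPM, and the compass (pattern) search is only one of many iterative schemes that could be used.

```python
import numpy as np

X = np.linspace(-1.0, 1.0, 41)   # lateral coordinate, arbitrary units

def toy_forward_model(params):
    """Hypothetical stand-in for an etch profile model: a Gaussian trench."""
    depth, width = params
    return -depth * np.exp(-(X / width) ** 2)

def invert_model(target_profile, start, step=0.25, tol=1e-8):
    """Compass search for parameters whose profile matches the target."""
    best = np.asarray(start, dtype=float)
    best_err = np.sum((toy_forward_model(best) - target_profile) ** 2)
    while step > tol:
        improved = False
        for i in range(best.size):
            for sign in (1.0, -1.0):
                trial = best.copy()
                trial[i] += sign * step
                err = np.sum((toy_forward_model(trial) - target_profile) ** 2)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step *= 0.5   # no neighbor is better, so refine the search
    return best, best_err

# Target profile generated from known parameters so the result can be checked.
target = toy_forward_model(np.array([2.0, 0.5]))
recovered, residual = invert_model(target, start=[1.0, 0.8])
```

In a reduced-dimension setting, the squared-difference sum would be evaluated on the PCA coefficients of the profiles rather than on the full coordinate vectors, which shortens the vectors being compared.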

In certain embodiments, the optimized EPM may be integrated with an etcher apparatus or integrated into the infrastructure of a semiconductor manufacturing facility that deploys one or more etcher apparatuses. The optimized EPM can be used to determine appropriate adjustments to process parameters to provide a desired etch profile, or to understand the effect of changes in process parameters on the etch profile. Thus, for example, a system for processing semiconductor substrates within a manufacturing facility may include an etcher apparatus for etching semiconductor substrates, the operation of the etcher apparatus being regulated by a set of independent input parameters controlled by a controller that implements an optimized EPM. As described below, a suitable controller for controlling the operation of the etcher apparatus generally includes a processor and a memory, the memory storing the optimized EPM, and the processor using the stored EPM to calculate an etched feature profile for a given set of values of the set of input process parameters. After calculating the profile, in some embodiments, the controller can adjust the operation of the etcher apparatus (in response to the shape of the calculated profile) by changing one or more values of the set of independent input parameters.

In general, the etcher apparatus that may be used with the disclosed optimized EPM may be any kind of semiconductor processing apparatus suitable for etching semiconductor substrates by removing material from their surfaces. In some embodiments, the etcher apparatus constitutes an Inductively Coupled Plasma (ICP) reactor; in some embodiments, it may constitute a Capacitively Coupled Plasma (CCP) reactor. Accordingly, an etcher apparatus for use with the disclosed optimized EPM may have a process chamber, a substrate holder for holding a substrate within the process chamber, and a plasma generator for generating a plasma within the process chamber. The apparatus may further comprise one or more valve-controlled process gas inlets for flowing one or more process gases into the process chamber, one or more gas outlets fluidly connected to one or more vacuum pumps for exhausting gases from the process chamber, and the like. Details regarding the etcher apparatus (also commonly referred to as an etch reactor or plasma etch reactor, etc.) are provided below.

Optimization of etch profile models by reflectance spectrum matching techniques

The etch profile (EP) model (EPM) optimization techniques disclosed herein may also be performed in reflectance spectral space, or in a reduced-dimension subspace (RDS) derived from reflectance spectral space. In other words, the EPM optimization is performed by matching calculated reflectance spectra (generated with the EPM) to experimentally measured reflectance spectra, each spectrum representing the intensity of electromagnetic radiation reflected from etched features on the substrate surface over a range of wavelengths. The sets of reflectance spectra used for the optimization (both the spectra generated with the EPM and the experimentally measured spectra) may also correspond to a sequence of etch time steps (i.e., representing different time snapshots of one or more etch processes). As discussed in detail above, an EPM typically calculates a theoretical etch profile as it evolves over time during the etch process; thus, by including reflectance spectra from different etch time steps in the optimization, the optimized model is statistically valid over the sequence of etch times used in the optimization.

The spectral matching (SM) optimization process follows the general EPM optimization framework described above, for example with reference to fig. 3, except that the SM optimization operates on spectral reflectance rather than on etch profile coordinates. To this end, because the typical output of an EPM is a calculated etch profile represented by a series of etch profile coordinates, a calculated reflectance spectrum is generated by simulating the reflection of electromagnetic (EM) radiation from the calculated etch profile. Rigorous coupled wave analysis (RCWA), as known in the art, is one type of computational procedure that may be used for this purpose, but any suitable procedure for simulating the interaction of EM radiation with the substrate features under consideration may be employed.
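As a much simpler stand-in for RCWA, the following sketch computes the normal-incidence reflectance spectrum of a single uniform film on a substrate using the standard thin-film interference formula. It is not RCWA and ignores lateral feature geometry entirely; it only illustrates how a geometric parameter (here a hypothetical film thickness) maps to a reflectance spectrum. The refractive indices are assumed to be real and wavelength independent, and all names and default values are illustrative.

```python
import numpy as np

def thin_film_reflectance(wavelengths_nm, thickness_nm, n_ambient=1.0,
                          n_film=1.46, n_substrate=3.9):
    """Normal-incidence reflectance of one uniform film on a substrate.

    Assumes real, wavelength-independent refractive indices (illustrative values).
    """
    wavelengths_nm = np.asarray(wavelengths_nm, dtype=float)
    # Fresnel amplitude coefficients at the two interfaces.
    r01 = (n_ambient - n_film) / (n_ambient + n_film)
    r12 = (n_film - n_substrate) / (n_film + n_substrate)
    # Phase thickness of the film at each wavelength.
    beta = 2.0 * np.pi * n_film * thickness_nm / wavelengths_nm
    phase = np.exp(2j * beta)
    r = (r01 + r12 * phase) / (1.0 + r01 * r12 * phase)
    return np.abs(r) ** 2

wavelengths = np.linspace(400.0, 800.0, 201)
spectrum_thin = thin_film_reflectance(wavelengths, thickness_nm=100.0)
spectrum_thick = thin_film_reflectance(wavelengths, thickness_nm=300.0)
```

The two spectra differ because the interference fringes shift with thickness, which is the kind of geometry dependence that spectral matching relies on. A realistic calculation for patterned features would require a full EM solver such as RCWA.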

In any case, with the ability to generate reflectance spectra from the EPM, a general procedure can be implemented to optimize the EPM based on spectral reflectance. This procedure is now described with respect to fig. 6, which presents a flow diagram illustrating a set of operations 601 for adjusting and/or optimizing an etch profile model.

As described above, and in some embodiments, such an adjusted and/or optimized model reduces (and in some cases substantially minimizes) a metric associated with (indicating, quantifying, etc.) the combined difference between the etch profiles measured as a result of performing etch experiments and the corresponding calculated etch profiles generated from the model. In other words, the improved model may reduce the combined error over different experimental process conditions (as specified by different sets of values for the selected process parameters, which are used to calculate the independent input parameters to the EPM).

As shown in fig. 6, the reflectance spectrum based optimization process 601 begins at operation 610 with the selection of a set of model parameters to be optimized and the specification of their initial values. Again, these model parameters may be chosen to be parameters that characterize the underlying chemical and physical processes (reaction probabilities, sticking coefficients, etc.), some or all of which will be adjusted based on experimental data to improve the model. The initial values may be values found in the literature, values calculated from other simulations, values determined from experiments, values known from previous optimization processes, etc.

The model parameters selected and initialized in operation 610 are then optimized over a set of independent input parameters, which are selected and assigned multiple sets of values in operation 620. Such independent input parameters may include, for example, parameters that characterize the plasma in the reaction chamber: temperature, etchant flux, plasma density, etc. For each combination of values of the independent input parameters, an etch experiment is performed in operation 630 to measure an experimental reflectance spectrum. (In some embodiments, multiple etch experiments are performed for the same combination of values of the input parameters, and the resulting reflectance spectrum measurements are averaged together, possibly after discarding outliers, noisy spectra, etc.) This set of benchmarks is then used to adjust and optimize the model as follows. In operation 635, a set of calculated reflectance spectra is generated by running the EP model to produce etch profiles and then converting the calculated etch profiles to spectral reflectance as described above (e.g., by using RCWA); the calculated spectra correspond to the measured spectra from operation 630 and are thus generated for each combination of values of the input parameters. At this point, for each set of selected values of the independent input parameters there is a corresponding experimental reflectance spectrum and calculated reflectance spectrum suitable for comparison. The comparison is made in operation 640, in which an error metric indicative of (associated with, quantifying, etc.) the difference between the experimental and calculated reflectance spectra is calculated over all of the different sets of values of the input parameters.

Analogously to what is described above with reference to fig. 3, the set of calculated reflectance spectra (from which the error metric is calculated) corresponds to the previously selected set of model parameters specified in operation 610. The goal of the optimization process is to determine a more effective choice of these model parameters. Thus, in operation 650, it is determined whether the currently specified model parameters are such that the error metric calculated in operation 640 is locally minimized (in terms of the space of model parameters). If not, one or more values of the set of model parameters are modified in operation 660 and then used to generate a new set of reflectance spectra (operation 635 is repeated, as schematically indicated in the flowchart of fig. 6), after which a new error metric is calculated in a repetition of operation 640. The process then returns to operation 650, where it is determined whether the new combination of model parameters represents a local minimum of the error metric as evaluated over all of the sets of input parameters. If so, the optimization process ends, as shown. If not, the model parameters are modified again in operation 660 and the loop is repeated.
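The loop of operations 635 through 660 can be sketched as follows. Everything here is illustrative: `toy_spectrum` is a hypothetical closed-form stand-in for "run the EPM, then convert the profile to a spectrum", the "measured" spectra are synthesized from known parameter values so the result can be checked, and finite-difference gradient descent with step halving is just one possible way to modify the model parameters.

```python
import numpy as np

WAVELENGTHS = np.linspace(400.0, 800.0, 30)   # nm, illustrative grid

def toy_spectrum(model_params, inputs):
    """Hypothetical stand-in for the EPM plus the profile-to-spectrum conversion."""
    rate_coeff, sidewall_coeff = model_params
    flux, temperature = inputs
    depth = rate_coeff * flux
    return (np.exp(-depth * 400.0 / WAVELENGTHS)
            + sidewall_coeff * temperature * WAVELENGTHS / 800.0)

def error_metric(model_params, input_sets, measured):
    """Operations 635 and 640: squared spectral differences over all input sets."""
    return sum(np.sum((toy_spectrum(model_params, a) - m) ** 2)
               for a, m in zip(input_sets, measured))

def gradient(model_params, input_sets, measured, h=1e-6):
    """Central finite-difference gradient of the error metric."""
    grad = np.zeros_like(model_params)
    for i in range(model_params.size):
        delta = np.zeros_like(model_params)
        delta[i] = h
        grad[i] = (error_metric(model_params + delta, input_sets, measured)
                   - error_metric(model_params - delta, input_sets, measured)) / (2.0 * h)
    return grad

def optimize_model(initial_params, input_sets, measured,
                   step=0.1, tol=1e-10, max_iter=20000):
    params = np.asarray(initial_params, dtype=float)
    err = error_metric(params, input_sets, measured)
    for _ in range(max_iter):
        grad = gradient(params, input_sets, measured)
        if step * np.linalg.norm(grad) < tol:      # operation 650: near a local minimum
            break
        trial = params - step * grad               # operation 660: modify the parameters
        trial_err = error_metric(trial, input_sets, measured)   # repeat 635 and 640
        if trial_err < err:
            params, err = trial, trial_err
        else:
            step *= 0.5                            # step was too long, so shorten it
    return params, err

TRUE_PARAMS = np.array([0.8, 0.3])                         # used only to synthesize data
input_sets = [(1.0, 1.0), (1.5, 0.8), (2.0, 1.2)]          # (flux, temperature) combinations
measured = [toy_spectrum(TRUE_PARAMS, a) for a in input_sets]
initial_err = error_metric(np.array([0.5, 0.5]), input_sets, measured)
fitted, final_err = optimize_model([0.5, 0.5], input_sets, measured)
```

Note that the error metric is accumulated over all sets of input parameter values before any model parameter is changed, so a single set of model parameters is fitted to every experimental condition at once.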

If it is desired (in the manner described above) to optimize the EPM for etch processes of different durations, or to optimize the EPM for calculating reflectance spectra at a sequence of times over the course of an etch process, then one consideration is the extent to which the experimental reflectance spectra used to optimize the EPM can be accurately determined from optical measurements over the course of the etch process. A related issue is the rate at which these measurements can be performed during the etch process.

In general, measurements of spectral reflectance can be performed in situ or ex situ. Ex situ measurements are typically more accurate because they use an external dedicated metrology tool (external to the etch chamber), but such measurements require the wafer to be removed from the etch chamber, and the etch process therefore to be stopped, in order to make use of the tool. Because stopping and restarting an etch process introduces various systematic errors relative to an etch process of continuous duration, accumulating reflectance spectra ex situ for a sequence of different etch times typically involves etching a sequence of different wafers, each for a different desired duration, and then measuring the reflectance from each wafer separately. In situ spectral reflectance measurements, on the other hand, may be performed continuously (or substantially continuously, or at least rather rapidly) without interrupting the ongoing etch process, so that a single wafer may be used to generate reflectance spectra corresponding to a sequence of etch times (which also eliminates, or at least reduces, the possibility of wafer-to-wafer variation being interpreted as representing the etch time dependence of the reflectance spectra). However, wafer-to-wafer variation aside, in situ spectral reflectance measurements tend, for a variety of reasons, to be less accurate than measurements made with a dedicated external metrology tool.

However, disclosed herein are ways to obtain (at least to some extent) the advantages of both ex situ and in situ spectral reflectance measurements without their respective disadvantages. Specifically, the strategy is to optimize the EPM using experimental reflectance spectra that are generated from rapid in situ spectral reflectance (optical) measurements taken during an ongoing etch process (at the sequence of etch times for which it is desired to optimize the EPM) and that are calibrated using ex situ measurements taken with a dedicated metrology tool.

This may be done as follows. One or more wafers are etched for a duration that encompasses the desired sequence of etch times, and spectral reflectance optical measurements are taken in situ throughout the ongoing etch process. The measurement rate may be quite rapid, e.g., with a frequency of 1 Hz, 2 Hz, 5 Hz, 10 Hz, 15 Hz, 20 Hz, 50 Hz, or even 100 Hz. In some embodiments, the optical measurements taken at successive etch times in at least a portion of the sequence of etch times are spaced apart by 0.01-1 second (i.e., having a frequency of 100 Hz to 1 Hz), or by 0.05-0.5 seconds (i.e., having a frequency of 20 Hz to 2 Hz). Separately, a set of wafers is etched for different specified etch durations, and after each etch process is complete, the wafer is removed from the process chamber in which it was etched and its reflectance spectrum is optically measured ex situ using a dedicated external metrology tool. The in situ measurements at the different etch times are then calibrated by comparing them to the ex situ measurements for the corresponding durations and adjusting the in situ reflectance spectral intensities accordingly. The reflectance spectra generated from in situ optical measurements calibrated with ex situ optical measurements may then be used in the EPM optimization described with reference to fig. 3.
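One possible form of this calibration is sketched below. The text above does not specify how the in situ intensities are adjusted; this example assumes a multiplicative, per-wavelength correction that is interpolated linearly between the ex situ etch durations. The function name, the synthetic spectra, and the drift model are all illustrative assumptions.

```python
import numpy as np

def calibrate_in_situ(t_in, spectra_in, t_ex, spectra_ex):
    """Scale in situ spectra to agree with ex situ spectra at the ex situ times.

    t_in: (T,) in situ sample times; spectra_in: (T, W) in situ intensities.
    t_ex: (E,) ex situ etch durations; spectra_ex: (E, W) ex situ intensities.
    Assumes a multiplicative per-wavelength correction varying smoothly in time.
    """
    calibrated = np.empty_like(spectra_in)
    for w in range(spectra_in.shape[1]):
        # In situ intensity at the ex situ durations (linear interpolation in time).
        in_at_ex = np.interp(t_ex, t_in, spectra_in[:, w])
        gain_at_ex = spectra_ex[:, w] / in_at_ex
        # Interpolate the gain to every in situ time and apply it.
        gain = np.interp(t_in, t_ex, gain_at_ex)
        calibrated[:, w] = gain * spectra_in[:, w]
    return calibrated

t_in = np.linspace(0.0, 60.0, 601)                 # 10 Hz sampling for 60 s
wavelengths = np.array([450.0, 550.0, 650.0])      # nm
# Synthetic "true" reflectance, used here in place of real measurements.
true_spectra = 0.5 + 0.3 * np.cos(
    2.0 * np.pi * (100.0 + 2.0 * t_in[:, None]) / wavelengths)
# In situ data with a slow multiplicative drift.
spectra_in = true_spectra / (1.1 + 0.002 * t_in[:, None])
ex_idx = np.array([0, 200, 400, 600])              # ex situ wafers at 0, 20, 40, 60 s
t_ex, spectra_ex = t_in[ex_idx], true_spectra[ex_idx]

calibrated = calibrate_in_situ(t_in, spectra_in, t_ex, spectra_ex)
```

In this synthetic case the required correction is exactly linear in time, so four ex situ wafers are enough to recover all 601 in situ spectra. With real data the correction between ex situ points would only be approximate.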

The optimization process may also be performed with respect to a reduced-dimension subspace (RDS), analogously to what was done with respect to the etch profile space, but in this case with respect to a reduction of the dimension of the spectral space; this involves using the RDS to calculate the error metric that is minimized (typically locally, or approximately so) in the optimization. One way to construct the RDS is by PCA: instead of performing PCA in the space of etch profile coordinates as described above, PCA can be performed on the full space of reflectance spectra. In doing so, a significant reduction in the dimension of the spectral space can be achieved without significantly compromising statistical error in the numerical optimization. Here, PCA can identify the important canonical spectral shapes, and it also (as described above) provides an estimate of how many shapes should be included to achieve a desired level of statistical accuracy. In this way, as when this is done in the etch profile coordinate space, the number of data points that must be fit in the numerical optimization can be significantly reduced, and convergence of the numerical optimization can be achieved more quickly.

Likewise, as in the case of optimization in the etch profile coordinate space, it is noted that there are different possible strategies for employing an RDS in an optimization procedure such as that presented in fig. 6, whether the RDS is constructed via PCA, via PLS (described below), or otherwise. Thus, for example, in the context of the manner in which the error metric is calculated in operation 640 of fig. 6, one way to employ the dimension reduction procedure is to separately project the calculated reflectance spectrum and the corresponding experimental reflectance spectrum onto the RDS, and then calculate the difference between the reflectance spectra as projected onto the subspace. Another way is to take the difference between the calculated reflectance spectrum and the corresponding experimental reflectance spectrum, and then project the difference onto a reduced-dimension subspace representing the possible differences between the experimental and calculated reflectance spectra; the total error metric is then taken to be the combined length of these vectors in the difference-subspace (of reflectance spectra).

Another way to construct an RDS, rather than performing PCA, is simply to select a specific set of spectral wavelengths and treat these selected wavelengths as the basis set of the RDS. In doing so, projecting two reflectance spectra onto the RDS and calculating their difference (in the RDS) is equivalent to calculating the differences in intensity of the reflectance spectra at these particular wavelengths; if, for example, the squares of these differences are summed and the square root is taken, the error metric is a number proportional to the root mean square (RMS) error (over those wavelengths). Generalizing this, the error metric may be given as a weighted sum of quantities monotonically related to the magnitudes of the differences between the corresponding experimental and calculated reflectance spectra at the particular selected wavelengths.

Furthermore, if the experimental and calculated reflectance spectra to be compared in the optimization procedure correspond to a sequence of different etch times, an additional criterion defining the RDS may be the selection of these particular etch times. Thus, in such embodiments, the RDS is determined based on the selection of particular spectral wavelengths and the identification of the particular etch times at which those wavelengths are considered. Furthermore, in some such embodiments, different wavelengths and etch times may be weighted differently in the calculation of the error metric. Thus, for example, if the spectral data at certain etch times is more probative than the spectral data at other etch times, the former may be weighted more heavily (i.e., the weight of a particular wavelength at a particular etch time may be set greater than some or all of the weights corresponding to the same wavelength at other etch times). Additionally (or alternatively), different wavelengths of the reflectance spectra may be weighted differently in the analysis, even at the same etch time.
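A weighted error metric over selected wavelengths and etch times might be sketched as follows. The weighted RMS form, the function name, and the index and weight values are assumptions chosen for illustration; any quantity monotonically related to the spectral differences could be used instead of the squared difference.

```python
import numpy as np

def weighted_spectral_error(calc, meas, wavelength_idx, time_idx, weights):
    """Weighted RMS-style error over chosen wavelengths and etch times.

    calc, meas: (n_times, n_wavelengths) reflectance arrays.
    wavelength_idx, time_idx: indices defining the reduced-dimension subspace.
    weights: (len(time_idx), len(wavelength_idx)) non-negative weights.
    """
    sub = np.ix_(time_idx, wavelength_idx)
    diff = calc[sub] - meas[sub]
    return float(np.sqrt(np.sum(weights * diff**2) / np.sum(weights)))

# Toy spectra: 3 etch times by 4 wavelengths, differing only at the last time.
calc = np.arange(12, dtype=float).reshape(3, 4) / 10.0
meas = calc.copy()
meas[2, :] += 0.1

times, waves = [0, 2], [1, 3]
uniform = weighted_spectral_error(calc, meas, waves, times, np.ones((2, 2)))
late_heavy = weighted_spectral_error(
    calc, meas, waves, times, np.array([[1.0, 1.0], [3.0, 3.0]]))
```

With uniform weights the metric reduces to the plain RMS difference over the selected points. Weighting the last etch time more heavily raises the metric here because that is where the spectra disagree.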

Another alternative for constructing the RDS is to perform a Partial Least Squares (PLS) analysis. PLS analysis makes use of the following principle: the (reflectance) spectral history of an etch profile as it evolves during an etch process is predictive of the etch profile later in the etch process and/or at the end of the etch process. An illustration is provided in fig. 7A, which shows 4 reflectance spectra corresponding to 4 successive times during an etch process (t₀, t₁, t₂, and t_EP, where "EP" denotes the final etch profile of the feature), associated with the feature (shown on the right in the figure) as it is etched down. As can be seen from the figure, the reflectance spectrum changes as the profile of the feature changes during the etch process, and therefore a statistical model can be generated by PLS analysis that relates the geometric coordinates of the feature's etch profile at the end of the etch process to the various reflectance values of particular wavelengths at particular times earlier in the etch process. The PLS analysis can identify which spectral wavelengths, at which times earlier in the etch process, are most predictive of the final etch profile, and the model can also assess the sensitivity of the final etch profile to these wavelengths and/or times. These spectral wavelengths at the particular times may then be designated as the basis set of the RDS with respect to which the EPM is optimized. Furthermore, the determination by the PLS analysis of the relative statistical significance of these designated wavelengths at the particular times provides a basis for weighting them more heavily in the numerical optimization of the EPM, for example by defining statistical weights in the error metric.

In other words, PLS analysis of the relationship between the geometric etch profile coordinates and the reflectance spectra earlier in the etch process can be used to identify sensitive spectral regions over the course of the etch process from which an effective RDS can be constructed, and the relative statistical weights given to these identified wavelengths at the identified earlier etch process times can be used in the calculation of the error metric with respect to which the EPM parameter optimization is performed. Note that using such an RDS for EPM optimization is likely to be efficient because it targets the statistically significant regions of the spectral space (as a function of etch time).

The aforementioned PLS analysis and the resulting PLS model (which provides a strategy for differentially weighting particular spectral wavelengths, etch times, etc.) would be statistically robust if constructed from etch process data (sets of reflectance spectra and corresponding etch profile coordinates for different etch times) collected over many different wafers subjected to a range of etch process conditions, which may roughly correspond to the range of process conditions over which the model parameters of the EPM are to be optimized (using the RDS). Fig. 7B schematically shows sets of reflectance spectrum data collected over a number of wafers in the form of a 3-D data block, where the 3 indices of the data block correspond to wafer number (i), spectral wavelength (j), and etch process time (k). As shown, the 3-D data block may be "unfolded" into a 2-D "X" data block in which the data for each wafer form a concatenated vector of length K times J, where K is the number of time points and J is the number of wavelengths. (The stride of the concatenated data vector is the number of wavelengths J.) These are the independent variables for the PLS analysis. As shown, the dependent variables for the PLS analysis are in a 2-D "Y" data block, which contains the N final geometric etch profile coordinates for each of the I wafers. From this complete training data set, the PLS analysis builds a regression model to predict the dependence of the final etch profile coordinates on the reflectance spectrum data at intermediate times during the etch process.
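The unfolding of the data block and a PLS regression on the result can be sketched as follows. This is a minimal illustration under several assumptions: the data are random, only a single final profile coordinate is regressed (a "PLS1" fit via the NIPALS algorithm rather than a full multi-coordinate Y block), and the function names are hypothetical.

```python
import numpy as np

def unfold(block):
    """Unfold block[i, j, k] (wafer, wavelength, time) into X of shape (I, K*J).

    Each row holds all J wavelengths at time 0, then all J at time 1, and so on,
    so the stride between consecutive time points is J.
    """
    n_wafers, n_wavelengths, n_times = block.shape
    return block.transpose(0, 2, 1).reshape(n_wafers, n_times * n_wavelengths)

def pls1_fit(X, y, n_components):
    """NIPALS PLS regression for a single response y (one profile coordinate)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        q_a = yk @ t / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q_a * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

rng = np.random.default_rng(2)
I, J, K = 30, 3, 2                                  # wafers, wavelengths, time points
block = rng.normal(size=(I, J, K))
X = unfold(block)
# Synthetic final profile coordinate depending on wavelength 1 at both time points.
y = 2.0 * X[:, 1] - X[:, 4]
coef, x_mean, y_mean = pls1_fit(X, y, n_components=X.shape[1])
predicted = pls1_predict(X, coef, x_mean, y_mean)
```

The magnitudes of the entries of `coef` indicate how sensitive the predicted coordinate is to each (time, wavelength) pair, which is the kind of information that could be used to choose and weight an RDS. Using as many components as there are columns in X, as done here, reproduces an ordinary least squares fit when X has full column rank; a practical PLS model would normally use fewer components.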

Note that while such etch profile and spectral reflectance data may be experimentally measured (to serve as a training set for the PLS model) by performing the etch process (and measuring reflectance) on a series of different wafers, such experiments can be expensive and time consuming. However, if the EPM is already of sufficient accuracy, e.g. optimized by the above process, a more efficient process may be to generate the etching data sets using the EPM and use them to build/train the PLS model. In principle, a combination of experimental etch profiles and experimental spectral reflectance data and computer-generated etch profiles and computer-generated spectral reflectance data may also be used.

In any case, using computer generated reflectance spectra to construct the PLS model suggests an iterative procedure, whereby a (potentially) non-optimized EPM is used to generate a training set of reflectance spectra for the PLS analysis, and the resulting PLS model is then used to identify an RDS (with statistical weights) for returning to the initial EPM and optimizing it. The newly optimized EPM can then in turn be used to generate new sets of etch data to construct a new (and better) PLS model, which identifies a new RDS for further optimizing the EPM, and so on. The procedure may continue in this manner (back and forth between EPM optimization and PLS optimization) for some predetermined number of iterations, or until no significant improvement in the PLS and/or EP models is found in subsequent iterations. One variation is to start with an EPM optimized by any of the optimization techniques described above (e.g., not involving the PLS procedure) and proceed from there. Another variation is to use a few sets of experimentally measured etch process data to construct an initial PLS model independent of the EPM, and then proceed to identify the RDS for optimizing the initial EPM. Other variations on these general themes, and combinations thereof, will be apparent to those skilled in the art in view of the foregoing discussion.

The iterative method just described is schematically illustrated in fig. 8. As shown in fig. 8, the procedure 801 for generating an optimized PLS model begins with operation 810, in which an initial set of reflectance spectra and a corresponding set of etch profiles are received, both of which correspond to a sequence of etch process durations. The sequence of etch times may represent different times over the course of a single etch process, or it may represent etch processes of different total etch durations (in other words, etch processes carried out to completion, but on different substrates and for different total etch times). In any case, the initial training set of reflectance spectra (corresponding to the sequence of etch times) may have been measured experimentally, generated with a non-optimized EPM, or generated with an EPM optimized by another procedure such as one described above (e.g., one not involving PLS). After the training set is received, a PLS analysis is performed in operation 820 to generate an initial PLS model. The PLS model relates the etch profile coordinates (received in operation 810) to the reflectance spectra (also received in operation 810). In particular embodiments, the PLS analysis generates a regression model that expresses the dependence of the etch profile coordinates at a later time in the etch process, or even at the end of the etch process, on certain wavelengths of the reflectance spectra at particular times earlier in the etch process, as described above, as well as the statistical sensitivity of this dependence.

The initial PLS model may be sufficiently accurate for some purposes, and if it is determined in operation 830 that this is the case, the optimization procedure ends. However, if in operation 830 the PLS model is deemed to be insufficiently accurate, the procedure 801 continues to operation 840, where the current PLS model (as constructed in operation 820) is used to determine a (statistically significant) reduced-dimension subspace (RDS) along with statistical weights for defining an effective error metric (as described above). The new statistically weighted spectral error metric is then used in operation 850 to optimize the EPM according to, for example, the EPM optimization process described with respect to fig. 6. Such a statistically weighted error metric can be used (in an optimization such as that of fig. 6) as an effective measure of the difference between the reflectance spectra calculated by the EPM and the corresponding measured reflectance spectra, evaluated in the spectral subspace (of the full spectral space) that the PLS procedure deems statistically significant.

The EPM optimization process may use the same spectral data as used in operation 820, or it may use different spectral data (but, again, it is optimized with the new spectral error metric defined in operation 840). In any case, once the EPM is optimized (in operation 850), it can be used to generate a new (and perhaps quite extensive) set of calculated reflectance spectra. This is accomplished by generating a set of calculated etch profiles in operation 860 and then using these calculated etch profiles in operation 865 to generate a set of calculated reflectance spectra (e.g., by using RCWA as described above). These spectra may then be fed back as a training set to operation 820, in which a new PLS model is generated based on the new (and possibly quite extensive) training set. The statistical accuracy of the new PLS model is evaluated in operation 830, and the loop of operations (840, 850, 860, 865, 820, and 830) may be repeated until, in one of the repetitions of operation 830, the PLS model is deemed to have sufficient statistical accuracy.
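The control flow of procedure 801 can be sketched as a loop over caller-supplied steps. Every callable below is a hypothetical placeholder (none corresponds to an actual implementation of PLS fitting, EPM optimization, or spectrum generation), the `max_rounds` cap is an added safeguard, and the integer "quality" values in the demonstration are purely illustrative.

```python
def iterate_pls_and_epm(training_set, fit_pls, is_accurate, derive_metric,
                        optimize_epm, generate_training_set, max_rounds=10):
    """Alternate PLS fitting and EPM optimization until the PLS model is accepted.

    Every callable is a hypothetical placeholder supplied by the caller.
    Returns (pls_model, epm, number_of_completed_rounds).
    """
    pls_model, epm = None, None
    for round_index in range(max_rounds):
        pls_model = fit_pls(training_set)             # operation 820
        if is_accurate(pls_model):                    # operation 830
            return pls_model, epm, round_index
        metric = derive_metric(pls_model)             # operation 840
        epm = optimize_epm(metric)                    # operation 850
        training_set = generate_training_set(epm)     # operations 860 and 865
    return pls_model, epm, max_rounds

# Stub demonstration: each object is just an integer "quality" that improves per round.
result = iterate_pls_and_epm(
    training_set=0,
    fit_pls=lambda data: data,
    is_accurate=lambda model: model >= 3,
    derive_metric=lambda model: model,
    optimize_epm=lambda metric: metric + 1,
    generate_training_set=lambda epm: epm,
)
```

In the stub run, the PLS model is accepted on the fourth pass through operation 830, after three rounds of EPM optimization. In practice the acceptance test in operation 830 would be a statistical accuracy criterion rather than a simple threshold.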

It should be noted that while such a PLS model is useful for optimizing an EPM (by identifying a "good" RDS), it is also independently useful for etch endpoint detection procedures, such as those described in co-pending U.S. patent application attorney docket No. LAMRP230 (incorporated herein by reference in its entirety for all purposes). For example, as described above, the PLS model may be viewed as a statistical determination of which spectral regions, over the course of the etch process, are more/most predictive of the final etch profile resulting from the etch process. Thus, the construction of the PLS model is effectively a sensitivity analysis that identifies which spectral regions can be monitored over the course of the etch process to determine when the feature profile has been sufficiently etched (i.e., for endpoint detection). It should therefore also be noted that statistically weighting the EPM optimization in favor of those spectral regions (as a function of etch time) that are important in the PLS model, in addition to potentially leading to a more efficient EPM optimization, has the further benefit of enhancing the statistical accuracy of the PLS sensitivity analysis. This is because the PLS model is then constructed from etch profile data sets generated by an EPM whose optimization was statistically weighted in favor of the same regions of the spectral space (over the course of the etch process) that the PLS analysis deems important.

Capacitively Coupled Plasma (CCP) reactor for etching operations

Capacitively Coupled Plasma (CCP) reactors are described in the following documents: U.S. patent No. 8,552,334, entitled "ADJUSTABLE GAP CAPACITIVELY COUPLED RF PLASMA REACTOR AND NON-CONTACT PARTICLE SEAL," filed on February 9, 2009 (U.S. patent application No. 12/367,754), and U.S. patent application No. 14/539,121, entitled "ADJUSTMENT OF VUV EMISSION OF A PLASMA VIA COLLISIONAL RESONANT ENERGY TRANSFER TO AN ENERGY ABSORBER GAS," filed on November 12, 2014, each of which is incorporated herein by reference in its entirety for all purposes.

For example, FIGS. 9A-9C illustrate an embodiment of an adjustable gap capacitively coupled confined RF plasma reactor 900. As depicted, the vacuum processing chamber 902 includes a chamber housing 904 surrounding an interior space that houses a lower electrode 906. In the upper portion of the chamber 902, an upper electrode 908 is vertically spaced from the lower electrode 906. The planar surfaces of the upper electrode 908 and the lower electrode 906 (configured to generate plasma) are substantially parallel to each other and orthogonal to the vertical direction between the electrodes. Preferably, the upper electrode 908 and the lower electrode 906 are circular and coaxial with respect to a vertical axis. The lower surface of the upper electrode 908 faces the upper surface of the lower electrode 906. These spaced-apart, opposing electrode surfaces define an adjustable gap 910 therebetween. During plasma generation, the lower electrode 906 is supplied with RF power from an RF power source (match) 920. RF power is supplied to the lower electrode 906 through an RF supply conduit 922, an RF strap 924, and an RF power member 926. A ground shield 936 may surround the RF power member 926 to supply a more uniform RF field to the lower electrode 906. As described in U.S. patent publication No. 2008/0171444, the entire contents of which are incorporated by reference herein for all purposes, a wafer is inserted through a wafer port 982 and supported in the gap 910 on the lower electrode 906 for processing, and process gases are supplied to the gap 910 and excited into a plasma state by the RF power. The upper electrode 908 may be powered or grounded.

In the embodiment shown in FIGS. 9A-9C, the lower electrode 906 is supported on a lower electrode support plate 916. An insulating ring 914 interposed between the lower electrode 906 and the lower electrode support plate 916 insulates the lower electrode 906 from the support plate 916. An RF bias housing 930 supports the lower electrode 906 on an RF bias housing basin 932. The basin 932 is connected through an opening in the chamber wall plate 918 to a conduit support plate 938 by an arm 934 of the RF bias housing 930. In a preferred embodiment, the RF bias housing basin 932 and the RF bias housing arm 934 are integrally formed as one piece; however, the arm 934 and the basin 932 may also be two separate pieces that are bolted or otherwise joined together.

The RF bias housing arm 934 includes one or more hollow passages for passing RF power and facilities, such as gaseous coolant, liquid coolant, RF energy, cables for lift pin control, and electrical monitoring and actuating signals, from outside the vacuum chamber 902 to the space within the vacuum chamber 902 on the back side of the lower electrode 906. The RF supply conduit 922 is insulated from the RF bias housing arm 934, and the RF bias housing arm 934 provides a return path for RF power to the RF power source 920. A facility conduit 940 provides a passageway for facility components. Further details of the facility components are described in U.S. patent No. 5,948,704 and U.S. patent publication No. 2008/0171444 (both of which are incorporated by reference herein in their entirety for all purposes), and are not shown here for simplicity of description. The gap 910 is preferably surrounded by a confinement ring assembly (not shown), the details of which may be found in U.S. patent publication No. 2007/0284045 (the entire contents of which are incorporated by reference herein for all purposes).

The conduit support plate 938 is coupled to an actuation mechanism 942. Details of the actuation mechanism are described in U.S. patent publication No. 2008/0171444 (which is incorporated by reference herein in its entirety for all purposes). The actuation mechanism 942, such as a servo-mechanical motor, stepper motor or the like, is coupled to a vertical linear bearing 944 through, for example, a screw gear 946 (e.g., a ball screw) and a motor for rotating the ball screw. During operation to adjust the size of the gap 910, the actuation mechanism 942 travels along the vertical linear bearing 944. FIG. 9A shows the arrangement when the actuation mechanism 942 is in a high position on the linear bearing 944, which creates a small gap 910a. FIG. 9B shows the arrangement when the actuation mechanism 942 is in a middle position on the linear bearing 944. As shown, the lower electrode 906, RF bias housing 930, conduit support plate 938, and RF power source 920 have all moved downward relative to the chamber housing 904 and the upper electrode 908, thereby creating a medium-sized gap 910b.

FIG. 9C shows a large gap 910c when the actuation mechanism 942 is in a low position on the linear bearing 944. Preferably, the upper electrode 908 and the lower electrode 906 remain coaxial during gap adjustment, and the opposing surfaces of the upper and lower electrodes across the gap remain parallel.

This embodiment allows the gap 910 between the upper electrode 908 and the lower electrode 906 in the CCP chamber 902 to be adjusted during a multi-step etch process recipe, for example, to maintain uniform etching across larger diameter substrates (e.g., 300 mm wafers or flat panel displays). In particular, this embodiment relates to a mechanism that facilitates the linear motion required to provide an adjustable gap between the lower electrode 906 and the upper electrode 908.

FIG. 9A shows the laterally deflected bellows 950 sealed at its proximal end to the conduit support plate 938 and at its distal end to a stepped flange 928 of the chamber wall plate 918. The inner diameter of the stepped flange defines an opening 912 in the chamber wall plate 918, through which the RF bias housing arm 934 passes. The laterally deflected bellows 950 provides a vacuum seal while allowing vertical movement of the RF bias housing 930, the conduit support plate 938, and the actuation mechanism 942. The RF bias housing 930, the conduit support plate 938, and the actuation mechanism 942 may be referred to as a cantilever assembly. Preferably, the RF power source 920 moves with the cantilever assembly and may be attached to the conduit support plate 938. FIG. 9B shows the bellows 950 in an intermediate position when the cantilever assembly is in the intermediate position. FIG. 9C shows the bellows 950 laterally deflected when the cantilever assembly is in the low position.

A labyrinth seal 948 provides a particle barrier between the bellows 950 and the interior of the plasma processing chamber housing 904. A fixed shield 956 is immovably attached to the inner wall of the chamber housing 904 at the chamber wall plate 918 to provide a labyrinth groove 960 (slot) in which a movable shield 958 moves vertically to accommodate vertical movement of the cantilever assembly. The outer portion of the movable shield 958 remains in the groove at all vertical positions of the lower electrode 906.

In the illustrated embodiment, the labyrinth seal 948 includes a fixed shield 956 attached to the inner surface of the chamber wall plate 918 at the periphery of the opening 912 in the chamber wall plate 918, defining the labyrinth groove 960. The movable shield 958 is attached to the RF bias housing arm 934 and extends radially from it where the arm 934 passes through the opening 912 in the chamber wall plate 918. The movable shield 958 extends into the labyrinth groove 960 while being spaced apart from the fixed shield 956 by a first gap and from the inner surface of the chamber wall plate 918 by a second gap, thereby allowing the cantilever assembly to move vertically. The labyrinth seal 948 blocks particles that flake off the bellows 950 from entering the vacuum chamber interior, and blocks radicals from the process gas plasma from migrating to the bellows 950, where the radicals can form deposits that subsequently flake off.

FIG. 9A shows the movable shield 958 at a higher position in the labyrinth groove 960 above the RF bias housing arm 934 when the cantilever assembly is in the high position (small gap 910a). FIG. 9C shows the movable shield 958 at a lower position in the labyrinth groove 960 above the RF bias housing arm 934 when the cantilever assembly is in the low position (large gap 910c). FIG. 9B shows the movable shield 958 at a middle position within the labyrinth groove 960 when the cantilever assembly is in the intermediate position (medium-sized gap 910b). While the labyrinth seal 948 is shown as symmetrical about the RF bias housing arm 934, in other embodiments the labyrinth seal 948 may be asymmetrical about the RF bias housing arm 934.

Inductively Coupled Plasma (ICP) reactor for etching operations

Inductively Coupled Plasma (ICP) reactors are described in the following documents: U.S. patent publication No. 2014/0170853, entitled "IMAGE REVERSAL WITH AHM GAP FILL FOR MULTIPLE PATTERNING," filed on December 10, 2013, and U.S. patent application No. 14/539,121, entitled "ADJUSTMENT OF VUV EMISSION OF A PLASMA VIA COLLISIONAL RESONANT ENERGY TRANSFER TO AN ENERGY ABSORBER GAS," filed on November 12, 2014, each of which is incorporated herein by reference in its entirety for all purposes.

For example, FIG. 10 schematically illustrates a cross-sectional view of an inductively coupled plasma etching apparatus 1000 suitable for practicing certain embodiments of the present invention, an example of which is a Kiyo™ reactor produced by Lam Research Corp. of Fremont, Calif. The inductively coupled plasma etching apparatus 1000 includes an overall processing chamber structurally defined by a chamber wall 1001 and a window 1011. The chamber wall 1001 may be made of stainless steel or aluminum. The window 1011 may be made of quartz or other dielectric material. An optional internal plasma grid 1050 divides the overall etch chamber into an upper sub-chamber 1002 and a lower sub-chamber 1003. In most embodiments, the plasma grid 1050 can be removed, thereby utilizing the chamber space made up of sub-chambers 1002 and 1003. A chuck 1017 is positioned in the lower sub-chamber 1003 near the bottom interior surface. The chuck 1017 is configured to receive and hold a semiconductor wafer 1019 on which the etching process is performed. The chuck 1017 can be an electrostatic chuck for supporting the wafer 1019 when present. In some embodiments, an edge ring (not shown) surrounds the chuck 1017 and has an upper surface that is substantially coplanar with a top surface of the wafer 1019 (when a wafer is present on the chuck 1017). The chuck 1017 further comprises an electrostatic electrode for clamping and releasing the wafer. A filter and DC clamp power source (not shown) may be provided for this purpose. Other control systems may also be provided for lifting the wafer 1019 off the chuck 1017. The chuck 1017 can be electrically charged using an RF power source 1023. The RF power source 1023 is connected to a matching circuit 1021 through a connection 1027. The matching circuit 1021 is connected to the chuck 1017 through a connection 1025. In this manner, the RF power source 1023 is connected to the chuck 1017.

The components for plasma generation include a coil 1033 located above the window 1011. The coil 1033 is made of a conductive material and includes at least one full turn. The example coil 1033 shown in FIG. 10 includes three turns. The cross-sections of the coil 1033 are shown with symbols: the cross-sections marked with an "X" extend rotationally into the page, while the cross-sections marked with a "·" extend rotationally out of the page. The components for plasma generation also include an RF power source 1041 configured to provide RF power to the coil 1033. Typically, the RF power source 1041 is connected to a matching circuit 1039 through a connection 1045. The matching circuit 1039 is connected to the coil 1033 through a connection 1043. In this manner, the RF power source 1041 is connected to the coil 1033. An optional Faraday shield 1049 is positioned between the coil 1033 and the window 1011. The Faraday shield 1049 is maintained in a spaced relationship relative to the coil 1033 and is disposed directly above the window 1011. The coil 1033, the Faraday shield 1049, and the window 1011 are each configured to be substantially parallel to one another. The Faraday shield can prevent metal or other materials from depositing on the dielectric window of the plasma chamber.

Process gases (e.g., helium, neon, etchant, etc.) may flow into the process chamber through one or more main gas flow inlets 1060 located in the upper chamber and/or through one or more side gas flow inlets 1070. Likewise, although not explicitly shown, similar gas flow inlets may be used to supply process gases to the capacitively coupled plasma processing chamber shown in FIGS. 9A-9C. A vacuum pump, such as a one-stage or two-stage dry mechanical pump and/or a turbomolecular pump 1040, may be used to draw process gases out of the process chamber and to maintain the pressure within the process chamber. A valve-controlled conduit may be used to fluidly connect the vacuum pump to the process chamber so as to selectively control application of the vacuum environment provided by the vacuum pump. During operational plasma processing, this may be done using a closed-loop-controlled flow restriction device, such as a throttle valve (not shown) or a pendulum valve (not shown). Similarly, a vacuum pump and a valve-controlled fluid connection to the capacitively coupled plasma processing chamber of FIGS. 9A-9C may also be employed.

During operation of the apparatus, one or more process gases may be supplied through the gas flow inlets 1060 and/or 1070. In certain embodiments, the process gas may be supplied only through the main gas flow inlet 1060 or only through the side gas flow inlet 1070. In some cases, the gas flow inlets shown in the figure may be replaced by more complex gas flow inlets, for example, one or more showerheads. The Faraday shield 1049 and/or the optional grid 1050 can include internal channels and holes that allow process gas to be delivered to the chamber. One or both of the Faraday shield 1049 and the optional grid 1050 can act as a showerhead for delivering process gas.

Radio frequency power is supplied to the coil 1033 from an RF power source 1041 to cause RF current to flow through the coil 1033. RF current flowing through the coil 1033 generates an electromagnetic field around the coil 1033. The electromagnetic field generates an induced current within the upper sub-chamber 1002. The physical and chemical interactions of the various ions and radicals generated with the wafer 1019 selectively etch features on the wafer.

If a plasma grid is used so that there are both the upper sub-chamber 1002 and the lower sub-chamber 1003, an induced current acts on the gas present in the upper sub-chamber 1002 to generate an electron-ion plasma in the upper sub-chamber 1002. Optional internal plasma grid 1050 limits the amount of hot electrons in lower sub-chamber 1003. In some embodiments, the apparatus is designed and operated such that the plasma present in the lower sub-chamber 1003 is an ion-ion plasma.

Both the upper electron-ion plasma and the lower ion-ion plasma may contain cations and anions, although the ion-ion plasma will have a greater anion to cation ratio. Volatile etch byproducts may be removed from lower subchamber 1003 through port 1022.

The chuck 1017 disclosed herein may operate at elevated temperatures ranging between about 10 °C and about 250 °C. The temperature will depend on the etching process operation and the specific recipe. In some embodiments, the chamber 1001 may also operate at pressures in the range of between about 1 mTorr and about 95 mTorr. In some embodiments, the pressure may be higher, as disclosed above.

The chamber 1001 may be coupled to facilities (not shown) when installed in a clean room or a fabrication facility. The facilities include plumbing that provides process gases, vacuum, temperature control, and environmental particle control. These facilities are coupled to the chamber 1001 when it is installed in the target fabrication facility. Additionally, the chamber 1001 may be coupled to a transfer chamber that allows robotics to transfer semiconductor wafers into and out of the chamber 1001 using typical automation.

FIG. 10 also shows a system controller 1051. As described further below, such a system controller 1051 may control some or all of the operations of an etcher apparatus, including adjusting the operation of the etcher in response to a calculated etch profile generated using an optimized EPM as described herein.

System controller

The system controller can be used to control the etching operation (or other processing operation) in any of the above-described processing apparatuses, such as the CCP etcher apparatus shown in FIGS. 9A-9C and/or the ICP etcher apparatus shown in FIG. 10. In particular, the system controller may implement an optimized EPM as described above and adjust the operation of the etcher apparatus in response to a calculated etch profile generated using the optimized EPM.

An example of a system controller in communication with an etcher apparatus is schematically illustrated in FIG. 10. As shown in FIG. 10, the system controller 1051 includes one or more memory devices 1056, one or more mass storage devices 1054, and one or more processors 1052. The processor 1052 may include one or more CPUs, ASICs, general and/or special purpose computers, one or more analog and/or digital input/output connections, one or more stepper motor controller boards, and the like.

In some embodiments, the system controller (1051 in fig. 10) controls some or all of the operations of the processing tool (etcher apparatus 1000 in fig. 10), including the operations of its individual processing stations. Machine readable system control instructions 1058 may be provided to implement/perform the film deposition and/or etching processes described herein. The instructions may be provided on a machine-readable non-transitory medium that is couplable to and/or readable by the system controller. The instructions may be executed on processor 1052, and in some embodiments, system control instructions are loaded into memory device 1056 from mass storage device 1054. The system control instructions may include instructions for controlling timing, mixing of gas and liquid reactants, chamber and/or station pressure, chamber and/or station temperature, wafer temperature, target power level, RF power level (e.g., DC power level, RF bias power level), RF exposure time, substrate pedestal, chuck and/or susceptor position, and other parameters of a particular process performed by the processing tool.

Semiconductor substrate processing operations can employ various types of processes including, but not limited to: processes related to etching of films on substrates, such as plasma-activated atomic layer etching (ALE) operations involving surface-adsorbed etchants (see, e.g., U.S. patent application No. 14/539,121, entitled "ADJUSTMENT OF VUV EMISSION OF A PLASMA VIA COLLISIONAL RESONANT ENERGY TRANSFER TO AN ENERGY ABSORBER GAS," filed on November 12, 2014, which is incorporated by reference herein in its entirety for all purposes); deposition processes, such as atomic layer deposition (ALD) performed by plasma activation of surface-adsorbed film precursors; and other types of substrate processing operations.

Thus, for example, with respect to a substrate processing apparatus for performing a plasma-based etch process, machine readable instructions executed by the system controller may include instructions for generating a calculated etch profile from the optimized EPM and adjusting operation of the plasma generator in response to the calculated etch profile.
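As a rough illustration of adjusting a plasma generator in response to a calculated etch profile, the sketch below applies a proportional correction to an RF power setpoint based on a single profile attribute (etch depth). Everything here is an assumption made for illustration: the function name `adjust_rf_power`, the choice of depth as the controlled attribute, the gain, the power limits, and the premise that raising RF power increases depth. The disclosure does not specify how the adjustment is computed.

```python
def adjust_rf_power(
    current_power_w: float,
    calculated_depth_nm: float,
    target_depth_nm: float,
    gain_w_per_nm: float = 0.5,   # hypothetical tuning constant
    min_power_w: float = 0.0,     # hypothetical hardware limits
    max_power_w: float = 1000.0,
) -> float:
    """Return a new RF power setpoint from the EPM-calculated depth error.

    Assumes (for illustration only) that more RF power yields a deeper etch,
    so a profile that is too shallow raises the setpoint.
    """
    depth_error_nm = target_depth_nm - calculated_depth_nm
    proposed_power_w = current_power_w + gain_w_per_nm * depth_error_nm
    # Clamp the proposal to the allowed operating window.
    return max(min_power_w, min(max_power_w, proposed_power_w))
```

In this sketch the calculated depth would come from running the optimized EPM with the current process parameters; the clamp keeps any correction within an allowed operating window.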

The system control instructions 1058 may be configured in any suitable manner. For example, various process tool component subroutines or control objects may be written to control the operation of the process tool components required to execute the processes of the various process tools. The system control instructions may be encoded in any suitable computer readable programming language. In some embodiments, the system control instructions are implemented in software, in other embodiments the instructions may be implemented in hardware, for example, hard-coded as logic in an ASIC (application specific integrated circuit), or, in other embodiments, as a combination of software and hardware.

In some embodiments, system control software 1058 may include input/output control (IOC) sequencing instructions for controlling the various parameters described above. For example, each stage of one or more deposition and/or etching processes may include one or more instructions for execution by the system controller. Instructions for setting the process conditions of a film deposition process stage and/or an etch process stage may be included in the corresponding deposition recipe stage and/or etch recipe stage, for example. In some embodiments, the recipe stages may be sequentially arranged, so that all instructions for a process stage are executed concurrently with that process stage.
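A sequentially arranged recipe of this kind can be pictured as an ordered list of stages, each carrying its own parameter settings. The sketch below is a minimal illustration only; the names `RecipeStage`, `run_recipe`, and `apply_setting`, as well as the example parameter keys, are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RecipeStage:
    """One stage of a recipe together with the settings it applies."""
    name: str
    duration_s: float
    settings: Dict[str, float] = field(default_factory=dict)


def run_recipe(
    stages: List[RecipeStage],
    apply_setting: Callable[[str, float], None],
) -> List[str]:
    """Visit stages in order, applying all of a stage's settings when it begins."""
    completed = []
    for stage in stages:
        for parameter, value in stage.settings.items():
            apply_setting(parameter, value)
        # A real controller would now hold these conditions for stage.duration_s.
        completed.append(stage.name)
    return completed
```

For example, a two-stage recipe might first stabilize chamber pressure with RF power off and then turn on RF bias power for the etch stage.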

In some implementations, other computer readable instructions and/or programs stored on the mass storage device 1054 and/or the memory device 1056 associated with the system controller 1051 can be employed. Examples of the programs or program segments include a substrate positioning program, a process gas control program, a pressure control program, a heater control program, and a plasma control program.

The substrate positioning program may include instructions for the process tool components that are used to load a substrate onto the pedestal and to control the spacing between the substrate and other components of the processing tool. The positioning program may include instructions for appropriately moving the substrate into and out of the reaction chamber as needed to deposit and/or etch a film on the substrate.

The process gas control program may include instructions for controlling gas composition and flow rates and optionally for flowing gases into volumes surrounding one or more processing stations prior to deposition and/or etching to stabilize pressures in these volumes. In some embodiments, a process gas control program may include instructions for introducing certain gases into a volume surrounding one or more processing stations in a processing chamber during deposition and/or etching operations on a substrate. The process gas control program may also include instructions to deliver these gases at the same rate for the same period of time, or at different rates and/or for different periods of time, depending on the composition of the film to be deposited and/or the characteristics of the etching process involved. The process gas control program may also include instructions for atomizing/vaporizing the liquid reactant in the presence of helium or some other carrier gas in the heated spray module.

The pressure control program may include instructions for controlling the pressure within the processing station by adjusting, for example, a throttle valve in an exhaust system of the processing station, a gas flow into the processing station, and the like. The pressure control program may include instructions for maintaining the same or different pressures during deposition of various types of films on the substrate and/or etching of the substrate.

The heater control program may include instructions for controlling current to a heating unit for heating the substrate. Alternatively or additionally, the heater control program may control the delivery of a heat transfer gas (e.g., helium) onto the substrate. The heater control program may include instructions for maintaining the same or different temperatures within the reaction chamber and/or the volume surrounding the processing station during deposition of various types of films on the substrate and/or etching of the substrate.

The plasma control program may include instructions for setting RF power levels, frequencies, and exposure times within one or more processing stations in accordance with embodiments of the present invention. In some embodiments, the plasma control program may include instructions for using the same or different RF power levels and/or frequencies and/or exposure times during deposition of a film on a substrate and/or etching of a substrate.

In some implementations, there may be a user interface associated with the system controller. The user interface may include a display screen, a graphical software display of the apparatus and/or process conditions, and user input devices such as a pointing device, keyboard, touch screen, microphone, and the like.

In some embodiments, the parameter adjusted by the system controller may relate to a process condition. Non-limiting examples include process gas composition and flow rate, temperature (e.g., substrate holder and showerhead temperature), pressure, plasma conditions (e.g., RF bias power level and exposure times), and the like. These parameters may be provided to the user in the form of a recipe, which may be entered using the user interface.

Signals for monitoring the process can be provided from various process tool sensors by analog and/or digital input connections of the system controller. The signals used to control the process may be output through analog and/or digital output connections of the processing tool. Non-limiting examples of process tool sensors that can be monitored include Mass Flow Controllers (MFCs), pressure sensors (e.g., pressure gauges), temperature sensors such as thermocouples, and the like. Suitably programmed feedback and control algorithms can be used with the data from these sensors to maintain process conditions.
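The disclosure does not name a particular feedback algorithm. As one common possibility, the sketch below shows a proportional-integral (PI) loop that turns a sensor reading into a clamped actuator command. The class name `PIController`, the gains, and the sign convention (a larger command raises the measured quantity, as when more gas flow raises chamber pressure) are assumptions made for illustration.

```python
class PIController:
    """Minimal proportional-integral controller with output clamping."""

    def __init__(self, kp: float, ki: float, output_min: float, output_max: float):
        self.kp = kp
        self.ki = ki
        self.output_min = output_min
        self.output_max = output_max
        self._integral = 0.0  # accumulated error over time

    def update(self, setpoint: float, measurement: float, dt_s: float) -> float:
        """Return the actuator command for one control interval of length dt_s."""
        error = setpoint - measurement
        self._integral += error * dt_s
        command = self.kp * error + self.ki * self._integral
        # Keep the command within the actuator's allowed range.
        return max(self.output_min, min(self.output_max, command))
```

Each call to `update` would use the latest reading from a sensor, such as a pressure gauge or thermocouple, and the result would be sent to the corresponding output connection. For an actuator that acts in the opposite direction (e.g., opening a throttle valve lowers pressure), the sign of the error would be reversed.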

The various apparatus and methods described above may be used in conjunction with lithographic patterning tools and/or processes, for example, for the manufacture or production of semiconductor devices, displays, light emitting diodes, photovoltaic panels, and the like. Typically, but not necessarily, such tools will be used together and/or simultaneously in a common manufacturing facility, or such processes will be performed together and/or simultaneously in a common manufacturing facility.

In some embodiments, the controller is part of a system, which may be part of the above-described embodiments. Such systems may include semiconductor processing equipment including one or more process tools, one or more chambers, one or more platforms for processing, and/or specific processing components (wafer susceptors, gas flow systems, etc.). These systems may be integrated with electronics for controlling their operation before, during, and after processing of a semiconductor wafer or substrate. The electronics may be referred to as a "controller," which may control various components or sub-portions of one or more systems. Depending on the process requirements and/or the type of system, the controller can be programmed to control any of the processes disclosed herein, including the delivery of process gases, temperature settings (e.g., heating and/or cooling), pressure settings, vacuum settings, power settings, radio frequency (RF) generator settings, RF matching circuit settings, frequency settings, flow rate settings, fluid delivery settings, positional and operation settings, and wafer transfers into and out of a tool and other transfer tools and/or load locks connected to or interfaced with a particular system.

In a broad sense, a controller may be defined as electronics having various integrated circuits, logic, memory, and/or software that receive instructions, issue instructions, control operations, enable cleaning operations, enable endpoint measurements, and the like. The integrated circuits may include chips in the form of firmware that store program instructions, digital signal processors (DSPs), chips defined as application specific integrated circuits (ASICs), and/or one or more microprocessors or microcontrollers that execute program instructions (e.g., software). The program instructions may be instructions communicated to the controller in the form of various individual settings (or program files) that define operational parameters for carrying out a particular process on or for a semiconductor wafer, or on a system. In some embodiments, the operational parameters may be part of a recipe defined by process engineers to accomplish one or more processing steps during the fabrication of one or more layers, materials, metals, oxides, silicon dioxide, surfaces, circuits, and/or dies of a wafer.

In some embodiments, the controller may be part of, or coupled to, a computer that is integrated with the system, coupled to the system, otherwise networked to the system, or a combination thereof. For example, the controller may be in the "cloud" or be all or part of a fab host computer system, which can allow remote access to wafer processing. The computer may enable remote access to the system to monitor the current progress of fabrication operations, examine a history of past fabrication operations, examine trends or performance metrics from a plurality of fabrication operations, change parameters of the current processing, set processing steps to follow the current processing, or start a new process. In some examples, a remote computer (e.g., a server) can provide process recipes to the system over a network, which may include a local network or the internet. The remote computer may include a user interface that enables entry or programming of parameters and/or settings, which are then communicated from the remote computer to the system. In some examples, the controller receives instructions in the form of data that specify parameters for each of the processing steps to be performed during one or more operations. It should be understood that these parameters may be specific to the type of process to be performed and the type of tool that the controller is configured to interface with or control. Thus, as described above, the controller may be distributed, for example, by comprising one or more discrete controllers that are networked together and work toward a common purpose (e.g., the processes and controls described herein). An example of a distributed controller for such purposes would be one or more integrated circuits on a chamber in communication with one or more remotely located integrated circuits (e.g., at the platform level or as part of a remote computer) that combine to control a process on the chamber.

Example systems may include, but are not limited to, plasma etch chambers or modules (using inductively or capacitively coupled plasma), deposition chambers or modules, spin rinse chambers or modules, metal plating chambers or modules, cleaning chambers or modules, bevel edge etch chambers or modules, Physical Vapor Deposition (PVD) chambers or modules, Chemical Vapor Deposition (CVD) chambers or modules, Atomic Layer Deposition (ALD) chambers or modules, Atomic Layer Etch (ALE) chambers or modules, ion implantation chambers or modules, track chambers or modules, and any other semiconductor processing system that may be associated with or used in the preparation and/or fabrication of semiconductor wafers.

As described above, depending on the process step or steps to be performed by the tool, the controller may communicate with one or more other tool circuits or modules, other tool components, cluster tools, other tool interfaces, adjacent tools, neighboring tools, tools located throughout the factory, a host computer, another controller, or tools used in material transport that bring containers of wafers to and from tool locations and/or load ports in a semiconductor manufacturing facility.

Other embodiments

Although the foregoing disclosed techniques, operations, processes, methods, systems, apparatuses, tools, films, chemistries, and compositions have been described in detail within the context of specific embodiments for the purpose of promoting clarity and understanding, it will be apparent to those of ordinary skill in the art that there are many alternative ways of implementing the foregoing embodiments that are within the spirit and scope of the invention. Accordingly, the embodiments described herein are to be viewed as illustrative of the disclosed inventive concepts rather than restrictive, and are not to be used as an impermissible basis for unduly limiting the scope of any claims eventually directed to the subject matter of this invention.
