

Title:
METHOD OF USING SENSOR-BASED MACHINE LEARNING TO COMPENSATE ERROR IN MASS METROLOGY
Document Type and Number:
WIPO Patent Application WO/2022/122845
Kind Code:
A1
Abstract:
Computer-implemented methods and systems of determining systematic error in a mass measurement of a substrate are disclosed. The methods involve providing sensor readings to a machine learning model as inputs and receiving as an output a value for systematic error and/or a corrected mass measurement. The machine learning model may be continuously checked during production and retrained or replaced based on data collected during production.

Inventors:
FENG YE (US)
SUBRAMANIAN PRIYADARSINI (US)
OWENS SAM (GB)
HARRISON PAUL (GB)
ALDEN EMILY ANN (US)
ELLIOTT GREGOR ROBERT (GB)
Application Number:
PCT/EP2021/084828
Publication Date:
June 16, 2022
Filing Date:
December 08, 2021
Assignee:
METRYX LTD (GB)
International Classes:
G01G9/00; G03F7/20; G06N7/00; H01L21/66
Foreign References:
US20160148850A12016-05-26
US7020577B22006-03-28
US9818658B22017-11-14
Attorney, Agent or Firm:
MEWBURN ELLIS LLP (GB)
Claims:

CLAIMS

1. A system for correcting systematic error in a mass measurement, comprising: one or more processors; and program instructions for executing on the one or more processors, the program instructions defining a machine learning model configured to: receive a mass measurement of a substrate from a mass metrology tool, receive a plurality of sensor readings associated with the mass measurement, and output a value of systematic error of the mass measurement and/or a corrected mass measurement that has been corrected for systematic error.

2. The system of claim 1, further comprising the mass metrology tool, wherein the mass metrology tool is configured to measure a weight of the substrate.

3. The system of claim 2, wherein the mass metrology tool comprises a plurality of sensors and at least one of the sensor readings of the plurality of sensor readings is received from the plurality of sensors.

4. The system of claim 3, wherein the mass metrology tool comprises a weighing chamber configured to obtain a weight measurement of a substrate and the plurality of sensors comprise a weighing chamber temperature sensor, a weighing chamber barometric pressure sensor, a weighing chamber relative humidity sensor, a loadcell horizontal position sensor, or any combination thereof.

5. The system of claim 3, wherein the mass metrology tool comprises an active thermal plate, and the plurality of sensors includes one or more of a temperature sensor for the active thermal plate and a temperature sensor for the substrate when it is on the active thermal plate.

6. The system of claim 3, wherein the mass metrology tool comprises a passive thermal plate, and the plurality of sensors includes a temperature sensor for the passive thermal plate.

7. The system of claim 3, wherein the plurality of sensors includes a tilt sensor and/or a barometric pressure sensor.

8. The system of claim 1, wherein the program instructions further comprise instructions for controlling the one or more processors to determine the mass measurement by removing a buoyant force component from a weight measurement of the substrate.

9. The system of claim 8, wherein the program instructions further comprise instructions for controlling the one or more processors to determine the buoyant force based at least in part on a calculated density of air within the mass metrology tool at the time the weight measurement is collected.

10. The system of claim 1, wherein the plurality of sensor readings includes the temperature of the substrate, the barometric pressure within the mass metrology tool, the relative humidity within the mass metrology tool, or any combination thereof.

11. The system of claim 1, wherein the plurality of sensor readings includes a mass measurement of a reference mass associated with the mass metrology tool.

12. The system of claim 1, wherein the plurality of sensor readings includes a difference between a mass measurement of a reference mass and a true mass of the reference mass.

13. The system of claim 1, wherein the plurality of sensor readings includes a difference between a mass measurement of a reference mass and the mass measurement of the substrate.

14. The system of claim 12, wherein the mass measurement of the reference mass was obtained by the mass metrology tool within the last ten mass measurements by the mass metrology tool.

15. The system of claim 1, wherein the plurality of sensor readings includes positional information regarding the substrate in the mass metrology tool.

16. The system of claim 1, wherein the plurality of sensor readings includes sensor readings from one or more components selected from the group consisting of: a front opening unified pod (FOUP), a process chamber, and a vacuum transfer module.

17. The system of claim 1, wherein the mass metrology tool is associated with one or more process chambers in an IC fabrication environment.

18. The system of claim 1, wherein the machine learning model is a random forest model, regularized linear model, a kernel ridge model, or any combination thereof.

19. The system of claim 1, wherein the program instructions further comprise instructions for controlling the one or more processors to: periodically check a performance of the machine learning model; and determine that the machine learning model is not meeting a control limit.

20. The system of claim 19, wherein the control limit is a value for a root mean squared error associated with the machine learning model.

21. The system of claim 19, wherein the program instructions further comprise instructions for controlling the one or more processors to: train one or more candidate models; compare the one or more candidate models against the machine learning model to determine which model provides the best mass measurement correction; and replace the machine learning model with a replacement machine learning model that is one of the one or more candidate models that provides the best mass measurement correction.

22. The system of claim 1, wherein the program instructions further comprise instructions for controlling the one or more processors to: train the machine learning model using a set of training wafers.

23. The system of claim 22, wherein the set of training wafers includes: one or more mass measurements for each training wafer, true mass values of each training wafer, and one or more sensor readings associated with each mass measurement.

24. A non-transitory machine-readable storage medium that includes instructions which, when executed by one or more processors, cause the one or more processors to: receive a mass measurement of a substrate from a mass metrology tool, receive a plurality of sensor readings associated with the mass measurement, and apply a machine learning model to the mass measurement and the plurality of sensor readings to generate a value of systematic error of the mass measurement and/or a corrected mass measurement that has been corrected for systematic error.

25. The non-transitory machine-readable storage medium of claim 24, wherein the mass metrology tool comprises a plurality of sensors and at least one of the sensor readings of the plurality of sensor readings is received from the plurality of sensors.

26. The non-transitory machine-readable storage medium of claim 25, wherein the mass metrology tool comprises a weighing chamber configured to obtain a weight measurement of a substrate and the plurality of sensors comprises a weighing chamber temperature sensor, a weighing chamber barometric pressure sensor, a weighing chamber relative humidity sensor, a loadcell horizontal position sensor, or any combination thereof.

27. The non-transitory machine-readable storage medium of claim 25, wherein the mass metrology tool comprises an active thermal plate, and the plurality of sensors includes one or more of a temperature sensor for the active thermal plate and a temperature sensor for the substrate when it is on the active thermal plate.

28. The non-transitory machine-readable storage medium of claim 25, wherein the mass metrology tool comprises a passive thermal plate, and the plurality of sensors includes a temperature sensor for the passive thermal plate.

29. The non-transitory machine-readable storage medium of claim 24, wherein the machine learning model is a random forest model, regularized linear model, a kernel ridge model, or any combination thereof.

30. The non-transitory machine-readable storage medium of claim 24, wherein the machine learning model is a random forest model, regularized linear model, a kernel ridge model, or any combination thereof.

31. The non-transitory machine-readable storage medium of claim 24, comprising further instructions for causing the one or more processors to: periodically check a performance of the machine learning model; and determine that the machine learning model is not meeting a control limit.

Description:
METHOD OF USING SENSOR-BASED MACHINE LEARNING TO COMPENSATE ERROR IN MASS METROLOGY

BACKGROUND

[0001] Mass metrology systems can be used to measure the mass of a semiconductor substrate to a sub-milligram accuracy. Driven by ever smaller technology nodes, semiconductor device fabrication systems constantly advance. As the critical dimensions of semiconductor devices continue to shrink, the measurement error of mass metrology systems must also shrink. Various techniques may be used to increase the accuracy and precision of mass metrology systems.

[0002] Background and contextual descriptions contained herein are provided solely for the purpose of generally presenting the context of the disclosure. Much of this disclosure presents work of the inventors, and simply because such work is described in the background section or presented as context elsewhere herein does not mean that it is admitted to be prior art.

SUMMARY

[0003] Disclosed herein are systems and methods for correcting errors in a mass measurement of a substrate. In one aspect of the embodiments herein, a system for correcting systematic error in a mass measurement is provided, the system including: one or more processors; and program instructions for executing on the one or more processors, the program instructions defining: a machine learning model configured to: receive a mass measurement of a substrate from a mass metrology tool, receive a plurality of sensor readings associated with the mass measurement, and output a value of systematic error of the mass measurement and/or a corrected mass measurement that has been corrected for systematic error.

[0004] In some embodiments, the system further includes the mass metrology tool, wherein the mass metrology tool is configured to measure a weight of the substrate. In some embodiments, the mass metrology tool includes a plurality of sensors and at least one of the sensor readings of the plurality of sensor readings is received from the plurality of sensors. In some embodiments, the mass metrology tool includes a weighing chamber configured to obtain a weight measurement of a substrate and the plurality of sensors include a weighing chamber temperature sensor, a weighing chamber barometric pressure sensor, a weighing chamber relative humidity sensor, a loadcell horizontal position sensor, or any combination thereof. In some embodiments, the mass metrology tool includes an active thermal plate, and the plurality of sensors includes one or more of a temperature sensor for the active thermal plate and a temperature sensor for the substrate when it is on the active thermal plate. In some embodiments, the mass metrology tool includes a passive thermal plate, and the plurality of sensors includes a temperature sensor for the passive thermal plate.

[0005] In some embodiments, the plurality of sensors includes a tilt sensor and/or a barometric pressure sensor. In some embodiments, the program instructions further include instructions for controlling the one or more processors to determine the mass measurement by removing a buoyant force component from a weight measurement of the substrate. In some embodiments, the program instructions further include instructions for controlling the one or more processors to determine the buoyant force based at least in part on a calculated density of air within the mass metrology tool at the time the weight measurement is collected. In some embodiments, the plurality of sensor readings includes the temperature of the substrate, the barometric pressure within the mass metrology tool, the relative humidity within the mass metrology tool, or any combination thereof.

[0006] In some embodiments, the plurality of sensor readings includes a mass measurement of a reference mass associated with the mass metrology tool. In some embodiments, the plurality of sensor readings includes a difference between a mass measurement of a reference mass and a true mass of the reference mass. In some embodiments, the plurality of sensor readings includes a difference between a mass measurement of a reference mass and the mass measurement of the substrate. In some embodiments, the mass measurement of the reference mass was obtained by the mass metrology tool within the last ten mass measurements by the mass metrology tool. In some embodiments, the plurality of sensor readings includes positional information regarding the substrate in the mass metrology tool. In some embodiments, the plurality of sensor readings includes sensor readings from one or more components selected from the group consisting of: a front opening unified pod (FOUP), a process chamber, and a vacuum transfer module. In some embodiments, the mass metrology tool is associated with one or more process chambers in an IC fabrication environment.

[0007] In some embodiments, the machine learning model is a random forest model, regularized linear model, a kernel ridge model, or any combination thereof. In some embodiments, the program instructions further include instructions for controlling the one or more processors to: periodically check a performance of the machine learning model; and determine that the machine learning model is not meeting a control limit. In some embodiments, the control limit is a value for a root mean squared error associated with the machine learning model. In some embodiments, the program instructions further include instructions for controlling the one or more processors to: train one or more candidate models; compare the one or more candidate models against the machine learning model to determine which model provides the best mass measurement correction; and replace the machine learning model with a replacement machine learning model that is one of the one or more candidate models that provides the best mass measurement correction. In some embodiments, the program instructions further include instructions for controlling the one or more processors to: train the machine learning model using a set of training wafers. In some embodiments, the set of training wafers includes: one or more mass measurements for each training wafer, true mass values of each training wafer, and one or more sensor readings associated with each mass measurement.
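The monitoring-and-replacement flow summarized above can be outlined in code. The sketch below is purely illustrative and not the disclosed implementation: `select_model`, the callable models, and the validation data are hypothetical names, and models are represented as plain callables mapping an input to a prediction.

```python
import math

def rmse(preds, truths):
    """Root mean squared error between predictions and reference values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truths)) / len(preds))

def select_model(current, candidates, X_val, y_val, control_limit):
    """Keep the current model while its validation RMSE meets the control
    limit; otherwise return the candidate with the lowest validation RMSE."""
    def score(model):
        return rmse([model(x) for x in X_val], y_val)
    if score(current) <= control_limit:
        return current
    return min(candidates, key=score)
```

In this sketch the control limit is an RMSE threshold, matching the summary above; a production system would also log the scores and retrain candidates on data collected during production.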

[0008] In another aspect of the embodiments herein, a non-transitory machine-readable storage medium is disclosed, the non-transitory machine-readable storage medium including instructions which, when executed by one or more processors, cause the one or more processors to: receive a mass measurement of a substrate from a mass metrology tool, receive a plurality of sensor readings associated with the mass measurement, and apply a machine learning model to the mass measurement and the plurality of sensor readings to generate a value of systematic error of the mass measurement and/or a corrected mass measurement that has been corrected for systematic error.

[0009] In some embodiments, the mass metrology tool includes a plurality of sensors and at least one of the sensor readings of the plurality of sensor readings is received from the plurality of sensors. In some embodiments, the mass metrology tool includes a weighing chamber configured to obtain a weight measurement of a substrate and the plurality of sensors includes a weighing chamber temperature sensor, a weighing chamber barometric pressure sensor, a weighing chamber relative humidity sensor, a loadcell horizontal position sensor, or any combination thereof.

[0010] In some embodiments, the mass metrology tool includes an active thermal plate, and the plurality of sensors includes one or more of a temperature sensor for the active thermal plate and a temperature sensor for the substrate when it is on the active thermal plate. In some embodiments, the mass metrology tool includes a passive thermal plate, and the plurality of sensors includes a temperature sensor for the passive thermal plate. In some embodiments, the machine learning model is a random forest model, regularized linear model, a kernel ridge model, or any combination thereof. In some embodiments, the storage medium includes further instructions for causing the one or more processors to: periodically check a performance of the machine learning model; and determine that the machine learning model is not meeting a control limit.

[0011] These and other features of the disclosed embodiments will be described in detail below with reference to the associated drawings.

BRIEF DESCRIPTION OF DRAWINGS

[0012] Figures 1-3 present schematic diagrams of example mass metrology tools used in accordance with disclosed embodiments.

[0013] Figure 4 presents a rear view of an example metrology tool and various sensors that may be associated with a metrology tool.

[0014] Figure 5 presents a process flow diagram for determining mass correction values.

[0015] Figure 6 presents a process flow diagram for training a machine learning model.

[0016] Figures 7 and 8 provide process flow diagrams for maintenance of a machine learning model.

[0017] Figure 9 presents an example computer system that may be employed to implement certain embodiments described herein.

DETAILED DESCRIPTION

Terminology

[0018] The following terms are used throughout the instant specification:

[0019] The terms “semiconductor wafer,” “wafer,” “substrate,” “wafer substrate” and “partially fabricated integrated circuit” may be used interchangeably. Those of ordinary skill in the art understand that the term “partially fabricated integrated circuit” can refer to a semiconductor wafer during any of many stages of integrated circuit fabrication thereon. A wafer or substrate used in the semiconductor device industry typically has a diameter of 200 mm, or 300 mm, or 450 mm. This detailed description assumes the embodiments are implemented on a wafer. However, the disclosure is not so limited. The work piece may be of various shapes, sizes, and materials. Besides semiconductor wafers, other work pieces that may take advantage of the disclosed embodiments include various articles such as printed circuit boards, magnetic recording media, magnetic recording sensors, mirrors, optical elements, micro-mechanical devices and the like.

[0020] Environmental parameters - parameters that describe conditions in a metrology tool. In the context of a mass metrology tool, environmental parameters may include parameters that affect the air density or the buoyancy of the wafer as measured. Examples of parameters affecting buoyancy include air temperature, air density, air humidity, air pressure, wafer temperature, system temperature, and wafer position.
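The parameters above feed a buoyancy correction of the kind this disclosure corrects for. The following is a minimal illustrative sketch, not the disclosed implementation: it uses the widely used simplified CIPM-style approximation for moist-air density, and the wafer volume passed in is an assumed value rather than anything specified here.

```python
import math

def air_density(temp_c, pressure_hpa, rel_humidity_pct):
    """Approximate moist-air density (kg/m^3) via the simplified CIPM-style
    formula; valid near ambient laboratory conditions."""
    return (0.34848 * pressure_hpa
            - 0.009 * rel_humidity_pct * math.exp(0.061 * temp_c)) / (273.15 + temp_c)

def buoyancy_corrected_mass(measured_mass_kg, wafer_volume_m3,
                            temp_c, pressure_hpa, rel_humidity_pct):
    """Add back the buoyant-force component (rho_air * V) that the balance
    reading is missing, yielding a buoyancy-corrected mass."""
    rho_air = air_density(temp_c, pressure_hpa, rel_humidity_pct)
    return measured_mass_kg + rho_air * wafer_volume_m3
```

At 20 °C, 1013.25 hPa, and 50% relative humidity this gives an air density of roughly 1.2 kg/m³, so for a 300 mm wafer (volume on the order of 5 × 10⁻⁵ m³) the buoyancy term is tens of milligrams, which is large relative to sub-milligram measurement targets.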

[0021] Wafer position parameters - position of a wafer in a load cell or other wafer handling or supporting portion of a mass metrology tool. Such position may be assessed using optical methods.

[0022] Production wafer - a wafer containing fully or partially fabricated integrated circuits. A production wafer may be undergoing one or more fabrication processes such as lithography, an additive process such as deposition, or a subtractive process such as polishing or etching. Production wafers are used to produce commercial integrated circuits.

[0023] Training wafer - a wafer used to train a machine learning model. A set of training wafers may be used to create a set of data for training, validation, and testing of a machine learning model. Each training wafer may be used to obtain one or more input/output pairs of data for supervised learning by obtaining mass measurements along with various sensor readings. In some embodiments, different training wafers may be measured at different temperatures and/or each training wafer may be measured at multiple temperatures to create a set of data for training a machine learning model. Training wafers may experience large induced temperature swings to facilitate measurements that include mass error due to temperature differences.
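One plausible way to assemble supervised input/output pairs from training-wafer measurements, consistent with the description above, is to use the systematic error (measured mass minus true mass) as the learning target. The record layout below is an assumption for illustration only.

```python
def build_training_pairs(records):
    """Build supervised (input, target) pairs from training-wafer records.
    Each record is assumed to be (measured_mass, true_mass, sensor_readings);
    the learning target is the systematic error = measured - true."""
    X = [sensors for _, _, sensors in records]
    y = [measured - true for measured, true, _ in records]
    return X, y
```

A model trained on such pairs outputs an error estimate that can simply be subtracted from a raw mass reading during production.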

[0024] Machine learning model - A machine learning model is a trained computational model that takes a metrology reading and data from various sensors and outputs a corrected metrology reading. In some embodiments, the metrology reading is a mass measurement. Examples of machine learning models include random forests models, including deep random forests, neural networks, including recurrent neural networks and convolutional neural networks, restricted Boltzmann machines, recurrent tensor networks, and gradient boosted trees.
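One of the simpler model families named in this disclosure is a regularized linear model. The sketch below is purely illustrative and not the disclosed implementation: a two-feature ridge regression fitted in pure Python on synthetic sensor/error pairs that stand in for real training-wafer data.

```python
def ridge_fit(X, y, lam=1e-3):
    """Fit a 2-feature ridge regression (regularized linear model) by
    solving the 2x2 normal equations (X^T X + lam*I) w = X^T y."""
    a = sum(x[0] * x[0] for x in X) + lam
    b = sum(x[0] * x[1] for x in X)
    d = sum(x[1] * x[1] for x in X) + lam
    r0 = sum(x[0] * t for x, t in zip(X, y))
    r1 = sum(x[1] * t for x, t in zip(X, y))
    det = a * d - b * b
    return ((d * r0 - b * r1) / det, (a * r1 - b * r0) / det)

def predict(w, x):
    """Predicted systematic error for one pair of sensor readings."""
    return w[0] * x[0] + w[1] * x[1]

# Synthetic (sensor_1, sensor_2) -> systematic-error pairs generated from
# true weights (0.5, -0.2); stand-ins for real measurements.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [0.5, -0.2, 0.3, 0.8]
w = ridge_fit(X, y)
```

With a small regularization weight the fitted coefficients recover the generating weights (0.5, -0.2) of the synthetic data to within about 1%; a production system would more likely use a library implementation and many more features.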

[0025] In the art, some machine learning models are characterized as “deep learning” models. Unless otherwise specified, any reference to “machine learning” herein includes deep learning embodiments. A deep learning model may be implemented in various forms such as by a neural network (e.g., a convolutional neural network), etc. In general, though not necessarily, it includes multiple layers. Each such layer includes multiple processing nodes, and the layers process in sequence, with nodes of layers closer to the model input layer processing before nodes of layers closer to the model output. In various embodiments, one layer feeds into the next, etc. The output layer may include nodes that represent a corrected mass measurement or an error of the mass measurement. In some implementations, a deep learning model is a model that takes data with very little preprocessing and outputs a correction for a mass measurement.
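The layer-by-layer processing just described can be illustrated with a minimal pure-Python forward pass. The weights, activation choice (tanh on hidden layers, linear output), and layout below are illustrative assumptions, not the disclosed architecture.

```python
import math

def forward(layers, x):
    """Feed an input vector through successive layers, where each layer is a
    (weights, biases) pair. Hidden layers apply a tanh nonlinearity; the
    final layer is linear and yields the corrected-mass / error output."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:  # no nonlinearity on the output layer
            x = [math.tanh(v) for v in x]
    return x
```

Training would adjust the weights and biases in `layers`; as noted below, the nodes and connections can be retrained without redesigning this structure.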

[0026] In various embodiments, a deep learning model has significant depth and can receive as inputs a wide array of data. In some embodiments, the model has more than two (or more than three or more than four or more than five) layers of processing nodes that receive values from preceding layers (or as direct inputs) and that output values to succeeding layers (or the final output). Interior nodes are often “hidden” in the sense that their input and output values are not visible outside the model. In various embodiments, the operation of the hidden nodes is not monitored or recorded during operation.

[0027] The nodes and connections of a deep learning model can be trained and retrained without redesigning their number, arrangement, interface with image inputs, etc. and yet provide a correction for a mass measurement.

[0028] As indicated, in various implementations, the node layers may collectively form a neural network, although many deep learning models have other structures and formats. Some embodiments of deep learning models do not have a layered structure, in which case the above characterization of “deep” as having many layers is not relevant.

[0029] In some embodiments, as explained elsewhere in the disclosure, a machine learning model may include multiple components: e.g., it may be an ensemble model or a combination model.

[0030] A machine learning model is separate from an engine that does the training of a model from a training data set. The training is performed by a different algorithm that, in the terminology of this patent application, is not a machine learning model. The machine learning model is the result of the training operation. It is a trained model that can take one or more inputs and provide a desired output during production.

Mass Metrology Tools and Their Operation

[0031] Various embodiments described herein relate to training, use, and maintenance of a machine learning model to assess errors such as systematic errors and/or to provide error corrections for a metrology tool, e.g., a mass metrology tool. U.S. Pat. No. 7,020,577, titled APPARATUS AND METHOD FOR INVESTIGATING SEMICONDUCTOR WAFER, issued March 28, 2006, and U.S. Pat. No. 9,818,658, titled SEMICONDUCTOR WAFER PROCESSING METHODS AND APPARATUS, issued November 14, 2017, are incorporated herein by reference in their entireties. These patents describe example mass metrology tools and methods of correcting a weight measurement to remove effects such as buoyancy effects.

[0032] Figures 1-3 provide different illustrations of example mass metrology tools. Figure 1 shows a weighing apparatus according to one embodiment of a mass metrology tool. The weighing apparatus comprises a weighing balance 1 having a weighing pan 3 for receiving a semiconductor wafer. The weighing balance 1 is configured to provide measurement outputs indicative of the weight of a semiconductor wafer loaded on the weighing pan 3.

[0033] The weighing balance 1 is located within a weighing chamber 5, which forms an enclosed environment around the weighing balance 1, e.g. to maintain a substantially uniform air density, air pressure and air temperature of the air around the weighing balance and to prevent draughts and provide electromagnetic shielding. The weighing chamber 5 has an opening (not shown), e.g. a suitably sized slot in a side-wall of the weighing chamber 5, to allow a semiconductor wafer to be transported into the weighing chamber 5, e.g. by a robotic arm, and positioned on the weighing pan 3. When not in use, the opening may be covered by an openable door or covering (not shown) to allow the weighing chamber 5 to be substantially closed or sealed when performing measurements using the weighing balance 1.

[0034] In some embodiments one or more humidity sensors 50 and/or temperature sensors 52 may be located within the weighing chamber 5. In some embodiments there may be two, three, or four temperature sensors in the weighing chamber 5. In some embodiments one or more of the temperature sensors in the weighing chamber 5 may be included in the base of the weighing chamber. The output of the temperature sensors 52 and the humidity sensors 50 may be used as inputs to a machine learning model as described herein to determine a mass correction for a weight measurement of a wafer.

[0035] A thermal transfer plate 7 (a “second temperature changing unit,” a “passive thermal plate,” or a PTP) is positioned on top of the weighing chamber 5. The thermal transfer plate 7 comprises a block of material having a good thermal conductivity (for example Al). The thermal transfer plate may also have a high thermal mass, so that its temperature changes slowly and little when it is supplied with heat, and a good lateral thermal conductivity, so that it maintains a substantially uniform temperature across its upper surface. In this embodiment, the thermal transfer plate 7 is made from aluminum, but in other embodiments any other material with a good thermal conductivity may be used. In some embodiments a PTP temperature sensor 54 may be coupled to and measure the temperature of the thermal transfer plate 7.

[0036] The thermal transfer plate 7 is positioned directly on top of the weighing chamber 5, so that there is thermal contact between the thermal transfer plate 7 and the weighing chamber 5. The thermal transfer plate 7 is in direct physical contact with the weighing chamber 5. The thermal transfer plate 7 may be attached or fixed to the weighing chamber 5, for example using one or more bolts (not shown) and/or a thermally conductive bonding layer (not shown).

[0037] As a result of the good thermal contact between the thermal transfer plate 7 and the weighing chamber 5, the thermal transfer plate 7 may be substantially in thermal equilibrium with the weighing chamber 5 and therefore may have substantially the same temperature as the weighing chamber 5. The weighing balance 1 may also be in thermal equilibrium with the weighing chamber 5 and therefore may also have substantially the same temperature as the weighing chamber 5. As such, the thermal transfer plate 7 may be substantially in thermal equilibrium with the weighing balance 1 and therefore may have substantially the same temperature as the weighing balance 1.

[0038] The weighing balance 1 and the weighing pan 3 may be considered as comprising a measurement area of the weighing apparatus. Alternatively, the weighing chamber 5 may be considered as comprising a measurement area of the weighing apparatus.

[0039] The weighing apparatus of Figure 1 further comprises a further thermal transfer plate 9 (a “first temperature changing unit,” an “active thermal plate,” or ATP). In the embodiment of Figure 1, a plurality of Peltier devices 11 are attached to a bottom side of the thermal transfer plate 9. Each Peltier device 11 has a heat sink 13 attached to the bottom side thereof. An air flow 15 can be provided in a region 17 beneath the bottom side of the thermal transfer plate 9 in order to remove heat from the Peltier devices 11 and from the heat-sinks 13. Of course, the configuration of the air flow may be different to that shown in Figure 1; for example, air may be blown out of the bottom of the region 17 by a fan. In some embodiments an ATP temperature sensor 56 may be coupled to and measure the temperature of the thermal transfer plate 9. In addition to the ATP temperature sensor 56, there may be a separate temperature sensor (not shown) which detects the temperature of a wafer placed on the ATP, for example an IR temperature sensor.

[0040] In Figure 1 the thermal transfer plate 9 is shown as being positioned to the right-hand side of the weighing chamber 5. However, in other embodiments the thermal transfer plate 9 can be positioned differently, for example to a different side, above or below the weighing chamber 5, or closer or further away from the weighing chamber 5 than illustrated in Figure 1. In other embodiments, the thermal transfer plate 9 may be attached or connected, directly or indirectly, to the thermal transfer plate 7.

[0041] In use, a wafer transporter, for example an end effector of a robotic arm of an Equipment Front End Module (EFEM), is used to remove a semiconductor wafer from a Front Opening Unified Pod (FOUP) (not shown), or alternatively from another processing apparatus (not shown), and to transport the semiconductor wafer to the thermal transfer plate 9 and position the semiconductor wafer on the thermal transfer plate 9. When the semiconductor wafer is removed from the FOUP (or the other processing apparatus) it may have a temperature of approximately 70° C. For example, the semiconductor wafer may have been processed at a processing station of a semiconductor device production line, which may have heated the semiconductor wafer to a temperature of 400 to 500° C., before the semiconductor wafer was loaded into the FOUP.

[0042] When the semiconductor wafer is positioned on the thermal transfer plate 9, heat is conducted from the semiconductor wafer to the thermal transfer plate 9 so that the temperature of the semiconductor wafer is decreased. Depending on how long the semiconductor wafer is positioned on the thermal transfer plate 9, the semiconductor wafer and the thermal transfer plate 9 may achieve thermal equilibrium (e.g. so that they have substantially the same temperature). Transfer of heat from the semiconductor wafer to the thermal transfer plate 9 would act to increase the temperature of the thermal transfer plate 9. In that case, the thermal equilibrium temperature of the semiconductor wafer and the thermal transfer plate 9 may be different to a desired temperature of the semiconductor wafer. In order to prevent the temperature of the thermal transfer plate 9 from increasing due to the heat load from the semiconductor wafer, the thermal transfer plate 9 is operable to actively dissipate the heat load removed from the semiconductor wafer. In particular, the Peltier devices 11 are operated to actively remove heat from the thermal transfer plate 9. In other words, electrical power is supplied to the Peltier devices 11 to cause them to act as active heat pumps that transfer heat from their upper surfaces in contact with the thermal transfer plate 9 to their lower surfaces to which the heat-sinks 13 are attached.

[0043] In some embodiments an air-flow 15 is provided in the region 17 beneath the thermal transfer plate 9 in which the Peltier devices 11 and the heat-sinks 13 are positioned in order to remove heat from the Peltier devices 11 and the heat-sinks 13. The heat removed from the semiconductor wafer using the thermal transfer plate 9 is therefore transported and dissipated away from the weighing chamber 5 of the weighing apparatus by the air-flow 15, so that this heat has no effect on the temperature of the weighing apparatus. The air-flow 15 may be generated by one or more fans, for example positioned in, or at the edges of, the region 17. In other words, heat is actively dissipated from the thermal transfer plate 9.

[0044] As mentioned above, actively dissipating heat from the thermal transfer plate 9 will prevent heat from building up in the thermal transfer plate 9, which would cause an increase in the temperature of the thermal transfer plate 9. In this embodiment, the heat removed from the semiconductor wafer is effectively/efficiently disposed of by being dissipated by the thermal transfer plate 9. This may enable the temperature of the semiconductor wafer to be more precisely/accurately controlled using the thermal transfer plate 9.

[0045] In some embodiments, the thermal transfer plate 9 may be cooled to a temperature below the desired temperature of the semiconductor wafer, so that the semiconductor wafer is cooled more rapidly. In this case, a pyrometer or other temperature sensor may be used to monitor the temperature of the semiconductor wafer and it may be removed from the thermal transfer plate 9 when it reaches an appropriate temperature. This may increase the speed with which the semiconductor wafer can be cooled by the thermal transfer plate 9. Alternatively, the thermal transfer plate 9 may be held at a first temperature below the desired temperature of the semiconductor wafer for a period of time, in order to rapidly cool the semiconductor wafer, and then held at the desired temperature after that time (so that it is not necessary to precisely monitor the temperature of the semiconductor wafer and remove it at the appropriate time).

[0046] The thermal transfer plate 9 is operated to remove a bulk of a heat load from the semiconductor wafer, so that the temperature of the semiconductor wafer is reduced to close to the desired temperature of the semiconductor wafer when it is positioned on the weighing pan 3. The thermal transfer plate 9 may remove over 90%, or over 95%, of the heat that needs to be removed to reduce the temperature of the semiconductor wafer to the desired temperature. Put another way, the thermal transfer plate 9 may cause over 90%, or over 95%, of the temperature change required to decrease the temperature of the semiconductor wafer from its initial temperature to the desired temperature when it is positioned on the weighing pan 3.

[0047] In this embodiment, it is desired to substantially match the temperature of the semiconductor wafer to the temperature of the weighing chamber 5, so that there is substantially no temperature difference between the semiconductor wafer and the weighing chamber 5 (and therefore substantially no temperature difference between the semiconductor wafer and the weighing balance 1) when the semiconductor wafer is loaded on the weighing pan 3. In this embodiment, the thermal transfer plate 9 may cool the semiconductor wafer to within ±1° C. of the temperature of the weighing chamber 5. For example, where the weighing chamber has a temperature of 20° C., the thermal transfer plate 9 may cool the semiconductor wafer to a temperature of (20±1)° C. However, in other embodiments the amount of cooling provided by the thermal transfer plate 9 may be different to this, provided that as a minimum the thermal transfer plate 9 provides over 50% of the required temperature change of the semiconductor wafer, and preferably over 80%. In some embodiments the thermal transfer plate 9 may be used to adjust the temperature of a wafer to a temperature that is within about 1° C., about 0.5° C., or about 0.2-0.3° C. of the temperature of the weighing chamber 5.

[0048] Once the semiconductor wafer has been cooled to a temperature close to the desired temperature using the thermal transfer plate 9, it is transported to the thermal transfer plate 7 using a wafer transporter. In some embodiments a different wafer transporter is used to transport the semiconductor wafer to the thermal transfer plate 7 than was used to transport the semiconductor wafer to the thermal transfer plate 9. In this embodiment, two different end effectors of a robotic arm of an EFEM are used to perform the two different transportation steps. The end effector that transports the semiconductor wafer to the thermal transfer plate 9 may be heated by the semiconductor wafer. If the same end effector is used to transport the cooled semiconductor wafer from the thermal transfer plate 9 to the thermal transfer plate 7 it may transfer heat back to the semiconductor wafer, thereby changing its temperature. This problem may be avoided by using a different end effector for the second transportation step.

[0049] The end effector(s) may be configured so that there is a minimal or reduced thermal contact area between the end effector(s) and the semiconductor wafer, in order to minimize heat transfer between the end effector(s) and the semiconductor wafer. For example, the end effector(s) may contact the semiconductor wafer solely at the edge of the semiconductor wafer. Alternatively, or in addition, the end effector(s) may be made out of a material(s) with a poor thermal conductivity, i.e. a thermal insulator, to minimize heat transfer between the end effector(s) and the semiconductor wafer.

[0050] As discussed above, when the semiconductor wafer is positioned on the thermal transfer plate 7 there is good thermal contact between the semiconductor wafer and the thermal transfer plate 7. Therefore, the semiconductor wafer is cooled by heat being conducted from the semiconductor wafer to the thermal transfer plate 7. Depending on the length of time that the semiconductor wafer is positioned on the thermal transfer plate 7, the semiconductor wafer and the thermal transfer plate 7 may become substantially in thermal equilibrium, so that they have substantially the same temperature (i.e. the temperature of the semiconductor wafer is matched or equalised to the temperature of the thermal transfer plate 7 and therefore to the temperature of the weighing chamber 5). In this embodiment, the semiconductor wafer may be positioned on the thermal transfer plate 7 for a period of up to 60 seconds.

[0051] The semiconductor wafer has already had the bulk of its heat load removed by the thermal transfer plate 9 before it is positioned on the thermal transfer plate 7. Therefore, the thermal load on the thermal transfer plate 7 during the temperature equalization is very low, and the temperature of the thermal transfer plate 7 and the weighing chamber 5 (which have a high thermal mass) therefore remains substantially constant during the temperature equalization. In addition, relatively little heat has to be exchanged to bring the semiconductor wafer into thermal equilibrium with the thermal transfer plate 7.

[0052] Therefore, with the present embodiment it may be possible to more accurately/precisely equalize the temperature of the semiconductor wafer to the desired temperature, because the steps of removing the bulk of the heat load from the semiconductor wafer and equalizing the temperature of the semiconductor wafer have been separated. For example, with the present embodiment it may be possible to match the temperature of the semiconductor wafer to the temperature of the weighing chamber 5 to an accuracy of less than 0.1° C., or to an accuracy of less than 0.01° C., or even to an accuracy of the order of 0.001° C.

[0053] When the temperature of the semiconductor wafer is substantially equalized to the temperature of the weighing chamber 5 (e.g. when the semiconductor wafer has been on the thermal transfer plate 7 for a predetermined period of time) the semiconductor wafer is transported by a wafer transporter from the thermal transfer plate 7 to the weighing pan 3. The weighing balance 1 is then used to provide measurement output indicative of the weight of the semiconductor wafer. Because the temperature of the semiconductor wafer has been substantially matched to the temperature of the weighing chamber, and without significantly changing the temperature of the weighing chamber (as the heat load on the weighing chamber is very small), any temperature errors in the measurement output may be substantially minimal. For example, there may be no significant convection currents generated in the weighing chamber 5, no significant changes in the buoyancy force on the semiconductor wafer (which would be caused by heating of the air in the weighing chamber 5), and no significant temperature changes (e.g. temperature increase or temperature non-uniformity) in the weighing balance 1 due to the presence of the semiconductor wafer on the weighing pan 3. However, as described herein, further correction may be performed to further reduce errors in the measurement output. The weight readings are continually monitored until the reading has settled, usually within 15 to 45 seconds.

[0054] Similarly to above, a different end effector may be used to transport the semiconductor wafer from the thermal transfer plate 7 to the weighing pan 3.

[0055] Figure 2 shows a weighing apparatus according to a second embodiment of the present invention. Similar or corresponding features to those present in Figure 1 are indicated using the same reference numerals as used in Figure 1, and description of those features is not repeated.

[0056] The primary difference between embodiments of Figure 1 and Figure 2 is the positioning of the thermal transfer plate 9. In the second embodiment, the thermal transfer plate 9 is stacked above the thermal transfer plate 7. A thermal gap 19, for example an air gap or a layer of insulating material, is positioned between the thermal transfer plate 7 and the thermal transfer plate 9, so that the thermal transfer plates 7, 9 are substantially thermally insulated from each other so that substantially no heat can pass between the thermal transfer plates 7, 9.

[0057] Floor space is often limited in semiconductor device production facilities. Therefore, it may be advantageous to stack the thermal transfer plate 9 above the thermal transfer plate 7, i.e. similarly to the arrangement shown in FIG. 2, in order to save floor space. In this arrangement, the semiconductor wafer is transported vertically between the thermal transfer plates 7, 9 and the weighing pan 3.

[0058] Figure 3 shows a weighing apparatus according to a third embodiment of the present invention including a wafer transportation system. Similar or corresponding features to those present in Figure 1 are indicated using the same reference numerals as used in Figure 1, and description of those features is not repeated. As shown in Figure 3, the weighing apparatus and wafer transportation system are enclosed in an EFEM enclosure 20, within which a clean room environment is maintained/generated. As shown in Figure 3, the wafer transportation system comprises one or more end effectors 21 of a robotic arm, for example a robotic arm of an EFEM. The end effector 21 is used to hold or support a semiconductor wafer 23 and to transport the semiconductor wafer 23 between the different parts of the weighing apparatus, e.g. between the thermal transfer plate 9, the thermal transfer plate 7 and the weighing pan 3.

[0059] As shown in Figure 3, when the end effector 21 transports the semiconductor wafer 23 it transports the semiconductor wafer 23 vertically through some or all of a region 25, which in this embodiment is a vertical region, e.g. a vertical column of air or gas. A plurality of fans 27 are arranged to blow air (or gas) into or through the region 25 in order to generate an air flow 29 through the region 25.

[0060] The semiconductor wafer may be positioned on the thermal transfer plates 7, 9 or on the weighing pan 3 by passing the semiconductor wafer through a lid, opening or slot in an enclosure or chamber (for example the weighing chamber 5). There may be a cover, door or similar covering the lid, opening or slot, which may be opened and closed where necessary.

[0061] In this embodiment, the wafer transportation system is part of an EFEM and the air flow 29 through the region 25 is an air flow of the EFEM that is used to maintain clean-room conditions within the EFEM enclosure 20 (including in the weighing chamber 5). A filter 22 is included downstream of the plurality of fans 27 in order to filter the air flow 29 so as to maintain clean-room conditions.

[0062] Heaters 31 are positioned downstream of the fans 27 (i.e. below the fans 27 in the arrangement illustrated in Figure 3). Each of the heaters 31 has a heat sink 32 attached to an upper surface (an upstream surface) thereof, for facilitating transfer of heat from the heaters 31 to the air flow 29 generated by the fans 27. The heaters 31 are arranged so that the air flow 29 passes over or around or through the heat sinks 32.

[0063] The heaters 31 can be operated to heat the air flow 29, so as to heat the air in the region 25 through which the semiconductor wafer 23 is transported.

[0064] The specific number of heaters 31, heat sinks 32 and fans 27 can be varied from the numbers shown in Figure 3. For example, it is not essential to have the same number of fans 27 as heaters 31, or for each heater 31 to have a heat sink 32, or for there to be more than one heater 31, heat sink 32 or fan 27.

[0065] A temperature sensor 33 is positioned in the region 25. In this embodiment, the temperature sensor 33 is positioned part way between the thermal transfer plate 7 and the weighing pan 3 in order to measure the temperature at that location. In other embodiments, the temperature sensor may be located elsewhere in the region 25, and/or there may be more than one temperature sensor in the region 25. For example, there may be one temperature sensor for each heater 31 and/or fan 27.

[0066] A controller (not shown) is in communication with the temperature sensor 33 and also in communication with the plurality of heaters 31, and is arranged to control the operation (e.g. the temperature) of the plurality of heaters 31, so as to control the temperature of the air flow 29 based on the output of the temperature sensor 33.

[0067] One or more heaters (not shown) may also be provided to heat the weighing chamber 5 to a temperature above ambient temperature, e.g. by a few ° C. (for example by less than 4° C.). The heaters may control the temperature of the weighing chamber 5 so that it is at a constant or uniform temperature.

[0068] An advantage of heating the weighing chamber 5 is that the temperature of the air flow 29 in the region 25 can then easily/simply be controlled to be the same as the temperature of the weighing chamber 5 (or to have a fixed relationship to the temperature of the weighing chamber 5) by heating the air flow 29 with the heaters 31 so that it is the same temperature as the temperature of the weighing chamber 5.

[0069] In use, the fans 27 are operated to provide the air flow 29, in order to generate clean room conditions. The temperature sensor 33 is operated to detect the temperature of the air flow 29 at the location of the temperature sensor 33, e.g. proximal to the thermal transfer plate 7 and the weighing pan 3. If the temperature detected by the temperature sensor 33 is less than the desired temperature of the semiconductor wafer 23 when it is loaded on the weighing pan 3, e.g. less than the temperature of the weighing chamber 5, then the controller controls the heaters 31 to generate heat (or to generate more heat) so that the temperature of the air flow 29 is increased until the temperature detected by the temperature sensor 33 is substantially equal to the desired temperature. The temperature of the air flow 29 at the location of the temperature sensor 33 may be controlled to be substantially equal to the temperature of the weighing chamber 5.
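The control behaviour described in paragraph [0069] amounts to a feedback loop: heat the air flow 29 until the temperature detected by the temperature sensor 33 matches the desired temperature. As a minimal sketch (not the claimed implementation), a proportional controller could map the error between the sensed air temperature and the target temperature to a heater power command; the gain and power limits below are hypothetical placeholders.

```python
def heater_power(measured_temp_c, target_temp_c, gain=0.5, max_power=1.0):
    """Proportional control sketch for the air-flow heaters.

    Returns a heater power command in [0, max_power]. If the air at the
    sensor location is already at or above the target temperature, the
    heaters are commanded off.
    """
    error = target_temp_c - measured_temp_c  # positive => air too cold
    power = gain * error
    return min(max(power, 0.0), max_power)
```

A practical controller would likely add integral action and rate limiting to remove steady-state error and avoid thermal overshoot near the weighing chamber.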

[0070] Figure 4 presents a rear view of an example metrology tool, which may have features in common with the metrology tools illustrated in Figures 1-3. Sensors 405 represent various sensors that may be part of a metrology tool according to various embodiments herein. For example, there may be sensors within the weighing chamber, associated with a passive thermal plate, and/or associated with an active thermal plate. In some embodiments a tilt sensor may be associated with the module to detect movement of the metrology tool caused by environmental effects external to the metrology tool, e.g., earthquakes or strong gusts of wind against a building housing the metrology tool. A metrology tool may have an ATP temperature sensor to measure a temperature of the ATP and a wafer temperature sensor at the ATP to measure the temperature of the wafer at the ATP. A metrology tool may have a PTP temperature sensor to measure the temperature of the PTP. A metrology tool may have an enclosure temperature sensor to measure the temperature of the enclosure where weighing occurs. In some embodiments, there may be 1, 2, 3, 4, 5, or more temperature sensors within the enclosure. This may be beneficial for determining various temperature-related properties within the enclosure, e.g., temperature distributions within the enclosure. In various embodiments, temperature sensors may include IR sensors, resistance temperature detectors (RTDs), thermocouples, or thermistors.
In some embodiments, within a weighing chamber 5 there may be a variety of sensors, including, e.g., one or more temperature sensors, a humidity sensor, an air temperature sensor, an air pressure sensor, and a loadcell temperature sensor. In some embodiments, a metrology tool may have an active wafer centering (AWC) system that determines X/Y offsets for placing wafers in the metrology tool. In some embodiments, the metrology tool may record the time spent on the ATP and/or the PTP as well as the time spent on the loadcell. In some embodiments the metrology tool may record the time from the wafer being placed on the weighing pan 3 to the weight measurement stabilizing.

[0071] In some embodiments, a temperature sensor has an accuracy of about 0.1° C. or about 0.04° C. In some embodiments, a humidity sensor has an accuracy of about ±0.8. In some embodiments, a pressure sensor has an accuracy of about 1 ppm, or about 0.001 millibar.

[0072] To achieve an accurate weighing result, environmental effects, such as variations in air density due to ambient conditions, may be corrected for. The air density may be determined and then used to calculate the buoyancy effect on the semiconductor wafer. The air density may be determined based on the pressure, temperature, and relative humidity in the upper balance enclosure. The buoyancy effect may then be calculated based on the air density, the weight of the measured wafer, the density of the measured wafer, and the density of a calibration weight using the following formula:

B = W_w · ρ_air · (1/ρ_w − 1/ρ_c)

[0073] Where B is the buoyancy effect in grams, W_w is the weight of the wafer in grams, ρ_w is the density of the wafer in g/cm³, ρ_air is the density of the air in g/cm³, and ρ_c is the density of the calibration weight used to calibrate the weighing balance in g/cm³. In some embodiments, and as noted below, the thin films on the measured wafer may affect its density and contribute a source of systematic error. Once the buoyant force is determined, the mass of a wafer may be calculated using the formula:

M_w = W_w + B

[0074] As may be clear from the above formulas, a mass measurement may be produced as a result of multiple measurements, including at least the weight of a measured wafer. Each of these measurements and the final mass calculation are susceptible to systematic errors. In some embodiments, a machine learning model described herein may be used to account for systematic errors.
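Assuming the conventional air-buoyancy correction for a balance calibrated with a reference weight, the two formulas above can be sketched as follows. The densities used in the test values (silicon at roughly 2.33 g/cm³, a steel calibration weight at roughly 8.0 g/cm³, air at roughly 0.0012 g/cm³) are illustrative assumptions, not values from the application.

```python
def buoyancy_effect(w_w, rho_w, rho_c, rho_air):
    """Buoyancy effect B in grams: B = W_w * rho_air * (1/rho_w - 1/rho_c).

    w_w:     measured wafer weight in grams
    rho_w:   wafer density in g/cm^3
    rho_c:   calibration-weight density in g/cm^3
    rho_air: air density in g/cm^3
    """
    return w_w * rho_air * (1.0 / rho_w - 1.0 / rho_c)


def wafer_mass(w_w, rho_w, rho_c, rho_air):
    """Buoyancy-corrected wafer mass: M_w = W_w + B."""
    return w_w + buoyancy_effect(w_w, rho_w, rho_c, rho_air)
```

Because a wafer is much less dense than a typical calibration weight, B is positive and on the order of tens of milligrams for a 300 mm wafer, which is why air density must be tracked carefully.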

Systematic Error and its Sources in Mass Metrology

[0075] Systematic errors are not randomly generated. Typically, they are introduced by measuring instruments or conditions under which measurements are made. Certain systematic errors may manifest as an offset or zero crossing error, in which the metrology tool does not read zero when the quantity to be measured has a value of zero. Some systematic errors may manifest as scale factor error, in which the metrology tool consistently reads changes in the quantity to be measured at a value that is greater or lesser than the actual changes.

[0076] Systematic errors do not include errors that can be quantified and corrected by known physical relationships based on measurable quantities, such as environmental effects, that are detected by sensors or other components within a metrology tool. For example, in a mass metrology system, a mathematical function may correct deviations from the ideal gas law in particular regions of physical parameter space (air composition, V, T, P). A trained model may be used to correct for systematic errors.

[0077] Potential sources of systematic error in a mass metrology tool include (1) temperature gradients proximate the wafer being measured, (2) composition gradients in the air proximate the wafer, (3) the position or variation in position of the wafer in a measurement module (e.g., a load cell), (4) changes in ambient conditions, including air properties, temperature, pressure, humidity, magnetic fields, and vibrations, (5) sensor accuracy and precision, (6) errors due to nonoptimal sensor location, (7) loadcell variations in performance (repeatability, linearity, sensitivity to temperature, and variations in setup), (8) wafer bow, charge, wafer-wafer variability, and temperature, (9) electrical charge on the wafer, (10) wafer density variations due to, e.g., thin films deposited on the surface of the wafer, and (11) variations of pressure and humidity due to weather patterns external to a production facility housing the metrology tool.

Example Use of a ML Model in Mass Metrology During Production

[0078] Mass correction may be performed to account for environmental effects and systematic errors of a wafer mass measurement. In some embodiments, a weight measurement may be taken for a wafer under consideration after taking a zero reading in the loadcell, and an absolute mass correction (AMC) for that weight measurement may be determined based on various sensor data. In some embodiments, a weight measurement may be obtained relative to a reference mass, rather than a zero reading.

[0079] A machine learning model may be used to improve the mass correction to account for systematic error and/or environmental effects such as buoyancy. In some embodiments, a trained machine learning model determines a mass correction without using an explicit formula, such as the buoyant force formula provided above. In this manner, the machine learning model is trained to take various inputs, namely one or more sensor readings and a mass measurement, and output a mass correction. Figure 5 provides a process flowchart for a method of using a trained model during production of semiconductor devices to correct for systematic error, optionally as an absolute mass correction. In an operation 502, sensor reading values of selected environmental and/or wafer position parameters are obtained for a wafer under consideration, such as a production wafer. In some embodiments, a machine learning model has been trained to use particular combinations of sensor readings as inputs. In some embodiments discussed further below, sensor readings that may, either alone or in combination with other sensor readings, correlate to errors in a mass measurement may be obtained for input to the machine learning model to determine a mass measurement correction.

[0080] In an operation 506, the weight of the wafer under consideration is measured and a value obtained.

[0081] In an operation 510, the sensor reading values of the selected environmental and/or wafer position parameters for a wafer under consideration are input to a machine learning model. As discussed elsewhere herein, a machine learning model is a trained model that may accept as inputs various sensor reading values. The sensor reading values provided as inputs depend on what the model is trained to use as inputs. In some embodiments, the inputs are determined as a result of a training process to identify inputs that correlate with systematic errors in a mass metrology measurement. In some embodiments, the inputs may additionally include the mass measurement.

[0082] In an operation 512, the machine learning model outputs a value of systematic error in the weight measured in operation 506. The machine learning model may operate on the inputs to determine a correction to the weight for the wafer under consideration. In embodiments where the mass difference is provided to the model, the model may simply provide the corrected mass difference rather than a correction to the mass difference.

[0083] In some embodiments, part of operation 512 may optionally include using an algorithm for identifying outliers. Outlier detection may be used during production to identify outlier mass correction values so that they may be disregarded. For example, a seismic event may affect the mass measurement such that the mass correction value should be discarded. In such embodiments, the wafer may optionally be re-measured, or the mass measurement/mass correction value may be noted as an outlier.

[0084] In some embodiments, outliers may be detected using separate sensors at a production facility, such as two separate accelerometers. For example, two accelerometers at different locations, e.g., one at a FOUP and one on a metrology tool, could both register seismic events. When those two sensors at different locations register the same event, the algorithm may conclude, under certain circumstances, that a seismic event has occurred, and the mass readings should be discarded for the purposes of training and/or measurement.
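The two-sensor coincidence check of paragraph [0084] might be sketched as follows; the time window, the timestamp representation, and the function interface are hypothetical illustration choices, not the claimed algorithm.

```python
def seismic_coincidence(events_a, events_b, window_s=2.0):
    """Return True if two accelerometers (e.g., one at a FOUP and one on
    the metrology tool) registered events within `window_s` seconds of
    each other, suggesting a real seismic event rather than local noise.

    events_a, events_b: lists of event timestamps in seconds from the two
    sensors at different locations.
    """
    for ta in events_a:
        for tb in events_b:
            if abs(ta - tb) <= window_s:
                return True
    return False
```

When the check returns True, the mass readings taken in that interval could be discarded for both training and measurement purposes, as the paragraph above describes.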

[0085] In an operation 514, a final value of the mass of the wafer under consideration, or of the mass difference from the reference wafer, is determined. The correction may be added to or subtracted from the measured mass of the wafer under consideration to determine the final measured mass of the wafer.
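Taken together, operations 502 through 514 might be sketched end to end as below. The model interface (a scikit-learn-style predict method), the sensor names, the feature ordering, the sign convention for applying the correction, and the stand-in constant-offset model are all illustrative assumptions.

```python
def corrected_wafer_mass(model, sensor_readings, measured_weight_g):
    """Sketch of operations 502-514: gather sensor readings (502), take the
    weight measurement (506), feed both to the trained model (510), obtain
    the systematic error estimate (512), and apply it (514).

    model: any object with a predict(rows) -> [error_g, ...] method,
           assumed to have been trained offline.
    sensor_readings: dict mapping sensor name -> value; keys are assumed
           to match what the model was trained on.
    """
    features = [sensor_readings[k] for k in sorted(sensor_readings)]
    features.append(measured_weight_g)      # weight may also be a model input
    error_g = model.predict([features])[0]  # operation 512
    return measured_weight_g - error_g      # operation 514; sign assumed


class ConstantOffsetModel:
    """Hypothetical stand-in model, for illustration only."""

    def __init__(self, offset_g):
        self.offset_g = offset_g

    def predict(self, rows):
        return [self.offset_g for _ in rows]
```

In production, ConstantOffsetModel would be replaced by the trained regression model described herein, loaded from wherever the training pipeline stores it.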

[0086] The process of Figure 5 may be performed for mass measurements of a plurality of wafers processed in sequence or in parallel. These measurements may be performed on production wafers. In some cases, a metrology system is periodically checked. The purpose of the check may be to gauge the current accuracy of the machine learning models. If the check fails, the model may be updated or replaced as described herein. For example, a metrology system may collect more data and retrain the machine learning model after deployment of the machine learning model to a metrology tool for wafer production.

Example Training of an Error Correction Machine Learning Model for Mass Metrology

[0087] Various techniques for training a machine learning model may be employed, and some are presented in this section. The training may be conducted using computational resources that are different from those used for running the trained machine learning model, which may be resident on the metrology tool, in a fabrication facility where the tool and associated wafer processing apparatus are located, or at a remote location, such as leased resources providing cloud computing. Training may be conducted at, for example, the facilities of the entity providing the metrology tool and/or a process apparatus that is associated with the metrology tool.

[0088] Figure 6 provides a process flowchart of an example process for training a machine learning model. In some embodiments, each operation of the process of Figure 6 may be performed automatically by a processor. In some embodiments, one or more operations may be performed or partially performed manually by a person assisting in training the model. Starting in an operation 602, a collection of wafers that may be used as part of a training set are selected or otherwise identified. In some embodiments, one or more training wafers may be “pretend production wafers,” or wafers that have a variety of semiconductor processing operations performed thereon in order to simulate a production environment. In some embodiments, the processing techniques used to create training wafers may follow a design of experiment (DOE) methodology. Generally, such techniques determine which sets of process parameters to use in various experiments. They choose the combinations of process parameters by considering statistical interactions between process parameters, randomization, and the like. DOE techniques will create an appropriately populated range of conditions for machine learning model training.

[0089] In the context of retraining or replacing a machine learning model that has been deployed in production, as discussed elsewhere herein, the training set may include mass measurements and sensor readings collected during production. In some embodiments, the training set used for such retraining comprises mostly or solely measurements and sensor readings obtained during production in an IC fabrication factory.

[0090] An optional operation 604 identifies a range of parameter values associated with wafer metrology; examples include wafer position information and environmental conditions (e.g., temperature, pressure, humidity). For example, one or more training wafers may be measured under various environmental conditions to simulate the fluctuations that may occur during production. The identified parameter values may be used as part of the training data. In some embodiments, the environmental conditions employed when performing metrology (e.g., measuring the mass) on test wafers may follow a design of experiment (DOE) methodology. In some embodiments operation 604 may be performed automatically, for example by applying a dimensionality reduction algorithm to identify parameters that correlate with systematic error. In some embodiments, operation 604 may also be performed manually by reviewing the parameters, or some combination of automatic and manual selection. In some embodiments, operation 604 may not be performed. In some embodiments, all of the parameters may be used for training a model.
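One simple, hedged reading of the automatic parameter identification in operation 604 is a univariate correlation filter: keep only parameters whose readings correlate with the observed systematic error. The threshold below is arbitrary, and a production system might instead apply a proper dimensionality reduction technique such as PCA.

```python
import statistics


def select_parameters(records, errors, threshold=0.3):
    """Keep parameters whose Pearson correlation with the observed
    systematic error exceeds `threshold` in magnitude.

    records: list of dicts, one per training measurement (parameter -> value)
    errors:  list of observed systematic errors, same length as records
    """

    def pearson(xs, ys):
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        if sx == 0 or sy == 0:
            return 0.0  # constant column carries no information
        return cov / (sx * sy)

    return [name for name in records[0]
            if abs(pearson([r[name] for r in records], errors)) >= threshold]
```

A correlation filter only catches linear, single-parameter relationships; parameters that matter only in combination with others would need the model-based selection the paragraph also mentions.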

[0091] In an operation 606, for each training wafer identified in operation 602 and optionally for each parameter value combination in the range of parameter values identified in operation 604, (a) mass measurements are conducted for each training wafer, and (b) identified parameters associated with the measurement are measured (e.g., environmental conditions and wafer position).

[0092] In operation 608, for each training wafer and parameter value combination in 606, a “true” mass of the training wafer is obtained. The true mass may be a long-term average measurement or a median measurement of the system, or an average or median across multiple systems. In some embodiments, a more accurate system is used to determine the true mass.

[0093] In an optional operation 610, outliers are identified. In some embodiments, outliers may be identified using unsupervised learning. Outliers are measured values that significantly diverge from a norm or pattern in a sample or training set. Outliers may be unreliable for various reasons, including measurement errors, experimental errors, intentional errors, or natural errors. Recognizing and removing outliers and/or other untrustworthy readings can prevent them from affecting the training of a machine learning model for a metrology tool. In some embodiments, outliers may also be removed by a manual or semi-manual process that includes review by a person. In some embodiments, operation 610 may not be performed.

[0094] Detection of outliers, for either or both of model training and wafer production, can be accomplished with an appropriate outlier detection routine. Outliers may be identified by various types of routines including box plots, scatter plots, or Grubbs’ Test. In some embodiments, outlier detection algorithms are trained to detect earthquakes or significant weather events such as typhoons that have been observed to interfere with readings of mass metrology tools. In certain embodiments, a barometric pressure sensor, an accelerometer, or other sensor or combinations of sensors is used to provide data that a trained algorithm recognizes as signatures of an earthquake or other event that suggests that the mass metrology reading will likely be an outlier and should not be trusted.
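For instance, the box-plot rule can be sketched in a few lines; the readings below are invented for illustration, and a production routine would likely combine several such tests:

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR],
    the standard box-plot rule for flagging suspect readings."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lo or v > hi]

# Illustrative mass readings in grams; the last one was taken during a
# vibration event and diverges sharply from the rest.
masses = [127.001, 127.002, 127.000, 127.003, 127.001, 127.950]
bad = iqr_outliers(masses)
clean = [m for i, m in enumerate(masses) if i not in bad]
```

The filtered list `clean` would then feed the training set, preventing the divergent reading from influencing the model.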

[0095] Following removal of outliers in operation 610, in an operation 612, using data obtained during operations 606 and 608 and filtered for outliers during operation 610, supervised learning is conducted to generate a machine learning model that outputs a true mass, a true mass difference, and/or a systematic error value. In some embodiments, a plurality of candidate models are trained based on the training data.

[0096] To train a machine learning model, as used herein, is to improve the ability of the machine learning model to predict the systematic error of a metrology measurement, such as a mass metrology measurement. In some embodiments, machine learning models as described herein may be trained using supervised learning techniques with sets of training data. Training data may comprise input-output pairs of various potential input parameters, e.g., temperature sensor readings, weight measurement, stabilization time, along with one or more output parameters, e.g., a mass error correction or a true mass measurement. Training data may be split into two sets, including training and test sets. Machine learning models are trained on the training set. The test set is typically not used until a machine learning model is selected for production, and may be used to determine the performance of the selected machine learning model.

[0097] A training routine may iteratively adjust the value of one or more model parameters of the machine learning model. During training, a computationally predicted systematic error of a mass measurement may be compared with the true error of the mass measurement (calculated as the difference between the true mass and the measured mass of a training wafer). The comparison may provide a cost value that reflects the magnitude of the difference (or agreement) between the predicted error and the true error. Training involves using the cost value to at least (i) determine whether the model parameter value(s) have converged, and (ii) if the value(s) have not converged, determine how to adjust the current value(s) of the model parameters for the next iteration. In some embodiments, the process uses not only the cost value of the current iteration, but the prior cost values of all or some of the historical iterations, to search for a global optimum.

[0098] The model parameter values “converge,” as used herein, when a machine learning model configured with them performs adequately for determining systematic error. Various convergence criteria are known in the art and may be applied. Some of them are described below. Generally, cost values are evaluated in each iteration of a training routine. A cost value produced during a single iteration may be evaluated in isolation or in conjunction with cost values from other iterations. Such evaluation allows the training routine to conduct a convergence check. If the cost value or cost values indicate that the current value of the model parameter provides a systematic error value that is acceptable and/or is no longer improving significantly, the training routine terminates the process and deems the current value of the model parameter to be the final value. The training routine has converged. Thus, in certain embodiments, the convergence method determines when the error of parameter estimation (the cost function) can no longer be improved. This allows a Bayesian view of the termination problem. The convergence check may search for a local or global minimum in the cost value (or maximum, depending on the structure of the cost value).
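The iterate-and-check loop described above can be sketched as follows. The linear model form, learning rate, and synthetic data are assumptions for illustration only; any model whose cost is differentiable in its parameters admits the same structure:

```python
def train(samples, lr=0.1, tol=1e-12, max_iters=10_000):
    """Fit predicted_error = w * t + b by gradient descent on a squared-error
    cost, stopping when the cost value stops improving between iterations."""
    w, b = 0.0, 0.0
    prev_cost = float("inf")
    cost = prev_cost
    for _ in range(max_iters):
        preds = [w * t + b for t, _ in samples]
        # Cost value: mean squared difference between predicted and true error.
        cost = sum((p - e) ** 2 for p, (_, e) in zip(preds, samples)) / len(samples)
        if prev_cost - cost < tol:  # convergence check on cost improvement
            break
        prev_cost = cost
        # Adjust the current parameter values for the next iteration.
        gw = sum(2 * (p - e) * t for p, (t, e) in zip(preds, samples)) / len(samples)
        gb = sum(2 * (p - e) for p, (_, e) in zip(preds, samples)) / len(samples)
        w -= lr * gw
        b -= lr * gb
    return w, b, cost

# Synthetic training pairs: temperature offset from setpoint (deg C) vs.
# true systematic error (g); the 0.002 and 0.005 coefficients are invented.
samples = [(t, 0.002 * t + 0.005) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
w, b, final_cost = train(samples)
```

The `tol` comparison is the simplest convergence criterion named above (cost no longer improving); a routine searching for a global optimum would additionally track cost values across restarts or historical iterations.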

[0099] The model parameter optimization procedure described above may be an iterative nonlinear optimization procedure — e.g., it optimizes an error metric or cost value that is, in general, a non-linear function of the input parameters — and, as such, various techniques known in the art for non-linear optimization may be employed. See, for example: Biggs, M.C., “Constrained Minimization Using Recursive Quadratic Programming,” Towards Global Optimization (L.C.W. Dixon and G.P. Szergo, eds.), North-Holland, pp 341-349 (1975); Conn, N.R., N.I.M. Gould, and Ph.L. Toint, “Trust-Region Methods,” MPS/SIAM Series on Optimization, SIAM and MPS (2000); More, J.J. and D.C. Sorensen, “Computing a Trust Region Step,” SIAM Journal on Scientific and Statistical Computing, Vol. 3, pp 553-572 (1983); Byrd, R.H., R.B. Schnabel, and G.A. Shultz, “Approximate Solution of the Trust Region Problem by Minimization over Two-Dimensional Subspaces,” Mathematical Programming, Vol. 40, pp 247-263 (1988); Dennis, J.E., Jr., “Nonlinear least-squares,” State of the Art in Numerical Analysis, ed. D. Jacobs, Academic Press, pp 269-312 (1977); More, J.J., “The Levenberg-Marquardt Algorithm: Implementation and Theory,” Numerical Analysis, ed. G.A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, pp 105-116 (1977); Powell, M.J.D., “A Fast Algorithm for Nonlinearly Constrained Optimization Calculations,” Numerical Analysis, G.A. Watson, ed., Lecture Notes in Mathematics, Springer Verlag, Vol. 630 (1978); each of which is hereby incorporated by reference in its entirety.

[0100] In some embodiments, the training set may be further split to create a validation set. In some embodiments cross-validation may be performed, where the training set is repeatedly split to create multiple training/validation sets from a single training set of data. One or more models may be trained on each training set. Each model may then be evaluated, compared, and tuned on the corresponding validation set. Cross-validation may be advantageous to compare multiple models without overfitting to a single training set and validation set. In some embodiments, hyperparameters for models may be adjusted and training/validation repeated until a machine learning model is chosen for testing. In embodiments where multiple candidate models are tested, the machine learning models may be compared to determine a single model that best accounts for systematic error. The best model may be determined using any of various metrics, including a root mean squared error (RMSE) or a residual error (RE). Generally, a lower RMSE or RE value indicates the model outputs a more accurate value.
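A minimal sketch of such cross-validation, comparing two invented candidate models by average validation RMSE (the data, fold count, and model forms are illustrative assumptions):

```python
import math

def rmse(preds, targets):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds))

def kfold_rmse(xs, ys, fit, k=5):
    """Average validation RMSE of a model over k train/validation splits.
    `fit(train_x, train_y)` must return a predict(x) callable."""
    fold = len(xs) // k
    scores = []
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold
        tr_x = xs[:lo] + xs[hi:]
        tr_y = ys[:lo] + ys[hi:]
        model = fit(tr_x, tr_y)
        scores.append(rmse([model(x) for x in xs[lo:hi]], ys[lo:hi]))
    return sum(scores) / k

# Two illustrative candidates: predict the training mean, or fit a slope.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_slope(xs, ys):
    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda x: w * x

xs = [float(i) for i in range(1, 21)]
ys = [0.003 * x for x in xs]          # error grows with the sensor reading
best = min([fit_mean, fit_slope], key=lambda f: kfold_rmse(xs, ys, f))
```

Because every validation fold is held out of the corresponding training fold, the comparison rewards the model that generalizes, not merely the one that memorizes a single split.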

[0101] Finally, in operation 614, the chosen model is tested using data that was not used to train the model. The chosen machine learning model may be evaluated on the test set to determine a generalization error, which is a measure of how well the machine learning model predicts outputs for previously unseen data. In embodiments where a new machine learning model is being trained to replace a model already used in production, the generalization error may be used to determine if the new model outperforms the old model. In some embodiments, the model may be trained/validated on a portion of putative training data, e.g., 80% or 70% of the total training set. The remaining portion of the data may be used to test the model, where the model was not trained based on the remaining portion. In other embodiments, data from wafers separate from the training wafers is used to test the machine learning model, for example wafers provided by a user of the machine learning model.

[0102] Testing is used to check that the model accurately predicts mass correction and/or error in measurements. If the model is over-fitted or under-fitted to the training data, then the test check may fail, e.g., a generalization error of the model exceeds a threshold, and operation 612 may be repeated to generate a new machine learning model.

Machine Learning Models

[0103] As noted above, a machine learning model is a trained model that receives a mass measurement for a wafer along with various sensor readings and determines a mass correction or final mass measurement for the wafer.

Input Parameters for a Trained Machine Learning Model

[0104] There are many possible inputs that can be used in a machine learning model for this disclosure. Such inputs may include various sensor readings, including mass readings and positioning information for the wafer in the metrology tool, as well as temperature, pressure, and relative humidity within the metrology tool. Figure 4, discussed above, provides various sensors that a mass metrology system may have. Each of these sensors may be used as an input to a machine learning model. In some embodiments, multiple readings from a single sensor may be used, for example the sensor readings for the most recent two or more weight measurements.

[0105] Sensor readings may be taken from sensors on or proximate the mass metrology tool. In certain embodiments, sensor readings are taken outside the metrology tool. These external readings may be taken from sensors or other components that are associated with the metrology tool. For example, they may be taken from a FOUP or EFEM that provides wafers directly to the metrology tool or indirectly to the metrology tool via a process chamber that handles wafers from the FOUP or EFEM. As with sensors configured to collect data on the metrology tool, the sensors of the associated component may provide readings of temperature, pressure, relative humidity, and the like.

[0106] Any one or more of various sensor readings may be used as inputs for a machine learning model as described herein. These include (in addition to any other sensors described herein):

• Temperature of the passive thermal plate

• Temperature of the air within the weighing chamber

• Temperature of the loadcell controller

• Stabilization time, which is the time from the wafer being placed on a weighing pan to the weight measurement stabilizing

• Temperature of a coil of a loadcell within the weighing chamber

• Relative humidity measured within the weighing chamber

• The X and Y dimensions of a correction applied by an active wafer centering (AWC) system

• Temperature of the weighing chamber, which may include one or more readings from 1, 2, 3, or 4 temperature sensors within the weighing chamber or in the base of the weighing chamber. In some embodiments these temperature sensors may measure the interior temperature of the material, e.g., aluminum, that comprises the weighing chamber.

• Barometric pressure within the weighing chamber

• Temperature of the active thermal plate

• Temperature of a wafer placed on the active thermal plate

• Air density calculated according to techniques described herein

[0107] Some combinations of sensor readings may be particularly advantageous for use as inputs. For example, temperature, pressure, and relative humidity within the metrology tool are used to calculate the buoyancy portion of a weight measurement. Other combinations may include the readings from one or more temperature sensors such as, for example, temperature sensors of the ATP and the PTP, and from one or more of the temperature sensors within the weighing chamber.
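For illustration, a simplified CIPM-style approximation of moist-air density shows how those three readings combine into a buoyancy estimate. The coefficients below are the common approximation, not the full CIPM equation, and the wafer volume is an assumed round number:

```python
import math

def air_density(t_c, p_hpa, rh_pct):
    """Approximate moist-air density in kg/m^3 from temperature (deg C),
    barometric pressure (hPa), and relative humidity (%), using a
    simplified CIPM-style formula adequate for illustration."""
    return (0.34848 * p_hpa - 0.009 * rh_pct * math.exp(0.061 * t_c)) / (273.15 + t_c)

def buoyancy_mg(t_c, p_hpa, rh_pct, wafer_volume_cm3):
    """Buoyant mass offset in milligrams: displaced-air density times
    wafer volume (kg/m^3 * cm^3 conveniently equals mg)."""
    return air_density(t_c, p_hpa, rh_pct) * wafer_volume_cm3

# A 300 mm wafer (~0.775 mm thick) displaces roughly 55 cm^3 of air.
rho = air_density(20.0, 1013.25, 50.0)
offset = buoyancy_mg(20.0, 1013.25, 50.0, 55.0)
```

At typical cleanroom conditions this gives an air density near 1.2 kg/m^3 and a buoyancy offset on the order of tens of milligrams, which is why small drifts in temperature, pressure, and humidity matter to mass metrology.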

[0108] While sensor readings may be referred to as singular values herein, it should be understood that sensor readings may be taken continuously during measurement of a wafer and all measurements may be provided to a machine learning model. In some embodiments the inputs to a machine learning model may include multiple measurements from any one of the sensors described herein. In some embodiments the inputs may include measurements from prior wafer mass measurements, e.g., the immediately prior wafer mass measurement.

[0109] Another potential input parameter is the state of the mass metrology tool when the wafer under consideration enters the tool. The state may be influenced by a prior wafer or multiple prior wafers that were measured. In some cases, the state is determined by the length of time between when a prior wafer was measured and when the current wafer entered the metrology tool. The prior wafer’s impact may be based on thermal heat transfer due to convection or conduction from that wafer. The impact may be caused by the prior wafer while it is within the tool and/or after it is removed or is moved.

Output Parameters for a Trained Machine Learning Model

[0110] In certain embodiments, a machine learning model is trained to provide a correction to any one or more of the following parameter values: a total mass of the wafer, a differential mass of the wafer compared to an arbitrary reference mass, and/or the buoyant force in the metrology tool. In some embodiments, the machine learning model outputs an absolute value of the systematic error. In such embodiments, other parts of the algorithm, downstream of the machine learning model, can apply the calculated value of systematic error to determine a mass value of the wafer.

Form of Model

[0111] Various forms of machine learning models may be used. Example models may include ridge kernel models, regularized linear models, polynomial kernel models, radial basis functions, Lasso, elastic models, random forests, extreme gradient boosting, gradient boosting regression, neural networks, convolutional neural networks, and combination or ensemble models. In some embodiments a machine learning model to provide mass correction as described herein may preferably be a random forest, regularized linear model, or kernel ridge model.

[0112] In some embodiments a combination model and/or ensemble model may be employed. As an example, the machine learning model may include an ensemble of different models, optionally each of the same type (e.g., each a neural network or each a random forest model), where each model of the ensemble was separately trained. In production, each model of the ensemble receives its respective inputs and generates its respective output. The outputs are then collectively considered by a separate algorithm that provides a single output. In other words, each value output by each model of the ensemble serves as an input to a downstream algorithm that is configured to operate on the ensemble model outputs and, from them, generate a single output.
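A minimal sketch of this arrangement, with invented per-model coefficients and a plain average standing in for the downstream combining algorithm:

```python
def ensemble_predict(models, inputs, combine=None):
    """Run each separately trained model on the same inputs and reduce
    their outputs to a single value with a downstream combiner (a plain
    average here; a trained combiner could be substituted)."""
    outputs = [m(inputs) for m in models]
    if combine is None:
        return sum(outputs) / len(outputs)
    return combine(outputs)

# Three illustrative error models, e.g., trained on different data splits.
models = [
    lambda x: 0.0010 * x["chamber_temp"],
    lambda x: 0.0012 * x["chamber_temp"],
    lambda x: 0.0011 * x["chamber_temp"],
]
correction = ensemble_predict(models, {"chamber_temp": 21.0})
```

Averaging is the simplest combiner; passing a `combine` callable allows, for example, a median or a trained downstream model in its place.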

[0113] Another approach uses a combination of models. While there are many different ways different models can be combined to provide a single output, one approach involves providing a feature engineering front end that receives a number of different sensor inputs, i.e., from multiple sensor channels, and combines and/or reduces the number of sensor inputs before feeding the combined or reduced inputs to a separate algorithm referred to as a predictive engine. The predictive engine may be a neural network or other machine learning model trained to receive inputs that have first been preprocessed with the feature engineering front end.
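Sketched below with assumed feature definitions and placeholder weights, none of which come from the disclosure:

```python
def front_end(sensor_channels):
    """Feature-engineering front end: collapse many raw channels into a
    small, fixed feature vector before the predictive engine sees them."""
    temps = sensor_channels["temperatures"]   # readings from several sensors
    return {
        "temp_mean": sum(temps) / len(temps),
        "temp_spread": max(temps) - min(temps),
        "pressure": sensor_channels["pressure"],
    }

def predictive_engine(features):
    """Stand-in for a trained model that consumes the engineered features.
    The weights here are illustrative placeholders, not trained values."""
    return 0.001 * features["temp_mean"] + 0.01 * features["temp_spread"]

raw = {"temperatures": [20.8, 21.0, 21.2, 20.9], "pressure": 1012.4}
correction = predictive_engine(front_end(raw))
```

Separating the two stages means the front end can be revised (e.g., new summary statistics) without retraining the predictive engine from scratch, and vice versa.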

Training Data for Model

[0114] As discussed throughout the present application, a machine learning model is trained on data including mass measurements and related sensor readings for a plurality of training wafers. The sensor readings are associated with each mass measurement for each training wafer. In some embodiments, a mass measurement and associated sensor readings may be collected multiple times for a single training wafer. This may be useful as part of a DOE process to vary environmental conditions while maintaining the same true mass. In some embodiments, the training data may grow to include data collected during production, such as mass measurements of production wafers, along with associated sensor readings.

[0115] In some embodiments, the sensor readings may include temperature, pressure, and relative humidity within a loadcell of a mass metrology tool where a weight measurement is performed. In some embodiments, the temperature data used in the training set has a precision of about +/- 0.04 °C or better. In certain embodiments, the pressure data used in the training set has a precision of about +/- 0.001 millibar or better. In some embodiments, the relative humidity data used in the training set has a precision of about +/- 0.8% or better.

[0116] In some embodiments, the training data may be filtered to remove outliers, as discussed above. In certain embodiments, dimensionality reduction analysis may be performed prior to or as part of training a machine learning model to determine which sensor readings to train a machine learning model to use as inputs. For example, sensor readings or combinations of sensor readings that correlate with systematic error may be identified. In some embodiments, sensor readings that are independent of each other may be identified by, e.g., a random forest feature importance analysis. This may be performed to remove sensor readings that have little to no significance to the model.
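As a hedged stand-in for the random forest feature importance analysis mentioned above, permutation importance illustrates the same idea: shuffle one input channel and observe how much the model's error grows. The model and data here are invented:

```python
import math
import random

def rmse(model, rows, targets):
    return math.sqrt(sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows))

def permutation_importance(model, rows, targets, feature, seed=0):
    """RMSE increase when one feature's column is shuffled, breaking its
    relationship to the target; near-zero means the feature carries
    little or no significance for the model."""
    base = rmse(model, rows, targets)
    column = [r[feature] for r in rows]
    random.Random(seed).shuffle(column)
    broken = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
    return rmse(model, broken, targets) - base

# Invented model: systematic error depends on temperature but not humidity.
model = lambda r: 0.002 * r["temp"]
rows = [{"temp": float(t), "humidity": 40.0 + (t % 3)} for t in range(20, 30)]
targets = [model(r) for r in rows]
temp_importance = permutation_importance(model, rows, targets, "temp")
humidity_importance = permutation_importance(model, rows, targets, "humidity")
```

A channel whose importance is near zero, like humidity here, is a candidate for removal from the model's inputs.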

System Dependent Model

[0117] In certain embodiments, a unique machine learning model is generated for a single mass metrology tool. In other words, the machine learning model may be valid for only a single mass metrology tool. Other machine learning models are trained for other mass metrology tools. In other embodiments, a machine learning model accurately accounts for systematic error in two or more mass metrology tools. In certain embodiments, a single machine learning model accurately accounts for systematic error in numerous mass metrology tools. For example, the model may be used for a fleet of mass metrology tools.

[0118] In some embodiments, two machine learning models may be pipelined together, where one machine learning model is specific to a tool and calibrates sensor values on that tool. Each tool may employ a machine learning model that outputs an adjusted value for sensor readings from that tool. A second machine learning model, used for a fleet of tools, may then accept the adjusted values as inputs and provide a mass correction value.

Maintenance of Machine Learning Models During Operation

[0119] In certain embodiments, a machine learning model for determining systematic error is retrained after deployment in a production setting. In some implementations, such a machine learning model is regularly or even continuously checked, validated, tested, retrained, and/or replaced. In some embodiments, as a metrology tool using a machine learning model continues to collect data, the data may be added to an existing training set (optionally one that was used during the initial training) or may serve as an entirely new training set. By growing a training set by adding new data obtained in production or via further research, one may improve the accuracy and/or precision of a model for identifying systematic error in a metrology tool. Thus, in certain embodiments, a routine periodically or continuously checks a model to determine whether it is operating within control limits.

[0120] Figure 7 provides a process flow diagram for maintenance of a machine learning model. In some embodiments, a machine learning model has already been trained as described in relation to Figure 6, correlating with operations 702-706. In an operation 708, a machine learning model is deployed to a metrology tool or module, such as a mass metrology tool, for use in production. The machine learning model may receive various inputs, including a weight measurement of a wafer being measured, and outputs a mass correction value to compensate for various systematic errors. The deployment of a machine learning model is exemplified with reference to Figure 5.

[0121] In operation 710, the performance of the machine learning model is checked. The check operation may be performed at the customer site and is intended to determine how well the model is performing. It may be conducted using one or more wafers provided by the metrology tool vendor or the company performing the process operations. In some embodiments, quality maintenance (QM) wafers may be used for the check. QM wafers are distinct from production wafers in that they are used to evaluate the performance of metrology tools, and metrics such as the true mass of each QM wafer may be known. Using a machine learning model to correct measurements for QM wafers may allow a user of the metrology tool or the machine learning model to determine if the machine learning model is providing good corrections. The corrections output by a machine learning model may become less accurate over time for various reasons, including changes in the environmental conditions, changes in the sensors that affect the sensor readings, or other changes that affect the accuracy of the correction output by a machine learning model. Measuring QM wafers allows changes in the accuracy of the mass correction output by the machine learning model to be detected, triggering additional training or revisions to the model.

[0122] Once a machine learning model fails a check in an operation 710, the process returns to operations 702-706 to collect additional data for additional training and validation. The system or the process flow collects more data beyond the data used in the initial training operation to retrain the existing model or train a new model. The data used for this retraining or new training may come, in certain embodiments, from the factory where the tool is used. Such data may be provided in the form of one or more production wafers that are independently evaluated by other metrology tools or have their characteristics repeatedly measured. Or the check may be performed with separate wafers held at the factory that are not used for production. A potential benefit of using data from the factory or production facility where the metrology tool is deployed is that the readings taken from the metrology tool represent readings that are relevant to the model as deployed in the production environment.

[0123] In some embodiments, the additional data used to retrain the existing model or develop new models is obtained from systems or sites that are remote from the production environment. For example, the data may be collected at the site of the vendor of the metrology tool or an entity responsible for developing and/or maintaining the machine learning models for the metrology tools. In this environment, the team responsible for maintaining or improving the model may develop and use training data that simulates existing or expected situations in the production environment. In this way, the machine learning model can be retrained and improved in a well-controlled setting while using a robust set of training conditions.

[0124] Figure 8 presents a view of a process flow for maintenance of a model that has been deployed. In an operation 802, a performance check is used to assess whether a trained model is performing sufficiently. For example, it may check whether the mass metrology tools, or the machine learning model in particular, are performing within a control limit. In some embodiments, the control limit is a threshold value of RMSE and/or residual performance, such that whether a model fails the performance check is based on whether the RMSE or residual performance values for the model exceed the threshold value. The control limit is set to determine whether the wafers tested by the metrology system are being accurately read. This check may be performed periodically, e.g., once a day, once a week, or once every other week. In some embodiments, the check may be performed using separate check wafers or other verification components. If the model passes, then it continues to be used without retraining or other modification, and another check is performed later. If the model fails, then the process proceeds to operation 804.
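A performance check of this kind might look like the following sketch; the QM-wafer masses and the 0.1 mg control limit are invented values:

```python
import math

def passes_check(corrected, true_masses, rmse_limit):
    """Return True while the RMSE of model-corrected QM-wafer measurements
    stays within the control limit; False triggers retraining/replacement."""
    n = len(corrected)
    err = math.sqrt(sum((c - t) ** 2 for c, t in zip(corrected, true_masses)) / n)
    return err <= rmse_limit

# Corrected masses (g) for three QM wafers of known true mass.
true_masses = [127.00000, 127.05000, 126.95000]
corrected = [127.00002, 127.04997, 126.95001]
ok = passes_check(corrected, true_masses, rmse_limit=0.0001)  # 0.1 mg limit

# If the model's corrections have drifted by 0.5 mg, the check fails.
drifted = [m + 0.0005 for m in true_masses]
still_ok = passes_check(drifted, true_masses, rmse_limit=0.0001)
```

The second call models a drifted model: its RMSE exceeds the control limit, which in the process flow routes execution to operation 804.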

[0125] In an operation 804, one or more other candidate models to replace the current model are generated by retraining the current model or training a different model. These candidates for replacement may include retrained versions of the existing machine learning model and/or fresh models that are trained from prior or updated training sets. Such fresh models need not be of the same type as the model currently in use. For example, the current model might be a random forest model, while one of the candidate new models might be a neural network. These various candidate models are trained and validated for performance before the next step of comparing them to determine which is better.

[0126] In an operation 806, the candidate models are validated. The validation may be performed on data that was not used for training. Both the current model and the candidate models are validated. Validation during model maintenance may be a similar process to validation during training: the current model and the candidate models are provided a set of data as inputs, and the model with the least error in the outputs is considered the best. In some embodiments, the candidate models do not outperform the current model. In such embodiments, the validation fails and the current model remains in use until at least the next performance check. In other embodiments, a candidate model does perform better than the current model. In such embodiments, in operation 808 the candidate model is deployed to the tool as a new model for use in production, and the current model is no longer used. From this point forward, the new model is used to reduce the error in the metrology data taken from the tool until a new performance check is performed, which might trigger the generation of a retrained or replacement model to take the place of the current model.
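The keep-or-replace decision in operations 806-808 can be sketched as follows, with invented models and an invented validation set:

```python
def choose_model(current, candidates, validate):
    """Deploy a candidate only if it beats the current model on held-out
    validation data; otherwise keep the current model in service."""
    best, best_err = current, validate(current)
    for cand in candidates:
        err = validate(cand)
        if err < best_err:
            best, best_err = cand, err
    return best

# Illustrative validation set and a squared-error scorer.
val_x = [18.0, 20.0, 22.0]
val_y = [0.036, 0.040, 0.044]          # true error grows with temperature
validate = lambda m: sum((m(x) - y) ** 2 for x, y in zip(val_x, val_y))

current = lambda x: 0.0015 * x         # drifted model in production
candidate = lambda x: 0.002 * x        # retrained on recent data
chosen = choose_model(current, [candidate], validate)
```

Because the current model participates in the same comparison, a candidate that fails to improve on it is silently discarded and the tool keeps running unchanged.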

[0127] Retraining a model as exemplified in repeats of operations 704 and 804 may be accomplished in various ways. In some cases, the training may take the form of a wholly new training operation in which a systematic error determining model is trained without using any prior machine learning. In some cases, such a new model may or may not preserve the architecture of the original model. For example, if it is a neural network or autoencoder, it may preserve the layer arrangement (including the number of hidden layers and the number of nodes on each layer) as well as the form of the activation functions.

[0128] In some cases, retraining employs transfer learning. Processes that start with a first trained model and adopt that model’s architecture and current parameter values, but then change the model’s parameter values using new or different training data, are sometimes referred to as transfer learning processes. In certain embodiments, transfer learning is used to efficiently retrain the metrology systematic error models described herein. In this context, the new training sets used in transfer learning may include additional data collected during production, as described above.
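A transfer-learning style retrain might be sketched as below: the deployed parameter values seed further gradient steps on newly collected production data, rather than starting from scratch. The linear model form, coefficients, and data are all assumptions for illustration:

```python
def retrain(weights, new_samples, lr=0.05, iters=500):
    """Warm-start update: begin from the deployed model's parameter values
    and continue per-sample gradient steps on fresh production data."""
    w, b = weights
    for _ in range(iters):
        for t, e in new_samples:
            pred = w * t + b
            w -= lr * 2 * (pred - e) * t
            b -= lr * 2 * (pred - e)
    return w, b

# Deployed parameters, then new production data showing a stronger
# temperature dependence (temperatures are offsets from the setpoint).
deployed = (0.0010, 0.0050)
new_data = [(t, 0.002 * t + 0.005) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
w, b = retrain(deployed, new_data)
```

Starting from the deployed values typically converges in far fewer iterations than training from random initialization, which is the practical appeal of transfer learning here.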

Performance Improvement (Reduction in Systematic Error)

[0129] The table below provides standard deviations of mass correction for various methods. No Temperature Compensation is the mass reading without adjusting for temperature. DT Temperature Compensation accounts for a difference between the temperature of the PTP and the measurement enclosure. Machine Learning Model Temperature Compensation uses a machine learning model as described herein to correct the temperature. As can be seen, use of a machine learning model reduces the standard deviation of mass correction over a DT technique or no temperature compensation.

[0130]

Context for Disclosed Computational Embodiments

[0131] Certain embodiments disclosed herein relate to computational systems for generating and/or using machine learning models. Certain embodiments disclosed herein relate to methods for generating and/or using a machine learning model implemented on such systems. A system for generating a machine learning model may be configured to analyze data for calibrating or optimizing the expressions or relationships used to correct a mass measurement. A system for generating a machine learning model may also be configured to receive data and instructions such as program code representing physical processes occurring during the semiconductor device fabrication operation. In this manner, a machine learning model is generated or programmed on such a system. A programmed system for using a machine learning model may be configured to (i) receive inputs such as sensor readings and weight measurements, and (ii) execute instructions that determine a correction for the mass measurement.

[0132] Many types of computing systems having any of various computer architectures may be employed as the disclosed systems for implementing machine learning models and algorithms for generating and/or optimizing such models. For example, the systems may include software components executing on one or more general purpose processors or specially designed processors such as Application Specific Integrated Circuits (ASICs) or programmable logic devices (e.g., Field Programmable Gate Arrays (FPGAs)). Further, the systems may be implemented on a single device or distributed across multiple devices. The functions of the computational elements may be merged into one another or further split into multiple sub-modules.

[0133] In some embodiments, code executed during generation or execution of a machine learning model on an appropriately programmed system can be embodied in the form of software elements which can be stored in a nonvolatile storage medium (such as an optical disk, flash storage device, mobile hard disk, etc.), including a number of instructions for causing a computer device (such as a personal computer, server, or network equipment) to perform the methods described herein.

[0134] At one level a software element is implemented as a set of commands prepared by the programmer/developer. However, the module software that can be executed by the computer hardware is executable code committed to memory using “machine codes” selected from the specific machine language instruction set, or “native instructions,” designed into the hardware processor. The machine language instruction set, or native instruction set, is known to, and essentially built into, the hardware processor(s). This is the “language” by which the system and application software communicates with the hardware processors. Each native instruction is a discrete code that is recognized by the processing architecture and that can specify particular registers for arithmetic, addressing, or control functions; particular memory locations or offsets; and particular addressing modes used to interpret operands. More complex operations are built up by combining these simple native instructions, which are executed sequentially, or as otherwise directed by control flow instructions.

[0135] The inter-relationship between the executable software instructions and the hardware processor is structural. The instructions per se are merely a series of symbols or numeric values; they do not intrinsically convey any information. It is the processor, preconfigured by design to interpret those symbols/numeric values, that imparts meaning to the instructions.

[0136] The models used herein may be configured to execute on a single machine at a single location, on multiple machines at a single location, or on multiple machines at multiple locations. When multiple machines are employed, the individual machines may be tailored for their particular tasks. For example, operations requiring large blocks of code and/or significant processing capacity may be implemented on large and/or stationary machines.

[0137] In addition, certain embodiments relate to tangible and/or non-transitory computer readable media or computer program products that include program instructions and/or data (including data structures) for performing various computer-implemented operations. Examples of computer-readable media include, but are not limited to, semiconductor memory devices, phase-change devices, magnetic media such as disk drives, magnetic tape, optical media such as CDs, magneto-optical media, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM). The computer readable media may be directly controlled by an end user or the media may be indirectly controlled by the end user. Examples of directly controlled media include media located at a user facility and/or media that are not shared with other entities. Examples of indirectly controlled media include media that are indirectly accessible to the user via an external network and/or via a service providing shared resources such as the “cloud.” Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher-level code that may be executed by the computer using an interpreter.

[0138] In various embodiments, the data or information employed in the disclosed methods and apparatus is provided in an electronic format. Such data or information may include design layouts, fixed parameter values, floated parameter values, feature profiles, metrology results, and the like. As used herein, data or other information provided in electronic format is available for storage on a machine and transmission between machines. Conventionally, data in electronic format is provided digitally and may be stored as bits and/or bytes in various data structures, lists, databases, etc. The data may be embodied electronically, optically, etc.
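As a minimal sketch of the last point (the field names and values below are hypothetical, not from the disclosure), metrology data can be embodied as bits and bytes either in a fixed-width binary layout or in a self-describing text encoding:

```python
import json
import struct

# Hypothetical sensor readings accompanying a mass measurement.
readings = {"temperature_c": 21.5, "pressure_kpa": 101.3, "mass_mg": 132.0457}

# Binary form: three little-endian 64-bit floats (8 bytes each).
packed = struct.pack("<3d", *readings.values())

# Self-describing form: JSON text encoded to bytes for storage or
# transmission between machines, then decoded back to the same values.
encoded = json.dumps(readings).encode("utf-8")
restored = json.loads(encoded.decode("utf-8"))
```

The binary form is compact but requires both machines to agree on the layout; the JSON form carries its own structure at the cost of size.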

[0139] In certain embodiments, machine learning models can each be viewed as a form of application software that interfaces with a user and with system software. System software typically interfaces with computer hardware and associated memory. In certain embodiments, the system software includes operating system software and/or firmware, as well as any middleware and drivers installed in the system. The system software provides basic non-task-specific functions of the computer. In contrast, the modules and other application software are used to accomplish specific tasks. Each native instruction for a module is stored in a memory device and is represented by a numeric value.

[0140] An example computer system 800 is depicted in Figure 9. As shown, computer system 800 includes an input/output subsystem 802, which may implement an interface for interacting with human users and/or other computer systems depending upon the application. Embodiments of the invention may be implemented in program code on system 800, with I/O subsystem 802 used to receive input program statements and/or data from a human user (e.g., via a GUI or keyboard) and to display them back to the user. The I/O subsystem 802 may include, e.g., a keyboard, mouse, graphical user interface, touchscreen, or other interfaces for input, and, e.g., an LED or other flat screen display, or other interfaces for output. Other elements of embodiments of the disclosure may be implemented with a computer system like computer system 800, perhaps, however, without I/O.

[0141] Program code may be stored in non-transitory media such as persistent storage 810 or memory 808 or both. One or more processors 804 read program code from one or more non-transitory media and execute the code to enable the computer system to accomplish the methods performed by the embodiments herein, such as those involved with generating or using a process simulation model as described herein. Those skilled in the art will understand that the processor may accept source code, such as statements for executing training and/or modelling operations, and interpret or compile the source code into machine code that is understandable at the hardware gate level of the processor. A bus couples the I/O subsystem 802, the processor 804, peripheral devices 806, memory 808, and persistent storage 810.
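As a rough analogy in Python (interpretation to bytecode rather than compilation down to hardware gate level, and with a placeholder correction function rather than the method of the disclosure), source code read from storage can be compiled into executable form and run at runtime:

```python
# Source code as it might be read from persistent storage.
source = """
def apply_correction(raw_mass, systematic_error):
    return raw_mass - systematic_error
"""

# Compile the statements into a code object (bytecode), then execute
# that object to define the function in a fresh namespace.
namespace = {}
code_obj = compile(source, filename="<storage>", mode="exec")
exec(code_obj, namespace)

print(namespace["apply_correction"](10.0, 2.5))  # prints 7.5
```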

Conclusion

[0142] Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Embodiments disclosed herein may be practiced without some or all of the specific details given. In other instances, well-known process operations have not been described in detail so as not to unnecessarily obscure the disclosed embodiments. Further, while the disclosed embodiments have been described in conjunction with specific embodiments, it will be understood that the specific embodiments are not intended to limit the disclosed embodiments. It should be noted that there are many alternative ways of implementing the processes, systems, and apparatus of the present embodiments. Accordingly, the present embodiments are to be considered illustrative and not restrictive, and the embodiments are not to be limited to the details given herein.