Title:
OVERLAY ESTIMATION BASED ON OPTICAL INSPECTION AND MACHINE LEARNING
Document Type and Number:
WIPO Patent Application WO/2024/097023
Kind Code:
A1
Abstract:
One or more optical images of a portion of a semiconductor wafer are obtained. The one or more optical images show a first structure in a first process layer and a second structure in a second process layer. The one or more optical images are provided to a machine-learning model trained to estimate an overlay offset between the first structure and the second structure. An estimated overlay offset between the first structure and the second structure is obtained from the machine-learning model.

Inventors:
REDDY NIREEKSHAN K (IL)
JAYARAMAN ARVIND (CA)
PANDEV STILIAN IVANOV (US)
MANASSEN AMNON (IL)
OPHIR BOAZ (IL)
SHUSTERMAN UDI (IL)
GUTMAN NADAV (IL)
Application Number:
PCT/US2023/035459
Publication Date:
May 10, 2024
Filing Date:
October 19, 2023
Assignee:
KLA CORP (US)
International Classes:
G03F7/20; G03F7/00; G06T7/00; G06V20/00
Domestic Patent References:
WO2021262778A1 2021-12-30
WO2016086138A1 2016-06-02
Foreign References:
US20210035282A1 2021-02-04
US20220269184A1 2022-08-25
EP3693795A1 2020-08-12
Attorney, Agent or Firm:
MCANDREWS, Kevin et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A method, comprising: obtaining one or more optical images of a portion of a semiconductor wafer showing a first structure in a first process layer and a second structure in a second process layer; providing the one or more optical images to a machine-learning model trained to estimate an overlay offset between the first structure and the second structure; and obtaining an estimated overlay offset between the first structure and the second structure from the machine-learning model.

2. The method of claim 1, wherein obtaining the one or more optical images comprises optically inspecting the portion of the semiconductor wafer.

3. The method of claim 2, wherein: the one or more optical images comprise a plurality of optical images; and obtaining the plurality of optical images comprises optically inspecting the portion of the semiconductor wafer using a plurality of optical modes; each optical mode of the plurality of optical modes has a distinct set of optical conditions for inspecting the portion of the semiconductor wafer; and each optical image of the plurality of optical images is obtained using a respective optical mode of the plurality of optical modes.

4. The method of claim 1, wherein: obtaining the one or more optical images comprises obtaining a single optical image showing the first structure in the first process layer and the second structure in the second process layer; providing the one or more optical images to the machine-learning model comprises providing the single optical image to the machine-learning model; and obtaining the estimated overlay offset between the first structure and the second structure comprises using the machine-learning model to calculate the estimated overlay offset based on the single optical image.

5. The method of claim 1, wherein: obtaining the one or more optical images comprises obtaining a first optical image focused on the first process layer and a second optical image focused on the second process layer; providing the one or more optical images to the machine-learning model comprises providing the first optical image and the second optical image to the machine-learning model; and obtaining the estimated overlay offset between the first structure and the second structure comprises using the machine-learning model to calculate the estimated overlay offset based on the first optical image and the second optical image.

6. The method of claim 1, wherein the machine-learning model is a first machine-learning model of an ensemble of distinct machine-learning models trained to estimate an overlay offset between the first structure and the second structure, the method comprising: providing the one or more optical images to the ensemble of distinct machine-learning models; and obtaining a plurality of estimated overlay offsets between the first structure and the second structure from the ensemble of distinct machine-learning models, wherein each estimated overlay offset of the plurality of estimated overlay offsets is from a respective machine-learning model of the ensemble of distinct machine-learning models.

7. The method of claim 6, further comprising determining an overlay offset between the first structure and the second structure based at least in part on the plurality of estimated overlay offsets.

8. The method of claim 6, wherein the ensemble of machine-learning models is a plurality of distinct machine-learning models of the same type.

9. The method of claim 8, wherein the type is selected from the group consisting of neural-network models, gradient-boosting models, support vector machines, and principal-component-regression models.

10. The method of claim 1, wherein: the machine-learning model is a first machine-learning model; the one or more optical images further show a third structure in a third process layer; and the method comprises: providing the one or more optical images to the first machine-learning model and to a second machine-learning model trained to estimate an overlay offset between the second structure and the third structure; and obtaining an estimated overlay offset between the second structure and the third structure from the second machine-learning model.

11. The method of claim 1, wherein the one or more optical images are a first set of one or more optical images of a plurality of sets of one or more optical images of portions of semiconductor wafers, each set of the plurality of sets showing the first structure and the second structure, the method further comprising: obtaining the plurality of sets from multiple optical inspection tools, wherein each set of the plurality of sets is generated by a respective optical inspection tool of the multiple optical inspection tools; providing the plurality of sets to the machine-learning model; and obtaining a respective estimated overlay offset between the first structure and the second structure for each set of the plurality of sets from the machine-learning model, comprising using the machine-learning model to calculate the respective estimated overlay offset.

12. The method of claim 11, wherein: the first set of one or more optical images is obtained from a first optical inspection tool of the multiple optical inspection tools; the method further comprises specifying to the machine-learning model that the first set is from the first optical inspection tool; and obtaining the respective estimated overlay offset between the first structure and the second structure for the first set comprises using the machine-learning model to calculate the respective estimated overlay offset based on the first set and based further on the first set being from the first optical inspection tool.

13. The method of claim 11, further comprising, before providing the plurality of sets to the machine-learning model, training the machine learning model using a training set of optical images from the multiple inspection tools, wherein: the training set of optical images is different from the plurality of sets of one or more optical images; and the optical images of the training set are annotated with respective predetermined overlay offset values.

14. The method of claim 13, wherein the respective predetermined overlay offset values for the optical images of the training set are measured using scanning electron microscopy.

15. The method of claim 1, further comprising, before providing the one or more optical images to the machine-learning model, training the machine learning model using a training set of optical images, wherein: the training set of optical images is different from the one or more optical images; and the optical images of the training set are annotated with respective predetermined overlay offset values.

16. The method of claim 15, wherein the respective predetermined overlay offset values for the optical images of the training set are measured using scanning electron microscopy.

17. A non-transitory computer-readable storage medium storing one or more programs for execution by one or more processors, the one or more programs comprising instructions for: obtaining one or more optical images of a portion of a semiconductor wafer showing a first structure in a first process layer and a second structure in a second process layer; providing the one or more optical images to a machine-learning model trained to estimate an overlay offset between the first structure and the second structure; and obtaining an estimated overlay offset between the first structure and the second structure from the machine-learning model.

18. A system, comprising: one or more optical inspection tools to optically inspect semiconductor wafers; one or more processors; and memory storing one or more programs for execution by the one or more processors, the one or more programs comprising instructions for: obtaining, from the one or more optical inspection tools, one or more optical images of a portion of a semiconductor wafer showing a first structure in a first process layer and a second structure in a second process layer; providing the one or more optical images to a machine-learning model trained to estimate an overlay offset between the first structure and the second structure; and obtaining an estimated overlay offset between the first structure and the second structure from the machine-learning model.

Description:
Overlay Estimation Based on Optical Inspection and Machine Learning

RELATED APPLICATION

[0001] This application claims priority to U.S. Provisional Patent Application No. 63/420,683, filed on October 31, 2022, which is incorporated by reference in its entirety for all purposes.

TECHNICAL FIELD

[0002] This disclosure relates to measuring overlay (i.e., overlay offsets) for semiconductor devices, and more specifically to measuring overlay offsets using optical inspection and machine learning.

BACKGROUND

[0003] In semiconductor fabrication, misalignment between process layers causes unwanted shifts in the positions of structures in one process layer as compared to the positions of structures in another process layer. Such a shift (i.e., displacement) is referred to as overlay, or equivalently, as an overlay offset. Measuring these shifts is referred to as overlay metrology. Accurate overlay metrology is important for measuring process drift and thus for establishing and maintaining process control, especially during the ramp-up period for a new semiconductor process and/or device. Accurate overlay metrology becomes increasingly important as feature sizes (e.g., line widths) on semiconductor wafers shrink, because the reduction in feature sizes reduces the available overlay budget accordingly: less overlay error can be accommodated as features shrink.

[0004] Scanning electron microscopes (SEMs) are useful tools for overlay metrology because they have high resolution and can image a structure regardless of its shape or size. Scanning electron microscopy is slow, however, and thus has low throughput. Accordingly, it is desirable to use optical inspection for overlay metrology. But optical inspection does not provide the accuracy that SEMs provide for overlay metrology. For example, optical inspection may be unable to resolve the structures used in overlay metrology, because the size of the structures is below the diffraction limit. Overlay thus cannot typically be measured simply by looking at images generated by optical inspection, or by applying object-recognition algorithms to such images. Optical images still contain information about overlay encoded within them, however, and algorithms exist to extract this information from the images. But the accuracy of measurements taken using these classical algorithms is limited.

SUMMARY

[0005] To improve the accuracy of overlay metrology based on optical inspection, machine learning is used to analyze optical images of semiconductor structures.

[0006] In some embodiments, a method includes obtaining one or more optical images of a portion of a semiconductor wafer showing a first structure in a first process layer and a second structure in a second process layer. The one or more optical images are provided to a machine-learning model trained to estimate an overlay offset between the first structure and the second structure. An estimated overlay offset between the first structure and the second structure is obtained from the machine-learning model.

[0007] In some embodiments, a non-transitory computer-readable storage medium stores one or more programs for execution by one or more processors. The one or more programs include instructions for performing the above method.

[0008] In some embodiments, a system includes one or more optical inspection tools to optically inspect semiconductor wafers, one or more processors, and memory storing one or more programs for execution by the one or more processors. The one or more programs include instructions for performing the above method, with the one or more optical images being obtained from the one or more optical inspection tools.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] For a better understanding of the various described implementations, reference should be made to the Detailed Description below, in conjunction with the following drawings.

[0010] Figure 1 is a cross-sectional side view of a portion of a semiconductor wafer that is to be subjected to optical imaging to perform overlay metrology and that includes a first structure in a first process layer and a second structure in a second process layer, in accordance with some embodiments.

[0011] Figure 2 is a cross-sectional side view of a portion of a semiconductor wafer that is to be subjected to optical imaging to perform overlay metrology and that includes a first structure in a first process layer, a second structure in a second process layer, and a third structure in a third process layer, in accordance with some embodiments.

[0012] Figure 3 is a flowchart showing a method of measuring an overlay offset using machine learning, in accordance with some embodiments.

[0013] Figure 4 is a flowchart showing a method of measuring overlay offsets for optical images from multiple optical inspection tools using machine learning, in accordance with some embodiments.

[0014] Figure 5 is a flowchart showing a method of measuring an overlay offset using an ensemble of machine-learning models, in accordance with some embodiments.

[0015] Figure 6 is a block diagram of a semiconductor inspection system in accordance with some embodiments.

[0016] Like reference numerals refer to corresponding parts throughout the drawings and specification.

DETAILED DESCRIPTION

[0017] Reference will now be made in detail to various embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the various described embodiments. However, it will be apparent to one of ordinary skill in the art that the various described embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.

[0018] Figure 1 is a cross-sectional side view of a portion 100 of a semiconductor wafer to be subjected to optical imaging to perform overlay metrology, in accordance with some embodiments. The semiconductor wafer may be fully or partially processed. The portion 100 includes a first structure 104 in a first process layer and a second structure 110 in a second process layer. In the example of Figure 1, the first structure 104 is a first grating that includes a plurality of grating elements 106, and the second structure 110 is a second grating that includes a plurality of grating elements 112. The plurality of grating elements 106 may be arranged periodically in the first structure 104, with successive grating elements 106 being separated by the same distance and thus having the same pitch. Similarly, the plurality of grating elements 112 may be arranged periodically in the second structure 110, with successive grating elements 112 being separated by the same distance and thus having the same pitch. The pitch of the grating elements 112 in the second structure 110 may be the same as the pitch of the grating elements 106 in the first structure 104.

[0019] The first process layer, in which the first structure 104 is situated, is on a substrate 102 of the semiconductor wafer. Alternatively, the first process layer may be above other process layers. The grating elements 106 of the first structure 104 are separated from each other by a fill material (e.g., a dielectric). The grating elements 112 of the second structure 110 may be separated from each other by a fill material (e.g., a dielectric) or by air, depending on how far processing of the semiconductor wafer has proceeded. A fill layer 108 (e.g., an inter-layer dielectric) separates the second process layer from the first process layer. In some embodiments, the grating elements 106 and/or the grating elements 112 are metal lines. In the nomenclature of overlay metrology, the second process layer, which is above the first process layer, is referred to as the “outer layer” and the first process layer, which is below the second process layer, is referred to as the “inner layer.”

[0020] Figure 1 shows both intended and actual positioning of the grating elements 112 in the second structure 110 with respect to the grating elements 106 of the first structure 104. The intended positioning is shown with solid lines and a fill pattern, while the actual positioning is shown with dashed lines. There is an intended (i.e., programmed) overlay offset 114 between the second structure 110 and the first structure 104. The intended overlay offset 114 is the overlay offset that would exist if the grating elements 112 in the second structure 110 were fabricated exactly where they were intended to be fabricated with respect to the grating elements 106 in the first structure 104. In the example of Figure 1, the intended overlay offset 114 is non-zero. Alternatively, the intended overlay offset may be zero.

[0021] In practice, due to imprecision and variability in photolithographic processing, neither the grating elements 112 in the second structure 110 nor the grating elements 106 in the first structure 104 can generally be fabricated exactly where intended. The difference between the actual and intended relative positions is an additional, unintended (i.e., erroneous) overlay offset 116, which may be positive or negative. A total overlay offset 118 equals the sum of the intended overlay offset 114 and the unintended overlay offset 116.
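The relationship among the three offsets of Figure 1 can be written out as simple arithmetic. The following is a minimal sketch only; the function names, the nanometer units, and the numeric values are illustrative assumptions and do not come from the disclosure.

```python
def total_overlay_offset(intended_nm: float, unintended_nm: float) -> float:
    """Total offset (118) = intended offset (114) + unintended offset (116)."""
    return intended_nm + unintended_nm


def unintended_overlay_offset(total_nm: float, intended_nm: float) -> float:
    """Recover the unintended offset from a total offset and the programmed value."""
    return total_nm - intended_nm


# Hypothetical values: a programmed 10.0 nm offset with a -1.5 nm fabrication error.
total = total_overlay_offset(10.0, -1.5)
print(total)                                   # 8.5
print(unintended_overlay_offset(total, 10.0))  # -1.5
```

Because the unintended offset may be positive or negative, the total offset may be larger or smaller than the programmed value.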

[0022] The portion 100 may be a test pattern on the semiconductor wafer. The test pattern may be separate from the dies on the semiconductor wafer (e.g., may be in a scribe line) or may be within a die on the semiconductor wafer. The semiconductor wafer may have multiple test patterns like the portion 100 at various locations, such that overlay metrology may be performed at multiple locations on the wafer. Alternatively, the portion 100 may be part of the circuitry of the die itself.

[0023] Figure 2 is a cross-sectional side view of a portion 200 of a semiconductor wafer to be subjected to optical imaging to perform overlay metrology, in accordance with some embodiments. The semiconductor wafer may be fully or partially processed. The portion 200 includes the first structure 104 in the first process layer, the second structure 110 in the second process layer, and a third structure 202 in a third process layer. In the example of Figure 2, the third structure 202 is a third grating that includes a plurality of grating elements 204. The plurality of grating elements 204 may be arranged periodically in the third structure 202, with successive grating elements 204 being separated by the same distance and thus having the same pitch, which is the same pitch as the pitch of the grating elements 106 in the first structure 104 and the pitch of the grating elements 112 in the second structure 110.

[0024] The grating elements 204 of the third structure 202 may be separated from each other by a fill material (e.g., a dielectric) or by air, depending on how far processing of the semiconductor wafer has proceeded. The grating elements 112 of the second structure 110 in the portion 200 are separated from each other by a fill material (e.g., a dielectric). A fill layer 206 (e.g., an inter-layer dielectric) separates the third process layer from the second process layer. In some embodiments, the grating elements 204 are metal lines (e.g., as are the grating elements 106 and/or the grating elements 112).

[0025] In addition to showing both intended and actual positioning of the grating elements 112 in the second structure 110, Figure 2 shows both intended and actual positioning of the grating elements 204 in the third structure 202. The intended positioning is shown with solid lines and a fill pattern, while the actual positioning is shown with dashed lines. There is an intended (i.e., programmed) overlay offset 208 between the third structure 202 and the second structure 110. The intended overlay offset 208 is the overlay offset that would exist if the grating elements 204 in the third structure 202 were fabricated exactly where they were intended to be fabricated with respect to the grating elements 112 in the second structure 110. In the example of Figure 2, the intended overlay offset 208 is non-zero. Alternatively, the intended overlay offset may be zero. But because neither the grating elements 204 in the third structure 202 nor the grating elements 112 in the second structure 110 can generally be fabricated exactly where intended, an actual (i.e., total) overlay offset 210 exists that is different from the intended overlay offset 208. The difference between the total overlay offset 210 and the intended overlay offset 208 is an unintended (i.e., erroneous) overlay offset, which may be positive or negative.

[0026] When using the portion 200 to perform overlay metrology between the third structure 202 and the second structure 110, the third process layer is the outer layer and the second process layer is the inner layer, with the third process layer being above the second process layer. When using the portion 200 to perform overlay metrology between the second structure 110 and the first structure 104, the second process layer is the outer layer and the first process layer is the inner layer. The portion 200 may also be used to perform overlay metrology between the third structure 202 and the first structure 104, with the third process layer being the outer layer and the first process layer being the inner layer.

[0027] The portion 200 may be a test pattern on the semiconductor wafer, as described for the portion 100 (Figure 1), or may be part of the circuitry of the die itself.

[0028] Figure 3 is a flowchart showing a method 300 of measuring an overlay offset using machine learning, in accordance with some embodiments. The method 300 may be performed by one or more computer systems (e.g., the computer system of the semiconductor inspection system 600, Figure 6).

[0029] In some embodiments, the method 300 includes training (302) a machine learning model to estimate an overlay offset between a first structure in a first process layer and a second structure in a second process layer, using a training set of optical images showing the first structure and the second structure. For example, the first structure is the first structure 104 (Figure 1) and the second structure is the second structure 110 (Figure 1). In other examples, the first structure and the second structure are respective components of another periodic grating or of an aperiodic grating, either of which may be a test pattern on semiconductor wafers (e.g., as implemented in scribe lines). In yet other examples, the first structure and the second structure are respective components of a different test pattern (e.g., box-in-box), which may be implemented in scribe lines. In still other examples, the first structure and the second structure are respective components of circuitry on semiconductor die as wholly or partially fabricated on semiconductor wafers.

[0030] The optical images of the training set are annotated with respective predetermined overlay offset values (e.g., total overlay offset values and/or unintentional overlay offset values): the predetermined overlay offset values are stored in association with their respective optical images. The respective predetermined overlay offset values for the optical images of the training set have been previously measured using, for example, scanning electron microscopy: the portions of semiconductor wafers that were optically inspected to generate the optical images in the training set are also inspected using a scanning electron microscope (SEM) (e.g., SEM 634, Figure 6), or using multiple SEMs, and the overlay offset values are measured from the resulting SEM images. These predetermined overlay offset values serve as ground truth for the training 302. Examples of the machine-learning model include, without limitation, a neural-network model, a gradient-boosting model, a support vector machine, and a principal-component-regression model.
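One way to picture storing the predetermined overlay offset values "in association with" their optical images is sketched below. This is an illustrative assumption, not a structure described in the disclosure: the site identifiers, the dictionary layout, and the `annotate` helper are all hypothetical, and the pixel and offset values are made up.

```python
def annotate(optical_images, sem_offsets_nm):
    """Pair each optical image with the SEM-measured offset for the same site.

    Both arguments are dicts keyed by a hypothetical site identifier.
    Sites without an SEM measurement are skipped because they lack ground truth.
    """
    return [
        (optical_images[site], sem_offsets_nm[site])
        for site in sorted(optical_images)
        if site in sem_offsets_nm
    ]


# Made-up data: three optically inspected sites, two of which were also measured by SEM.
images = {"site_a": [[0.1, 0.9]], "site_b": [[0.2, 0.8]], "site_c": [[0.3, 0.7]]}
sem = {"site_a": 1.2, "site_c": -0.4}

training_set = annotate(images, sem)
print(len(training_set))  # 2
```

Each element of `training_set` is an (image, ground-truth offset) pair of the kind the training 302 consumes.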

[0031] To perform the training 302, the training set of optical images is obtained along with the associated predetermined overlay offset values. The machine-learning model is initialized (e.g., in a manner involving randomization). Optical images of the training set are provided as input to the machine-learning model, and the resulting output of the machine-learning model is compared to the predetermined overlay offset values for respective images, which are the expected output. The optical images of the training set may be provided directly to the machine-learning model, or feature extraction may be performed and the extracted features (e.g., in the form of feature vectors for respective optical images) provided to the machine-learning model. The machine-learning model is adjusted based on differences between the resulting output (i.e., the actual output) and the expected output, and the training process continues until the resulting output converges with the expected output. Once this convergence occurs, the machine-learning model may be tested and then deployed for use in semiconductor manufacturing. The machine-learning model may be deployed to one or more semiconductor fabrication facilities, referred to as fabs, or to a computer system communicatively coupled with one or more fabs.
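The initialize, compare, adjust, and repeat-until-convergence loop described above can be sketched with a deliberately simple stand-in. The sketch assumes a linear model on precomputed feature vectors trained by batch gradient descent; the disclosure names neural-network, gradient-boosting, support-vector, and principal-component-regression models and does not prescribe this one. All function names, hyperparameters, and data below are hypothetical.

```python
import random


def train_linear_model(features, targets, lr=0.1, tol=1e-10, max_epochs=20000, seed=0):
    """Fit offset ~ w . x + b by batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    n = len(features)
    n_features = len(features[0])
    # Randomized initialization of the model parameters.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    bias = 0.0
    for _ in range(max_epochs):
        # Actual output of the model for every training example.
        outputs = [sum(w * x for w, x in zip(weights, row)) + bias for row in features]
        # Differences between actual output and expected (ground-truth) output.
        errors = [out - expected for out, expected in zip(outputs, targets)]
        mse = sum(e * e for e in errors) / n
        if mse < tol:  # actual output has converged with the expected output
            break
        # Adjust the model based on those differences.
        for j in range(n_features):
            grad = 2.0 * sum(e * row[j] for e, row in zip(errors, features)) / n
            weights[j] -= lr * grad
        bias -= lr * 2.0 * sum(errors) / n
    return weights, bias


def predict(weights, bias, row):
    """Estimated overlay offset for one feature vector."""
    return sum(w * x for w, x in zip(weights, row)) + bias


# Made-up training set: one feature per image; labels follow offset = 2.0 * x + 0.5
# and stand in for SEM-measured ground truth.
train_x = [[0.0], [0.5], [1.0], [1.5], [2.0]]
train_y = [2.0 * row[0] + 0.5 for row in train_x]

w, b = train_linear_model(train_x, train_y)
held_out_estimate = predict(w, b, [1.25])  # a feature value not in the training set
```

Evaluating `predict` on an input that was not part of the training set mirrors the testing step that precedes deployment.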

[0032] Alternatively, training the machine-learning model is omitted from the method 300. For example, the method 300 is performed using a machine-learning model that has already been trained to estimate overlay offsets between the first structure and the second structure and has been deployed for use in semiconductor manufacturing. Again, the first structure and the second structure may respectively be the first structure 104 (Figure 1) and the second structure 110 (Figure 1); may be respective components of another periodic grating or of an aperiodic grating, either of which may be a test pattern on semiconductor wafers (e.g., as implemented in scribe lines); may be respective components of a different test pattern (e.g., box-in-box), which may be implemented in scribe lines; or may be respective components of circuitry on semiconductor die as wholly or partially fabricated on semiconductor wafers.

[0033] One or more optical images of a portion of a semiconductor wafer are obtained (304) showing the first structure in the first process layer and the second structure in the second process layer (i.e., showing a particular instance of the first and second structures at a particular location in a particular semiconductor die). The one or more optical images are different from the training set of optical images. For example, none of the one or more optical images were included in the training set. The instance of the first and second structures in the one or more optical images thus may be different (e.g., may be at a different location, on a different semiconductor die, and/or on a different semiconductor wafer) from the instances of the first and second structures in respective optical images of the training set. In some embodiments, obtaining the one or more optical images includes optically inspecting (306) the portion of the semiconductor wafer (e.g., using an optical inspection tool 632, Figure 6). Alternatively, the one or more optical images are obtained from a computer memory (e.g., database) storing optical images previously generated by optical inspection.

[0034] The one or more optical images are provided (314) to the machine-learning model (e.g., a neural-network model, gradient-boosting model, support vector machine, or principal-component-regression model). In some embodiments, the one or more optical images are provided to the machine-learning model directly (e.g., in accordance with deep learning), such that the machine-learning model receives the one or more optical images as input. Alternatively, the one or more optical images are provided to the machine-learning model indirectly. For example, feature extraction is performed on the one or more images, and the extracted features (e.g., in the form of a feature vector) are provided as input to the machine-learning model.
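The difference between providing an image directly and providing it indirectly through feature extraction can be illustrated as follows. The column-average intensity profile used here is only an assumed example of a feature vector; the disclosure does not specify which features are extracted, and the function names and pixel values are hypothetical.

```python
def column_profile(image):
    """Average each column of a 2-D grayscale image into one value per column."""
    n_rows = len(image)
    return [sum(row[c] for row in image) / n_rows for c in range(len(image[0]))]


def model_input(image, use_feature_extraction):
    """Return what is handed to the model: raw pixels or an extracted feature vector."""
    if use_feature_extraction:
        return column_profile(image)                   # indirect provision
    return [pixel for row in image for pixel in row]   # direct provision (flattened pixels)


# Made-up 2x4 image of a grating-like pattern.
image = [
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
]
print(model_input(image, use_feature_extraction=True))   # [0.0, 1.0, 0.0, 1.0]
print(model_input(image, use_feature_extraction=False))  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

Whichever path is chosen at inference would need to match the path used when the model was trained.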

[0035] An estimated overlay offset between the first structure and the second structure (e.g., for the particular instance of the first and second structures) is obtained (322) from the machine-learning model: the machine-learning model determines (i.e., calculates) the estimated overlay offset using the one or more optical images and provides the estimated overlay offset as output. The estimated overlay offset may be an estimated total overlay offset (e.g., total overlay offset 118, Figure 1) or an estimated unintended overlay offset (e.g., unintended overlay offset 116, Figure 1). The estimated overlay offset may be received from the machine-learning model in response to providing the one or more optical images to the machine-learning model. The machine-learning model may determine the estimated overlay offset based only on the one or more optical images, or based on the one or more optical images and based further on additional data (e.g., metadata for the one or more optical images and/or additional metrology data for the first and second structures).

[0036] In some embodiments, the one or more optical images are a single optical image showing the first structure in the first process layer and the second structure in the second process layer. The single optical image is obtained (308) and provided (316) to the machine-learning model, directly or indirectly. The machine-learning model is used to calculate (324) the estimated overlay offset based on the single optical image.

[0037] In some other embodiments, the one or more optical images include a first optical image focused on the first process layer and a second optical image focused on the second process layer. The first and second optical images are obtained (310) and provided (318) to the machine-learning model, directly or indirectly. The first and second optical images may be obtained, for example, using respective first and second cameras in an optical inspection system, with the first and second cameras having a known alignment. A stitching process may be performed on the first and second images to align them in accordance with the known alignment. The machine-learning model is used to calculate (326) the estimated overlay offset based on the first optical image and the second optical image.
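A minimal sketch of aligning two per-layer images by a known camera alignment is given below. It assumes the alignment is a pure integer-pixel column shift and that the aligned images are paired pixel by pixel before being handed to the model; the actual stitching process is not detailed in the disclosure, so the function name, the `dx` parameter, and the pixel values are hypothetical.

```python
def align_and_stack(first, second, dx):
    """Pair pixels of `first` with the pixels of `second` that image the same location.

    `dx` is the column of `second` that lines up with column 0 of `first`
    (assumed non-negative). Only the overlapping region is returned, as rows of
    (first_value, second_value) pairs.
    """
    width = min(len(first[0]), len(second[0]) - dx)
    return [
        [(row_a[c], row_b[c + dx]) for c in range(width)]
        for row_a, row_b in zip(first, second)
    ]


# Made-up images: `first` is focused on the first process layer, `second` on the
# second process layer, with the second camera offset by one column.
first = [[1, 2, 3], [4, 5, 6]]
second = [[0, 10, 20, 30], [0, 40, 50, 60]]

stacked = align_and_stack(first, second, dx=1)
print(stacked[0])  # [(1, 10), (2, 20), (3, 30)]
```

A real stitching step would likely also handle sub-pixel shifts, rotation, and magnification differences, none of which this sketch attempts.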

[0038] In some embodiments, the one or more optical images include a plurality of optical images obtained (312) by optically inspecting the portion of the semiconductor wafer using a plurality of optical modes. Each optical mode of the plurality of optical modes has a distinct set of optical conditions for inspecting the portion of the semiconductor wafer, with one or more of the optical conditions in the set being different from the other sets of optical conditions for the plurality of optical modes. Examples of optical conditions include, without limitation, the light source, range of wavelengths, polarization, focus, transmission distribution in the illumination aperture, transmission distribution in the collection aperture, and phase-shift distribution in the collection aperture. Respective optical modes may also or alternatively correspond to respective Mueller elements from an imaging Mueller ellipsometer and/or respective harmonics signals from an imaging ellipsometer. Each optical image of the plurality of optical images is obtained using a respective optical mode of the plurality of optical modes. Each optical mode may be used to generate a single optical image showing both the first and second structures or to generate multiple optical images (e.g., a first image focused on the first process layer and a second image focused on the second process layer). The plurality of optical images is provided (320) to the machine-learning model, directly or indirectly. The machine-learning model is used to calculate (328) the estimated overlay offset based on the plurality of optical images.
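By way of a hedged sketch, a plurality of optical modes and the images obtained with them might be represented as shown below. The OpticalMode fields cover only a few of the optical conditions listed above, and the field names, units, and the idea of presenting images in a fixed mode order are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class OpticalMode:
    """A hypothetical subset of the optical conditions defining a mode."""
    light_source: str
    wavelength_range_nm: Tuple[float, float]
    polarization: str
    focus_um: float


def assemble_mode_images(
    images_by_mode: Dict[OpticalMode, list], mode_order: List[OpticalMode]
) -> list:
    """Return one image per mode, in a fixed order.

    Raises ValueError if two modes share an identical set of conditions,
    and KeyError if no image was acquired for a requested mode.
    """
    if len(set(mode_order)) != len(mode_order):
        raise ValueError("each optical mode must have a distinct set of conditions")
    missing = [mode for mode in mode_order if mode not in images_by_mode]
    if missing:
        raise KeyError(f"no image acquired for modes: {missing}")
    return [images_by_mode[mode] for mode in mode_order]
```

Two modes that differ only in polarization are distinct under this representation, consistent with the requirement that at least one optical condition differ between modes.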

[0039] The estimated overlay offset obtained in the method 300 may be used directly for monitoring and controlling a semiconductor fabrication process (e.g., for statistical process control). Alternatively, the estimated overlay offset may be one of multiple inputs subsequently used to determine an overlay offset between the first and second structures. For example, the estimated overlay offset may be averaged or otherwise combined with one or more other overlay-offset values calculated for the first and second structures using different techniques (e.g., techniques not involving machine learning).
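One possible use of estimated overlay offsets for statistical process control is sketched below. The sketch assumes conventional control limits at the baseline mean plus or minus three sample standard deviations; this limit rule, the function names, and the example values are illustrative assumptions and are not specified in this description.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(
    baseline_offsets: Sequence[float], n_sigma: float = 3.0
) -> Tuple[float, float]:
    """Lower and upper control limits from in-control baseline offsets."""
    center = mean(baseline_offsets)
    spread = stdev(baseline_offsets)  # sample standard deviation
    return center - n_sigma * spread, center + n_sigma * spread


def out_of_control(
    offsets: Sequence[float], limits: Tuple[float, float]
) -> List[int]:
    """Indices of estimated overlay offsets that fall outside the limits."""
    lower, upper = limits
    return [i for i, value in enumerate(offsets) if not lower <= value <= upper]


# Hypothetical offsets, in nanometers, from a period of stable production.
limits = control_limits([0.0, 1.0, -1.0, 0.5, -0.5])
flagged = out_of_control([0.2, 3.0, -0.4, -2.5], limits)
```

In this example the second and fourth estimates lie outside the limits, so flagged holds their indices and the corresponding measurements could prompt a review of process parameters.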

[0040] The method 300 may be extended to determine an overlay offset between the second structure (e.g., the second structure 110, Figure 2) and a third structure in a third process layer (e.g., the third structure 202, Figure 2). For example, the one or more optical images further show the third structure in the third process layer (i.e., show a particular instance of the third structure). The machine-learning model is a first machine-learning model and the method 300 further includes a second machine-learning model (e.g., a neural-network model, gradient-boosting model, support vector machine, or principal-component-regression model) trained to estimate an overlay offset between the second structure and the third structure. The second machine-learning model may be trained as part of the method 300, by analogy to the training 302, or may have been previously trained and deployed for use in semiconductor manufacturing. The one or more optical images are provided, directly or indirectly, to the first machine-learning model, per step 314, and to the second machine-learning model. In an example, the one or more optical images are a single optical image showing the first structure, the second structure, and the third structure. The single optical image is provided to the first machine-learning model, per step 316, and also to the second machine-learning model. In another example, the one or more optical images include a first optical image focused on the first process layer, a second optical image focused on the second process layer, and a third optical image focused on the third process layer. The first and second optical images are provided to the first machine-learning model, per step 318, and the second and third optical images are provided to the second machine-learning model. Other examples are possible. An estimated overlay offset between the second structure and the third structure (e.g., for the particular instance of the second and third structures) is obtained from the second machine-learning model: the second machine-learning model is used to calculate the estimated overlay offset between the second structure and the third structure. The estimated overlay offset may be a total overlay offset (e.g., total overlay offset 210, Figure 2) or an unintended overlay offset (e.g., the difference between the total overlay offset 210 and the intended overlay offset 208, Figure 2).

[0041] In addition to or instead of using the second machine-learning model to obtain the estimated overlay offset between the second structure and the third structure, a third machine-learning model (e.g., a neural-network model, gradient-boosting model, support vector machine, or principal-component-regression model) may be used to obtain an estimated overlay offset between the first structure (e.g., the first structure 104, Figure 2) and the third structure (e.g., the third structure 202, Figure 2). The third machine-learning model may be trained as part of the method 300, by analogy to the training 302, or may have been previously trained and deployed for use in semiconductor manufacturing. The one or more optical images are provided, directly or indirectly, to the first machine-learning model, per step 314, and to the third machine-learning model (and also to the second machine-learning model, in accordance with some embodiments). In an example, the one or more optical images are a single optical image showing the first structure, the second structure, and the third structure. The single optical image is provided to the first machine-learning model, per step 316, as well as to the third machine-learning model (and also to the second machine-learning model, in accordance with some embodiments). In another example, the one or more optical images include a first optical image focused on the first process layer, a second optical image focused on the second process layer, and a third optical image focused on the third process layer. The first and second optical images are provided to the first machine-learning model, per step 318, and the first and third optical images are provided to the third machine-learning model (and the second and third optical images are provided to the second machine-learning model, in accordance with some embodiments). The estimated overlay offset between the first structure and the third structure (e.g., for the particular instance of the first and third structures) is obtained from the third machine-learning model: the third machine-learning model is used to calculate the estimated overlay offset between the first structure and the third structure. The estimated overlay offset may be a total overlay offset or an unintended overlay offset.

[0042] The first, second, and/or third machine-learning models may be combined into a single machine-learning model with multiple outputs. The single machine-learning model is trained to estimate overlay offsets between the first and second structures, between the second and third structures, and/or between the first and third structures. The multiple outputs include a first output to provide the estimated overlay offset between the first and second structures, a second output to provide the estimated overlay offset between the second and third structures, and a third output to provide the estimated overlay offset between the first and third structures.

[0043] The first, second, and third structures may respectively be the first structure 104 (Figure 2), the second structure 110 (Figure 2), and the third structure 202 (Figure 2); may be respective components of another periodic grating or of an aperiodic grating, either of which may be a test pattern on semiconductor wafers (e.g., as implemented in scribe lines); may be respective components of a different test pattern (e.g., box-in-box), which may be implemented in scribe lines; or may be respective components of circuitry on semiconductor die as wholly or partially fabricated on semiconductor wafers.

[0044] Figure 4 is a flowchart showing a method 400 of measuring overlay offsets for optical images from multiple optical inspection tools, in accordance with some embodiments. The method 400 may be performed by one or more computer systems (e.g., the computer system of the semiconductor inspection system 600, Figure 6). The method 400 may include the method 300 (Figure 3), which may be performed as part of the method 400.

[0045] In some embodiments, a machine-learning model is trained (402) to estimate an overlay offset between a first structure in a first process layer (e.g., the first structure of the method 300, Figure 3) and a second structure in a second process layer (e.g., the second structure of the method 300, Figure 3), using a training set of optical images showing the first structure and the second structure. The optical images of the training set are annotated with respective predetermined overlay offset values (e.g., total overlay offset values and/or unintentional overlay offset values): the predetermined overlay offset values are stored in association with their respective optical images. The respective predetermined overlay offset values for the optical images of the training set are measured, for example, using scanning electron microscopy (e.g., as described for the training 302 in the method 300, Figure 3). Examples of the machine-learning model include, without limitation, a neural-network model, a gradient-boosting model, a support vector machine, and a principal-component-regression model. The training set of optical images may be obtained from a plurality of optical inspection tools: respective optical inspection tools of the plurality of optical inspection tools may generate respective optical images of the training set by optically inspecting respective semiconductor wafers. The training 402 may be performed as described for the training 302 in the method 300 (Figure 3).
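The training on annotated optical images can be illustrated with a deliberately simple stand-in. The sketch below uses a one-nearest-neighbor regressor, which is not one of the model types named above and is chosen only because it is short and deterministic; the class name, the fit and predict methods, and the tiny 2x2 images are hypothetical.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]


def flatten(image: Image) -> List[float]:
    """Flatten a 2-D image into a single feature vector."""
    return [pixel for row in image for pixel in row]


class NearestNeighborOverlayModel:
    """Stand-in regressor: returns the annotated overlay offset of the
    training image most similar to the query image."""

    def __init__(self) -> None:
        self._examples: List[Tuple[List[float], float]] = []

    def fit(
        self, annotated: Sequence[Tuple[Image, float]]
    ) -> "NearestNeighborOverlayModel":
        """Store each training image with its predetermined overlay offset."""
        if not annotated:
            raise ValueError("training set is empty")
        self._examples = [(flatten(img), offset) for img, offset in annotated]
        return self

    def predict(self, image: Image) -> float:
        """Estimate the overlay offset for an image outside the training set."""
        if not self._examples:
            raise RuntimeError("model has not been trained")
        query = flatten(image)

        def squared_distance(example: Tuple[List[float], float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(example[0], query))

        return min(self._examples, key=squared_distance)[1]


# Hypothetical training set: (optical image, SEM-measured offset in nm).
training_set = [
    ([[1.0, 0.0], [0.0, 0.0]], -2.0),
    ([[0.0, 1.0], [0.0, 0.0]], 0.0),
    ([[0.0, 0.0], [1.0, 0.0]], 2.0),
]
model = NearestNeighborOverlayModel().fit(training_set)
estimate = model.predict([[0.1, 0.9], [0.0, 0.0]])
```

A production model of one of the named types would replace the stand-in while keeping the same pairing of each training image with its predetermined overlay offset value.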

[0046] Alternatively, training the machine-learning model is omitted from the method 400. For example, the method 400 is performed using a machine-learning model that has already been trained to estimate overlay offsets between the first structure and the second structure (e.g., the first and second structures of the method 300, Figure 3) and has been deployed for use in semiconductor manufacturing.

[0047] A plurality of sets of one or more optical images of portions of semiconductor wafers is obtained (404) from multiple optical inspection tools (e.g., optical inspection tools 632, Figure 6). Each set shows the first structure and the second structure (e.g., shows a respective instance of the first and second structures at a particular location in a particular semiconductor die on a particular semiconductor wafer). Each set is generated by a respective optical inspection tool of the multiple optical inspection tools. The one or more optical images are different from the training set of optical images. For example, none of the one or more optical images were included in the training set. The respective instances of the first and second structures for the plurality of sets thus may be different (e.g., may be at different locations, on different semiconductor die, and/or on different semiconductor wafers) from each other and from the instances of the first and second structures in respective optical images of the training set. The multiple inspection tools may be the same as, or a subset or superset of, the plurality of inspection tools from which the training set was obtained. The one or more optical images of the method 300 (Figure 3) may be a respective set of the plurality of sets.

[0048] In some embodiments, obtaining the one or more optical images includes optically inspecting (406) the portions of the semiconductor wafers using the multiple optical inspection tools. For example, if the multiple inspection tools include two tools, portions of some of the semiconductor wafers are inspected using the first tool to obtain some of the sets and portions of the remaining semiconductor wafers are inspected using the second tool to obtain the remaining sets. If the multiple inspection tools include three or more tools, portions of some of the semiconductor wafers are inspected using the first tool to obtain some of the sets, portions of other semiconductor wafers are inspected using the second tool to obtain other sets, portions of still other semiconductor wafers are inspected using the third tool to obtain still other sets, and so on.

[0049] The plurality of sets is provided (414) to the machine-learning model (e.g., a neural-network model, gradient-boosting model, support vector machine, or principal-component-regression model). For example, the sets are successively provided to the machine-learning model, with each set being provided to the machine-learning model in turn, or when the set becomes available. Each set may be provided to the machine-learning model directly or indirectly, as described for step 314 of the method 300 (Figure 3).

[0050] For each set of the plurality of sets, a respective estimated overlay offset between the first structure and the second structure (e.g., for the respective instance of the first and second structures) is obtained (418) from the machine-learning model: the machine-learning model is used to calculate the respective estimated overlay offset. The respective estimated overlay offset may be an estimated total overlay offset (e.g., total overlay offset 118, Figure 1) or an estimated unintended overlay offset (e.g., unintended overlay offset 116). The machine-learning model determines the respective estimated overlay offset using the one or more optical images of the set and provides the estimated overlay offset as output. The respective estimated overlay offset may be received from the machine-learning model in response to providing the set to the machine-learning model. The machine-learning model may determine the estimated overlay offset based only on the one or more optical images of the set, or based on the one or more optical images of the set and based further on additional data (e.g., metadata for the set and/or additional metrology data for the instance of the first and second structures corresponding to the set).

[0051] In some embodiments, providing (414) the plurality of sets to the machine-learning model includes specifying (416) to the machine-learning model the respective optical inspection tools from which respective sets were obtained (e.g., from which each set of the plurality of sets was obtained). The machine-learning model thus knows which optical inspection tool generated which set, and may use this information along with the sets themselves to calculate the estimated overlay offsets. For example, a first set of one or more optical images is obtained from a first optical inspection tool of the multiple optical inspection tools. The first set is provided to the machine-learning model, as is (e.g., along with) a specification that the first set is from the first optical inspection tool. The machine-learning model calculates the respective estimated overlay offset for the first set based on the first set and based further on the first set being from the first optical inspection tool.
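One way the optical inspection tool could be specified to a machine-learning model is as an extra input feature. The sketch below assumes a one-hot encoding of a tool identifier appended to the flattened image pixels; the encoding, the function name build_model_input, and the tool identifiers are illustrative assumptions, and other ways of conveying the tool are equally consistent with this description.

```python
from typing import List, Sequence


def build_model_input(
    image: Sequence[Sequence[float]], tool_id: str, known_tools: Sequence[str]
) -> List[float]:
    """Flatten the image and append a one-hot code identifying the tool.

    `known_tools` fixes the order of the one-hot positions, so the same
    list must be used for training and for inference.
    """
    if tool_id not in known_tools:
        raise ValueError(f"unknown optical inspection tool: {tool_id}")
    pixels = [pixel for row in image for pixel in row]
    one_hot = [1.0 if tool_id == tool else 0.0 for tool in known_tools]
    return pixels + one_hot


# Hypothetical tool identifiers for three optical inspection tools.
tools = ["tool_A", "tool_B", "tool_C"]
features = build_model_input([[0.1, 0.2], [0.3, 0.4]], "tool_B", tools)
```

Here features holds the four pixel values followed by 0.0, 1.0, 0.0, marking the set as coming from the second tool.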

[0052] Alternatively, information about the optical inspection tools the sets were obtained from is not provided to the machine-learning model, which calculates the estimated overlay offsets without knowing this information.

[0053] In some embodiments, each set is (408) a single optical image showing the first structure and the second structure. Each set is provided to the machine-learning model, which calculates the estimated overlay offset for each set based on the single optical image, as described for steps 316 and 324 of the method 300 (Figure 3).

[0054] In some other embodiments, each set includes (410) a first optical image focused on the first process layer and a second optical image focused on the second process layer. Each set is provided to the machine-learning model, which calculates the estimated overlay offset for each set based on the first and second optical images, as described for steps 318 and 326 of the method 300 (Figure 3).

[0055] In some embodiments, each set includes (412) a plurality of optical images obtained by optically inspecting the portions of the semiconductor wafers using a plurality of respective optical modes. Each set is provided to the machine-learning model, which calculates the estimated overlay offset for each set based on the plurality of optical images, as described for steps 320 and 328 of the method 300 (Figure 3).

[0056] The respective estimated overlay offsets obtained in the method 400 may be used directly for monitoring and controlling a semiconductor fabrication process (e.g., for statistical process control). Alternatively, a respective estimated overlay offset may be one of multiple inputs subsequently used to determine a respective overlay offset between a particular instance of the first and second structures. For example, the respective estimated overlay offset may be averaged or otherwise combined with one or more other respective overlay-offset values calculated for the instance of the first and second structures using different techniques (e.g., techniques not involving machine learning).

[0057] Figure 5 is a flowchart showing a method 500 of measuring an overlay offset using an ensemble (i.e., a plurality) of machine-learning models, in accordance with some embodiments. The method 500 may be performed by one or more computer systems (e.g., the computer system of the semiconductor inspection system 600, Figure 6). The method 500 may include the method 300 (Figure 3), which may be performed as part of performing the method 500.

[0058] In some embodiments, an ensemble of distinct machine-learning models is trained (502) to estimate an overlay offset between a first structure in a first process layer (e.g., the first structure of the method 300, Figure 3) and a second structure in a second process layer (e.g., the second structure of the method 300, Figure 3), using a training set of optical images showing the first structure and the second structure. The optical images of the training set are annotated with respective predetermined overlay offset values (e.g., as measured using scanning electron microscopy). Each machine-learning model in the ensemble may be trained as described for step 302 of the method 300 (Figure 3) and/or step 402 of the method 400 (Figure 4).

[0059] The ensemble of machine-learning models may be (504) a plurality of distinct machine-learning models of the same type. For example, the machine-learning models may all be neural-network models, gradient-boosting models, support vector machines, or principal-component-regression models, the type thus being selected from the group consisting of neural-network models, gradient-boosting models, support vector machines, and principal-component-regression models. The training of each machine-learning model in the ensemble may involve randomization (e.g., randomized initialization). This randomization may result in each machine-learning model in the ensemble being distinct. Alternatively, different machine-learning models in the ensemble may be of different types.

[0060] Training the machine-learning model may alternatively be omitted from the method 500. For example, the method 500 is performed using an ensemble of machine-learning models (e.g., of the same type, such as neural-network models, gradient-boosting models, support vector machines, or principal-component-regression models) that have already been trained to estimate overlay offsets between the first structure and the second structure (e.g., the first and second structures of the method 300, Figure 3) and have been deployed for use in semiconductor manufacturing.

[0061] One or more optical images of a portion of a semiconductor wafer are obtained (304) showing the first structure in the first process layer and the second structure in the second process layer, as described for the method 300 (Figure 3). In some embodiments, obtaining the one or more optical images includes optically inspecting (306) the portion of the semiconductor wafer (e.g., using an optical inspection tool 632, Figure 6).

[0062] The one or more optical images are provided (506) to the ensemble of distinct machine-learning models. The one or more optical images may be provided to each machine-learning model of the ensemble as described for step 314 of the method 300 (Figure 3). For example, the one or more optical images may be provided to each machine-learning model of the ensemble as described for step 316 or step 318 of the method 300 (Figure 3), and/or for step 320 of the method 300 (Figure 3).

[0063] A plurality of estimated overlay offsets between the first structure and the second structure (e.g., for the particular instance of the first and second structures) is obtained (508) from the ensemble of distinct machine-learning models. Each estimated overlay offset of the plurality of estimated overlay offsets may be an estimated total overlay offset (e.g., total overlay offset 118, Figure 1) or an estimated unintended overlay offset (e.g., unintended overlay offset 116). For example, all of the estimated overlay offsets of the plurality of estimated overlay offsets may be estimated total overlay offsets, or may be estimated unintended overlay offsets. Each estimated overlay offset of the plurality of estimated overlay offsets is from a respective machine-learning model of the ensemble of distinct machine-learning models. Each estimated overlay offset may be obtained as described for step 322 of the method 300 (Figure 3). For example, each estimated overlay offset may be obtained as described for step 324 or step 326 of the method 300 (Figure 3), and/or for step 328 of the method 300 (Figure 3).

[0064] An overlay offset is determined (510) between the first structure and the second structure (e.g., for the particular instance of the first and second structures) based at least in part on the plurality of estimated overlay offsets. For example, the overlay offset is determined by averaging (e.g., calculating the mean of) the plurality of estimated overlay offsets or by otherwise combining the plurality of estimated overlay offsets. In some embodiments, the overlay offset is determined using the plurality of estimated overlay offsets and additional information. For example, the overlay offset is determined using the plurality of estimated overlay offsets and one or more additional overlay-offset values for the first and second structures determined without using the ensemble of machine-learning models. The determined overlay offset may be an estimated total overlay offset (e.g., total overlay offset 118, Figure 1) or an estimated unintended overlay offset (e.g., unintended overlay offset 116).
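The determination of an overlay offset from the plurality of estimated overlay offsets might be sketched as follows. The mean of the ensemble estimates follows the averaging example above; treating the ensemble mean as one value and averaging it equally with any additional overlay-offset values is an assumption made for illustration, as is the function name combine_estimates.

```python
from statistics import fmean
from typing import Sequence


def combine_estimates(
    ensemble_estimates: Sequence[float],
    additional_values: Sequence[float] = (),
) -> float:
    """Determine one overlay offset from an ensemble's estimates.

    The ensemble estimates are averaged. If additional overlay-offset
    values from other techniques are supplied, the ensemble mean is
    averaged with them using equal weights (an illustrative assumption).
    """
    if not ensemble_estimates:
        raise ValueError("at least one ensemble estimate is required")
    ensemble_mean = fmean(ensemble_estimates)
    if not additional_values:
        return ensemble_mean
    return fmean([ensemble_mean, *additional_values])


# Hypothetical estimates, in nanometers, from a three-model ensemble.
offset_ml_only = combine_estimates([1.0, 1.2, 0.8])
offset_combined = combine_estimates([1.0, 1.2, 0.8], additional_values=[2.0])
```

Other combinations, such as a median or a weighted mean reflecting the confidence in each source, would fit the same interface.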

[0065] Overlay offsets determined using the methods 300, 400, and/or 500 (Figures 3-5) may be used to measure process drift and thus to establish and maintain process control. For example, one or more process parameters (e.g., photolithographic exposure and/or dose) may be modified based at least in part on determined overlay offsets. Establishing and maintaining process control in this manner is especially important during the ramp-up period for a new semiconductor process and/or device, but is also important for maintaining yields during ongoing production.

[0066] Figure 6 is a block diagram of a semiconductor inspection system 600 in accordance with some embodiments. The semiconductor inspection system 600 may be used for overlay metrology (e.g., for performing the methods 300, 400, and/or 500, Figures 3-5). The semiconductor inspection system 600 includes one or more (e.g., multiple) optical inspection tools 632, one or more SEMs 634, and a computer system with one or more processors 602 (e.g., CPUs and/or GPUs), user interfaces 606, memory 610, and communication bus(es) 604 interconnecting these components. In some embodiments, the optical inspection tools 632 and SEM(s) 634 are communicatively coupled to the computer system through one or more wired and/or wireless networks 630. The computer system may include one or more network interfaces for communicating with the optical inspection tools 632, SEM(s) 634, and other remote devices through the one or more networks 630.

[0067] The user interfaces 606 may include a display 607 and one or more input devices 608 (e.g., a keyboard, mouse, touch-sensitive surface of the display 607, etc.). The display 607 may display results, including overlay offsets.

[0068] Memory 610 includes volatile and/or non-volatile memory. Memory 610 (e.g., the non-volatile memory within memory 610) includes a non-transitory computer-readable storage medium. Memory 610 optionally includes one or more storage devices remotely located from the processors 602 and/or a non-transitory computer-readable storage medium that is removably inserted into the computer system of the semiconductor inspection system 600.

[0069] In some embodiments, memory 610 (e.g., the non-transitory computer-readable storage medium of memory 610) stores the following modules and data, or a subset or superset thereof: an operating system 612 that includes procedures for handling various basic system services and for performing hardware-dependent tasks, optical images 614 taken with the optical inspection tool(s) 632, SEM images 620 taken with the SEM(s) 634, a training module 622, one or more machine-learning models 624 (e.g., an ensemble of machine-learning models), an overlay-determination module 626, and a reporting module 628 to report results from the one or more machine-learning models 624 and/or the overlay-determination module 626.

[0070] The optical images 614 include images for analysis 616, which are to be provided to the one or more machine-learning models 624. The one or more machine-learning models 624 are executable to use the images for analysis 616 to determine estimated overlay offsets. The optical images 614 may also include a training set 618 to be provided to the training module 622 to train the one or more machine-learning models 624. The SEM images 620 include images of the same structures as the optical images in the training set 618. Overlay offsets for these structures are measured in the SEM images 620 and used to annotate the optical images in the training set 618, such that the overlay offsets measured in the SEM images 620 are provided to the training module 622 in association with respective optical images in the training set 618.
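The association between overlay offsets measured in the SEM images 620 and the optical images in the training set 618 can be pictured as a join on a shared structure identifier. The sketch below assumes that each structure instance carries a site identifier common to its optical image and its SEM measurement; the identifier format and the function name annotate_training_set are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Image = Sequence[Sequence[float]]


def annotate_training_set(
    optical_images: Dict[str, Image], sem_offsets: Dict[str, float]
) -> List[Tuple[str, Image, float]]:
    """Pair each optical image with the SEM-measured offset for the same site.

    Optical images without a matching SEM measurement are left out, since
    they cannot serve as annotated training examples.
    """
    annotated = []
    for site_id, image in optical_images.items():
        if site_id in sem_offsets:
            annotated.append((site_id, image, sem_offsets[site_id]))
    return annotated


# Hypothetical site identifiers of the form wafer/die/location.
optical = {
    "W1/D3/L7": [[0.2, 0.8]],
    "W1/D4/L7": [[0.5, 0.5]],
    "W2/D1/L2": [[0.9, 0.1]],
}
sem = {"W1/D3/L7": 1.4, "W2/D1/L2": -0.6}
training_examples = annotate_training_set(optical, sem)
```

In this example the image for site W1/D4/L7 is dropped because no SEM measurement exists for it, leaving two annotated training examples.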

[0071] The overlay-determination module 626 is executable to determine overlay offsets based at least in part on estimated overlay offsets from the machine-learning model(s) 624. Overlay offsets determined by the overlay-determination module 626 may be provided to the reporting module 628. In some embodiments, the overlay-determination module 626 is omitted and estimated overlay offsets from the machine-learning model(s) 624 are provided to the reporting module 628.

[0072] The memory 610 (e.g., the non-transitory computer-readable storage medium of the memory 610) may include instructions for performing all or a portion of the methods 300, 400, and/or 500 (Figures 3-5). Each of the modules stored in the memory 610 (including the machine-learning model(s) 624) corresponds to a set of instructions, executable by the one or more processors 602, for performing one or more functions described herein. Separate modules need not be implemented as separate software programs. The modules and various subsets of the modules may be combined or otherwise re-arranged. In some embodiments, the memory 610 stores a subset or superset of the modules and/or data structures identified above.

[0073] Figure 6 is intended more as a functional description of various features that may be present in a semiconductor inspection system than as a structural schematic. For example, the functionality of the computer system in the semiconductor inspection system 600 may be split between multiple devices. A portion of the modules stored in the memory 610 may instead be stored in one or more other computer systems communicatively coupled with the computer system of the semiconductor inspection system 600 through one or more networks. For example, the training module 622, training set 618, and SEM images 620 may be stored on a first computer system that trains the machine-learning model(s) 624. The machine-learning model(s) 624 may then be deployed to a second computer system that stores the machine-learning model(s) 624 along with the images for analysis 616, overlay-determination module 626, and/or reporting module 628.

[0074] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the scope of the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen in order to best explain the principles underlying the claims and their practical applications, to thereby enable others skilled in the art to best use the embodiments with various modifications as are suited to the particular uses contemplated.