Manufacturing Machine Learning Research: Semiconductor Virtual Metrology

Background:

Many recent innovations in manufacturing come from machine learning, and they are well served by both mechanical and software engineering experience. With a background spanning mechanical engineering, software, and data science, I made advanced machine learning in manufacturing environments the main focus of my undergraduate research. The potential impact of work in this field also drew me to it. Manufacturing can be an incredibly wasteful sector, with drastic financial and environmental consequences. A report by Bain & Co estimates that 20 cents of every dollar is wasted in manufacturing (through wasted material, process time, etc.). Given that manufacturing makes up more than half of the world's GDP, this waste amounts to about a tenth of it. The EPA claims that in the United States alone, 7.6 billion tons of industrial waste are created each year. As someone who is passionate about building a sustainable future, I have a strong interest in reducing waste in manufacturing.

I performed this research at the University of Texas at Austin, under the mentorship of Dr. Djurdjanovic. There, my focus was on building novel machine learning models from scratch for the semiconductor industry. The level of data collection at the largest semiconductor companies is practically unrivaled in other industries. My main project was to create a Growing Structure Multiple Model System for Virtual Metrology, an approach that a graduate student before me had tested out but that needed further validation and debugging.

Growing Structure Multiple Model System for
Quality Estimation in Manufacturing Processes
Alexander Bleakie and Dragan Djurdjanovic

Virtual Metrology is a relatively new field that has so far been most prevalent in the semiconductor manufacturing industry. It is the practice of using machine learning algorithms to estimate a product’s quality characteristics directly from production process sensor data. These sensors are located on production machinery and are often used for closed-loop control of process variables (temperature, pressure, flow rate, etc.).

Motivation

Every production process needs a way to evaluate the quality of the products being made, ideally throughout the process rather than only at the end. Doing so catches failures earlier and allows a faster response to changes in the health of the production line. However, in many industries certain quality characteristics (the factors you want to evaluate) can be very expensive and time-consuming to measure. As a result, many industries implement batch sampling, where the quality of only one out of a certain number of products is measured after a process (for example, one of every fifty semiconductor wafers). This gives a broad overview of the health of the production system, but many failures slip through the cracks and can propagate down the line. Batch sampling also responds more slowly to problems on the line (for example, machinery being shifted) than measuring the quality of every part would. Consequently, a significant portion of waste (both material and financial) comes from the inability to get a comprehensive view of product quality.
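
To make the response-time argument concrete, here is a minimal sketch of how many defective wafers are produced before a sampled measurement can reveal a fault. It assumes a persistent fault that is caught at the first measured wafer, and the function name and the wafer numbering are hypothetical choices made for this illustration, not part of any real sampling plan.

```python
def wafers_until_detection(fault_start, sample_every=50):
    """Count defective wafers made before the fault can be seen.

    Wafers are numbered from 1 and every `sample_every`-th wafer is measured.
    Assumption: the fault is persistent and is detected at the first
    measured wafer at or after `fault_start`.
    """
    # Ceiling division gives the index of the next measured wafer.
    next_sample = -(-fault_start // sample_every) * sample_every
    # Every wafer from the fault start through the measured one is defective.
    return next_sample - fault_start + 1


# Try every position in a 50-wafer cycle at which the fault could begin.
counts = [wafers_until_detection(start) for start in range(1, 51)]
print("worst case:", max(counts))
print("average:", sum(counts) / len(counts))

# Measuring every wafer (what a virtual metrology model approximates)
# exposes the fault on the first defective wafer.
print("every wafer measured:", wafers_until_detection(37, sample_every=1))
```

Under these assumptions the number of defective wafers depends only on where in the sampling cycle the fault begins, which is why per-wafer quality estimates shorten the response time.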

Virtual metrology is significant because, if the model performs well, it can use calibration sensor data to provide live insight into the quality of every part throughout the line. This means that as soon as a machine starts to break down, the model can alert engineers on the scene. It also catches failures that would slip through batch sampling (depending on the industry, most factories have a baseline failure rate of anywhere from 2% to 20%). It is an improvement over statistical process control because it is informed by the relationship between sensor input and quality output, rather than just the distribution of inputs. This allows it to estimate the actual quality of the product, not just identify when abnormal behavior is happening. It can also better inform rework and maintenance procedures. Finally, it can identify regions within the acceptable sensor input ranges that may still lead to poor quality.
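
The contrast with statistical process control can be sketched with a toy example. Everything below is hypothetical: the sensor names, control limits, model weights, and specification window are invented for illustration, and the per-sensor limit check is a deliberately simplified stand-in for a real SPC scheme.

```python
# Hypothetical per-sensor control limits (a simplified SPC-style check).
CONTROL_LIMITS = {"temperature": (340.0, 360.0), "pressure": (9.0, 11.0)}

# Hypothetical linear VM model: predicted etch depth in nm, expressed as
# deviations of each sensor from its nominal operating point.
NOMINAL = {"temperature": 350.0, "pressure": 10.0}
VM_WEIGHTS = {"temperature": 0.5, "pressure": 4.0}
VM_INTERCEPT = 100.0
SPEC = (95.0, 105.0)  # hypothetical acceptable etch depth range


def spc_in_control(reading):
    """True if every sensor value lies inside its own control limits."""
    return all(lo <= reading[name] <= hi
               for name, (lo, hi) in CONTROL_LIMITS.items())


def vm_predict(reading):
    """Estimate the quality characteristic from the sensor reading."""
    return VM_INTERCEPT + sum(weight * (reading[name] - NOMINAL[name])
                              for name, weight in VM_WEIGHTS.items())


def vm_in_spec(reading):
    """True if the estimated quality falls inside the specification."""
    lo, hi = SPEC
    return lo <= vm_predict(reading) <= hi


# Both sensors sit near, but inside, their upper control limits.
wafer = {"temperature": 358.0, "pressure": 10.8}
print("SPC in control:", spc_in_control(wafer))
print("VM predicted depth:", round(vm_predict(wafer), 2))
print("VM in spec:", vm_in_spec(wafer))
```

In this sketch each input looks acceptable on its own, yet the combined effect pushes the estimated quality out of specification, which is the kind of case an input-only check cannot flag.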

Estimation and Control in Semiconductor Etch: Practice and Possibilities. (n.d.). https://www.researchgate.net/publication/224092403_Estimation_and_Control_in_Semiconductor_Etch_Practice_and_Possibilities

The standard VM models used today are Multivariate Linear Regression and Partial Least Squares Regression. Though these provide interpretable information and are computationally efficient, they can only model linear relationships and can be thrown off by outliers or by shifts in the input domain. Such shifts occur regularly and are known as manufacturing drift. Neural network models have also been tried. They are able to model complex relationships. However, they are often too computationally demanding for live integration, and they don’t offer tractable insight into the relationship between the production line state and quality, which is important for process improvement. They also require more data, which can be a challenge with batch sampling, and are prone to overfitting in noisy production environments.
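
The sensitivity of linear models to drift can be illustrated with a minimal sketch. It fits a one-sensor least-squares line (a stand-in for the multivariate models named above) to a process whose true response is assumed to be quadratic, then shifts the sensor values outside the training range. The response function and all numbers are invented for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def true_quality(x):
    """Hypothetical nonlinear process response, assumed for this sketch."""
    return x ** 2


def mean_abs_error(slope, intercept, xs):
    """Average absolute gap between the linear estimate and the true response."""
    return sum(abs(slope * x + intercept - true_quality(x)) for x in xs) / len(xs)


# Train on a narrow operating window where the curve is close to a line.
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
slope, intercept = fit_line(train_x, [true_quality(x) for x in train_x])

# Simulate drift: the same sensor now reads two units higher.
drifted_x = [x + 2.0 for x in train_x]

print("error in training window:", mean_abs_error(slope, intercept, train_x))
print("error after drift:", mean_abs_error(slope, intercept, drifted_x))
```

The fitted line is adequate inside the window it was trained on, but its error grows sharply once the inputs drift, because a line cannot follow the curvature of the assumed response.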