Control Charts for Machine Learning Using Python

Introduction

Control charts are a visual mechanism used to monitor a process by tracking independent observations of a quality characteristic across time. This methodology was introduced by the statistician Walter Shewhart in the early 20th century and has found many applications in industry settings (most notably in the manufacturing sector). The main idea of control charts is to determine whether a process is under statistical control by setting lower and upper bounds (i.e., control limits) based on the probability distribution of your quality characteristic. You can then monitor your process across time using those bounds.

Although somewhat antique, I believe control charts are a valuable methodology for monitoring deployed machine learning models. In this article, I will present a case on how we can use the most basic control chart to monitor a deployed machine learning model.

Some think of control charts as an antique technique only suitable for manufacturing applications. However, I believe that recent computational advancements have made control charts more practical than ever. In the earlier days, most data gathering had to be manual (i.e., physical data gathering). Today, we have sensors that can automatically gather data about any process. We also have virtual processes that don't necessarily have a physical component and whose data lives entirely in the cloud. We can think of a deployed machine learning model as a virtual process. Today, machine learning models are being deployed in all kinds of industries. Therefore, data scientists and companies must monitor their deployed machine learning models. One way to do that is by using control charts.

The Case Study: Using Control Charts to Monitor ML Model Performance

Let's say you work as a data scientist in a hospital that is trying to implement a system to enable patients and insurance companies to evaluate the cost of discharging patients. Your task as a data scientist is therefore to develop a machine learning model that predicts the cost of discharging a patient before the patient is even admitted to the hospital. As a data scientist, you followed the machine learning pipeline for model deployment. You gathered and cleaned the data, modeled the problem, and tested those models on unseen data. After validation, the model is ready to be deployed. You know that deploying the model is not the last step of the machine learning pipeline. You need to establish a way to monitor its performance. However, monitoring your deployed model is always a complicated task.

Neptune AI, a start-up focused on experiment tracking and model registry, wrote a helpful article about the topic. One of the strategies for monitoring your model is using statistical techniques to identify potential distribution shifts in the features used to train the model. Although this strategy is helpful, it may not be enough. What if model performance is affected by features not included in the model? In that case, monitoring only the feature distribution could lead to erroneous conclusions about model performance. To that end, you may be interested in monitoring prediction errors across time.

Monitoring Model Performance

About Control Charts: What Is Our Goal?

Let's say that during the training process, you estimated that the mean squared error was 228 squared dollars. After deployment, you want to ensure that the mean squared error remains relatively constant across different points in time (or at least does not increase). Therefore, you decide to use an XBar control chart to monitor model performance. This will enable you to establish an acceptance region where the mean squared error will likely lie, given that the mean is hypothesized to be 228 squared dollars. This region can be defined as follows:

Equation 1: Control Bounds for the XBar Control Chart

LCL = xbar - k * sbar / (c4 * sqrt(n))
UCL = xbar + k * sbar / (c4 * sqrt(n))

Here, xbar is the average of the mean squared errors for each time point, sbar is the average of the standard deviations for each time point, c4 is a bias-correction constant, n is the sample size for each time point (i.e., how many observations are available for each time point), and k determines the width of your interval. Using these bounds, we can monitor the mean squared error across different points in time. If the mean squared error at a given time point falls outside those bounds, we say that the point is out of control. It may indicate a shift in the distribution of the mean squared error.

Implementing an Xbar Control Scheme

As a data scientist working for the hospital, you start developing this control chart.
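As a rough illustration of how such control bounds could be computed, here is a minimal Python sketch. It assumes the standard XBar-S limits, xbar ± k * sbar / (c4 * sqrt(n)), with the conventional k = 3 and a c4 constant derived under a normality assumption. The function names (`c4`, `xbar_limits`, `out_of_control`) and all the numbers are hypothetical and made up for illustration; they are not taken from the hospital case.

```python
import math
from statistics import mean, stdev


def c4(n: int) -> float:
    """Bias-correction constant for the sample standard deviation (normal data)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def xbar_limits(subgroups, k=3.0):
    """Return (lcl, center, ucl) from equally sized subgroups of squared errors."""
    n = len(subgroups[0])
    if n < 2 or any(len(g) != n for g in subgroups):
        raise ValueError("subgroups must share one size n >= 2")
    xbar = mean(mean(g) for g in subgroups)   # average of the per-period MSEs
    sbar = mean(stdev(g) for g in subgroups)  # average of the per-period std devs
    half_width = k * sbar / (c4(n) * math.sqrt(n))
    return xbar - half_width, xbar, xbar + half_width


def out_of_control(subgroups, limits):
    """Indices of time points whose mean squared error falls outside the limits."""
    lcl, _, ucl = limits
    return [i for i, g in enumerate(subgroups) if not lcl <= mean(g) <= ucl]


# Hypothetical squared prediction errors (squared dollars), one list per time point.
baseline = [[210, 250, 230, 222], [240, 215, 228, 229], [225, 235, 220, 232]]
new_periods = [[230, 226, 224, 232], [400, 380, 420, 390]]

limits = xbar_limits(baseline, k=3.0)
flagged = out_of_control(new_periods, limits)
print("limits:", limits)
print("out-of-control time points:", flagged)
```

Estimating the limits from a baseline period and then checking later periods against them keeps the bounds from being influenced by the very shifts you are trying to detect. In practice you might only care about the upper limit, since a lower mean squared error is not a problem.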