Machine learning model monitoring (Aporia)
2022-06-21 16:54:00 【Liguodong】
Machine learning model monitoring
What is machine learning (ML) model monitoring?
Machine learning monitoring is a set of tools and techniques used to observe ML models in production and ensure the reliability of their performance. An ML model is trained by observing examples in a dataset and minimizing an error that represents the model's performance on the training task.
After being trained on a static set of samples during development, a production ML model makes inferences on changing data from a changing world. This gap between the static training data seen in development and the dynamic data seen in production causes production model performance to decline over time.

Example:
Suppose you trained a credit card fraud detection model on user data collected before COVID. During the pandemic, credit card usage and buying habits changed. Such changes can expose your model to data from distributions it was never trained on. This is an example of data drift, one of several sources of model degradation. Without ML monitoring, your model will output incorrect predictions with no warning signal, which in the long run will negatively affect your customers and your organization.
Machine learning model monitoring aims to continuously evaluate the quality of machine learning models in production using data science and statistical techniques.
Monitoring can serve several purposes:
- Early detection of instability
- Understanding how and why model performance declines
- Diagnosing specific failure cases
In addition, some ML monitoring platforms, such as Aporia, can be used not only to track and evaluate model performance, but also to investigate and debug models, explain model predictions, and improve model performance in production.
How to monitor machine learning
Engineers monitor software because the systems built today are vulnerable to the uncertainty of real deployment scenarios. Likewise, an ML model is a software system, but in essence an ML model is only as good as the data we feed it. As a result, traditional software monitoring techniques are not effective for ML models.
An effective machine learning monitoring system must detect changes in data. Failing to proactively watch for these changes can lead to silent model failures, which can significantly hurt business performance and your reputation with end users. See the five most common reasons your ML model may underperform in production.
Model monitoring helps maintain and improve the performance of ML models in production and ensures that models behave as expected. Deployed ML models interact with the real world, so the data a model sees in production is constantly changing. Once deployed to a production environment, model performance usually starts to degrade.
Monitoring performance degradation helps you quickly detect when your model is underperforming. Performance metrics are specific to the model and the learning task: for example, accuracy, precision, and F1 score are used for classification tasks, while root mean square error is used for forecasting tasks (a short sketch of computing such metrics appears below). In addition to watching performance metrics on real-world data, the data science team can also inspect the input data to learn more about the causes of performance degradation.
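A minimal sketch of computing these task-specific metrics, assuming scikit-learn is available and that ground-truth labels eventually arrive for production predictions; the arrays here are placeholders for your own labels and predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, f1_score, mean_squared_error

# Classification task: compare predictions against ground-truth labels
# collected from production (labels often arrive with a delay).
y_true_cls = np.array([0, 1, 1, 0, 1, 0])
y_pred_cls = np.array([0, 1, 0, 0, 1, 1])

print("accuracy :", accuracy_score(y_true_cls, y_pred_cls))
print("precision:", precision_score(y_true_cls, y_pred_cls))
print("f1       :", f1_score(y_true_cls, y_pred_cls))

# Forecasting / regression task: root mean square error.
y_true_reg = np.array([310_000, 455_000, 280_000])
y_pred_reg = np.array([298_000, 470_000, 305_000])
rmse = np.sqrt(mean_squared_error(y_true_reg, y_pred_reg))
print("rmse     :", rmse)
```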
In addition, measuring model drift is an important part of an ML monitoring system. Model inputs, outputs, and actual values drift over time, measured as changes in their distributions. Check your models for drift to determine whether they are stale, whether there are data quality problems, or whether they are receiving adversarial inputs. You can use ML monitoring and drift detection to better understand how to address these problems.
A comprehensive model monitoring solution should include:
- Data drift detection: tracking the distribution of each input feature helps reveal how the input data changes over time. You can extend this tracking to joint distributions.
- Data integrity testing: to detect changes in the structure of the input data, check whether the feature names match the feature names in your training set. Scanning the input for missing values reveals changes or problems in the data collection pipeline (a minimal sketch of such checks follows this list).
- Concept drift detection: understanding the importance of each input feature relative to the output is a simple and effective way to guard against concept drift. A change in feature relevance indicates concept drift. You can evaluate the model on specific features or run correlation studies over time. These techniques also help explain changes in model performance; for example, you may find a relationship between the importance of certain features and the time of year.
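As an illustration of the data integrity checks above, the sketch below compares the columns of a production batch against the training schema and scans for missing values. It is a minimal example assuming pandas DataFrames; the function name, threshold, and usage are illustrative, not part of any specific monitoring product.

```python
import pandas as pd

def check_integrity(train_df: pd.DataFrame, prod_df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable integrity warnings for a production batch."""
    warnings = []

    # 1. Schema check: production features should match the training features.
    missing_cols = set(train_df.columns) - set(prod_df.columns)
    extra_cols = set(prod_df.columns) - set(train_df.columns)
    if missing_cols:
        warnings.append(f"missing features: {sorted(missing_cols)}")
    if extra_cols:
        warnings.append(f"unexpected features: {sorted(extra_cols)}")

    # 2. Missing-value scan: a jump in nulls often points at a broken pipeline.
    null_rates = prod_df.isna().mean()
    for col, rate in null_rates.items():
        if rate > 0.05:  # illustrative alert threshold; tune per feature
            warnings.append(f"{col}: {rate:.1%} missing values")

    return warnings

# Hypothetical usage with a house-price dataset:
# for msg in check_integrity(train_df, todays_batch):
#     print("[data integrity]", msg)
```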

Checking the input data creates a short feedback loop that quickly detects when production models start to underperform.
Besides performance degradation, production models can also perform poorly because of data bias or anomalies.
- Data bias: training your model on biased data carries that bias into production. Consider training a model to classify images of cats and dogs, and suppose your training set has far more cat images than dog images. In that case, your model may achieve good accuracy simply by learning to classify most images as cats rather than learning the actual boundary between cats and dogs. To prevent biased model output in production, analyze the target variable and input features of the training data for imbalanced representation or bias.
- Anomalies: outliers are input samples that lie outside the distribution of the training samples. Inference on outliers is almost guaranteed to be inaccurate. To prevent poor model performance due to anomalies, first evaluate each input sample to make sure it belongs to the distribution of the training data (see the sketch after this list).
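A minimal sketch of such a check, using a simple per-feature z-score against training statistics; the threshold and data are illustrative, and real systems often use multivariate methods instead.

```python
import numpy as np

def is_outlier(sample: np.ndarray, train_mean: np.ndarray,
               train_std: np.ndarray, threshold: float = 4.0) -> bool:
    """Flag a production sample whose features lie far outside the training distribution."""
    z = np.abs((sample - train_mean) / (train_std + 1e-9))
    return bool(np.any(z > threshold))

# Training statistics computed once, offline.
X_train = np.random.default_rng(0).normal(loc=0.0, scale=1.0, size=(1000, 3))
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)

# A production sample with one wildly out-of-range feature.
sample = np.array([0.1, -0.4, 25.0])
print(is_outlier(sample, mu, sigma))  # True -> route to review instead of blind inference
```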
Drift detection in machine learning models
How to detect model drift in machine learning models
An obvious way to detect model drift is with ML monitoring techniques and solutions such as Aporia, which ensure that model performance does not degrade beyond a certain point.
Data drift and concept drift are the main sources of model drift, so you need the ability to detect both.
How to detect data drift in machine learning models
Data drift is caused by changes in the input data. Therefore, to detect data drift, you must observe the model's input data in production and compare it with the training data. A difference in the format or distribution of the production input data relative to the training data indicates data drift. As an example of a format change, suppose you train a model for house price forecasting; in production, make sure the input matrix has the same columns as the data you used during training. Detecting changes in the distribution of the input data relative to the training data requires statistical techniques.
The following tests can be used to detect changes in the distribution of the input data (a short sketch of the first two follows the list):
- Kolmogorov-Smirnov (K-S) test: you can use the K-S test to compare the distribution of your training set with the distribution of your production inputs. If the distributions differ, the null hypothesis is rejected, indicating data drift. Learn more about this and other methods in our concept drift detection methods guide.
- Population Stability Index (PSI): the PSI of a random variable is a measure of how much that variable's distribution changes over time. In the house price forecasting example, you could measure the PSI of features such as square footage or average neighborhood income to see how their distributions change over time; large changes may indicate data drift.
- Z-score: z-scores can compare a feature's distribution between the training data and the production data. If the absolute value of the computed z-score is high, you may be experiencing data drift.
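A minimal sketch of the first two tests, assuming scipy is available and that you have one-dimensional samples of the same feature from training and production; the bin count, significance level, and synthetic data are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)  # e.g. standardized square footage at training time
prod_feature = rng.normal(loc=0.3, scale=1.2, size=1000)   # a production batch whose distribution has shifted

# Kolmogorov-Smirnov test: a small p-value rejects the null hypothesis that
# both samples come from the same distribution, i.e. possible data drift.
ks = stats.ks_2samp(train_feature, prod_feature)
print(f"KS statistic={ks.statistic:.3f}, p-value={ks.pvalue:.4f}, drift={ks.pvalue < 0.05}")

# Population Stability Index: compare binned frequencies of the two samples.
def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    expected_pct = np.clip(expected_pct, 1e-6, None)  # avoid log(0) / division by zero
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

print(f"PSI={psi(train_feature, prod_feature):.3f}")  # common rule of thumb: > 0.2 suggests a notable shift
```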
How to detect concept drift in machine learning models
You can detect concept drift by detecting changes in the predicted probabilities for given inputs.
A change in the model's output for a given production input may indicate a change at a level of analysis your model does not capture.
For example, if your house price model does not account for inflation, it will start to underestimate house prices. You can also detect concept drift through ML monitoring techniques (for example, performance monitoring): observing changes in model accuracy or classification confidence may indicate concept drift. A small sketch of monitoring prediction confidence follows.
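A minimal sketch of watching the model's output distribution over time, assuming a binary classifier that exposes predicted probabilities; the drift threshold on mean confidence and the synthetic score distributions are illustrative choices, not standard values.

```python
import numpy as np

def confidence_drift(baseline_probs: np.ndarray, recent_probs: np.ndarray,
                     max_shift: float = 0.10) -> bool:
    """Compare the mean predicted probability of a recent window against a baseline window."""
    shift = abs(recent_probs.mean() - baseline_probs.mean())
    return shift > max_shift

rng = np.random.default_rng(7)
baseline = rng.beta(2, 5, size=2000)  # predicted probabilities shortly after deployment
recent = rng.beta(4, 3, size=500)     # this week's predictions: systematically higher scores

if confidence_drift(baseline, recent):
    print("Output distribution shifted - investigate possible concept drift.")
```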

How to prevent concept drift in machine learning models
You can prevent concept drift with ML model monitoring. Monitoring reveals declines in model performance that may indicate concept drift, prompting ML developers to update the model.
Besides this observation-based approach, you can also use a time-based approach, in which the ML model is periodically retrained within a known degradation window. For example, if model performance becomes unacceptable every four months, retrain the model every three months.
Finally, you can prevent concept drift through online learning. With online learning, your model is trained every time new data becomes available, instead of waiting for a large dataset to accumulate and then retraining the model (a minimal sketch follows).
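A minimal sketch of online learning, assuming a recent scikit-learn version whose SGDClassifier supports incremental updates via partial_fit; the streaming batches here are synthetic and purely illustrative.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])  # must be declared on the first partial_fit call

# Simulate small batches of labelled data arriving over time.
for step in range(10):
    X_batch = rng.normal(size=(32, 4))
    y_batch = (X_batch[:, 0] + 0.1 * step > 0).astype(int)  # the concept shifts slowly
    model.partial_fit(X_batch, y_batch, classes=classes)

print(model.predict(rng.normal(size=(3, 4))))
```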
Machine learning performance monitoring
How to monitor machine learning performance
Performance monitoring helps us detect when production ML models underperform and understand why. Monitoring ML performance usually includes monitoring model activity, metric changes, model staleness (or freshness), and performance degradation. The insights gained from ML performance monitoring suggest changes to improve performance, for example hyperparameter tuning, transfer learning, model retraining, or developing new models.
Which performance metric to monitor depends on the model's task. An image classification model will use accuracy as its performance metric, while mean squared error (MSE) is more suitable for a regression model. It is important to understand that a single poor result does not necessarily mean model performance is declining. For example, when using MSE, sensitivity to outliers can degrade the measured performance on a given batch; the observed degradation does not mean the model is getting worse, it may just be an artifact of an outlier in the input data combined with using MSE as the metric. Evaluating the input data is a good ML performance monitoring practice that reveals such cases of apparent degradation.
When monitoring the performance of an ML model, we need to clearly define what bad performance means. This usually means specifying an expected accuracy score or error value and observing any deviation from that expectation over time. In practice, data scientists understand that a model may not perform as well on real-world data as on the test data used during development, and real-world data is likely to change over time. For these reasons, once a model is deployed, we can expect and tolerate a certain degree of performance degradation. To that end, we use upper and lower bounds on the model's expected performance. The data science team should work with subject matter experts to carefully choose the parameters that define expected performance.
The consequences of performance degradation vary widely by use case, so the acceptable level of degradation depends on the model's specific application. For example, we may tolerate a 3% drop in accuracy for an animal sound classification app, but a 3% drop in accuracy is unacceptable for a brain tumor detection system. A minimal sketch of checking a metric against such bounds follows.
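A minimal sketch of alerting on performance bounds, assuming the metric is recomputed periodically once ground-truth labels arrive; the bounds and the print-based alerting (standing in for a pager or dashboard) are illustrative.

```python
def check_performance(metric_name: str, value: float,
                      lower_bound: float, upper_bound: float) -> None:
    """Alert when a monitored metric leaves its agreed-upon expected range."""
    if value < lower_bound:
        print(f"ALERT: {metric_name}={value:.3f} fell below {lower_bound:.3f} - investigate drift or data issues.")
    elif value > upper_bound:
        print(f"WARNING: {metric_name}={value:.3f} exceeds {upper_bound:.3f} - check for label leakage or a broken pipeline.")
    else:
        print(f"OK: {metric_name}={value:.3f} within [{lower_bound:.3f}, {upper_bound:.3f}].")

# Bounds agreed with subject matter experts for this use case.
check_performance("accuracy", 0.87, lower_bound=0.90, upper_bound=0.99)
```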
ML performance monitoring is a valuable tool: it detects when a production model underperforms and points at what we can do to improve it. The following practices are very helpful for fixing problems in underperforming models:
Keep data preprocessing and the ML model in separate modules. When a change to the preprocessing pipeline is enough, keeping data preprocessing and the ML model as separate modules helps you repair a degraded model more efficiently. Suppose you build a model that classifies handwriting on mail for the U.S. post office. In production, the post office decides to switch to lower-intensity light bulbs to save energy, and your model now runs on darker images. In this case, changing the data preprocessing module to increase pixel intensity and enhance edges may be enough to restore model performance, at far lower cost and turnaround time than retraining the model.
Use a baseline. A baseline model is a simpler, more interpretable model that still achieves good results. You use the baseline model as a sanity check for a large, fancy production model. For example, a baseline for an LSTM on time series data could be a logistic regression model. Observing that the production model's performance declines while the baseline model still performs well may indicate that your production model is overfitting the training data; in that case, adjusting the regularization hyperparameters should improve performance. Without a baseline model, you might instead conclude that the model performs poorly because of data or concept drift and retrain or build a new model (a small comparison sketch follows).
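A minimal sketch of the baseline sanity check, assuming scikit-learn and a labelled batch of recent data; the synthetic dataset and the two models are placeholders for your own production model and baseline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_recent, y_train, y_recent = train_test_split(X, y, test_size=0.25, random_state=0)

production_model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
baseline_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

prod_acc = accuracy_score(y_recent, production_model.predict(X_recent))
base_acc = accuracy_score(y_recent, baseline_model.predict(X_recent))

# If the complex model drops well below the simple baseline on recent data,
# suspect overfitting or a pipeline issue rather than drift alone.
print(f"production={prod_acc:.3f}  baseline={base_acc:.3f}")
```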
Choose a model architecture that is easy to retrain. Neural networks are powerful ML algorithms because they can approximate arbitrarily complex functions. They are also particularly well suited to production because you can retrain only part of the network. For example, an image classification model that encounters images from a new class does not require complete end-to-end retraining; instead, we can use transfer learning (retraining only the classification head of the network) with the additional classes and redeploy.
To gain further insight from monitoring model performance, it is useful to visualize the production input data alongside the training data and to detect anomalies, as described in the "How to monitor machine learning" section.
How to improve model performance
Even when concept drift and data drift are under control, the performance of an ML model may still degrade over time. Unless model performance is improved regularly, data scientists need to keep retraining ML models on new and updated data to counter this decline.
Here are some techniques that can be used to improve model performance:
- Use more advanced tools: better tools may provide more functionality for improving ML model performance, but you must weigh the time required to integrate these new tools into the existing ML system.
- Use more data: increasing the amount of data used to train the model helps it generalize better and therefore stay relevant longer. If the ML system needs very large amounts of data for training, this solution may become impractical.
- Use ML model ensembles: it is well known that ensembles of ML models can improve performance, because the ensemble predicts the most likely label from the predictions of several different models. Ensembles can help an ML system withstand concept drift, because if one model in the ensemble is drifting, its contribution to the ensemble's prediction is masked by the other models. This approach comes with its own maintenance cost: ensembles need careful monitoring so that they do not cause more harm than the performance they gain (a small sketch follows this list).
- Use ML models with higher predictive power: those who want to build ML systems that are robust to concept drift and data drift can consider more powerful model families, for example random forests or generalized linear models (GLMs). ML model ensembles can also be built from high-performing models. Feature selection can be considered another way to improve model performance, although concept drift can cause this method to fail, leading ML developers to use more complex model algorithms.
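A minimal sketch of an ensemble, assuming scikit-learn's VotingClassifier; the base models, soft-voting choice, and synthetic data are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1500, n_features=12, random_state=1)

# Soft voting averages predicted probabilities, so one drifting member
# is partially masked by the others.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=1)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
)

print("ensemble CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```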
Machine learning model management
What is machine learning model management?
Model management is a subset of MLOps that focuses on experiment tracking, model versioning, deployment, and monitoring. When developing an ML model, data scientists usually run many experiments to find the optimal model. These experiments include changes to data preprocessing, hyperparameter tuning, and the model architecture itself. The goal of experimentation is to find the best model for a particular use case. Data scientists often cannot tell whether the current configuration is optimal until they have also tried the sub-optimal alternatives. Therefore, tracking experiments is important for developing ML models.
In a typical scenario, model development is a collaborative effort. Data scientists often use an existing notebook from a colleague as the starting point for an experiment. This kind of collaboration makes it harder to reproduce the expected results.
Model management addresses these challenges by:
- Tracking metrics, losses, code, and data versions to make experiments reproducible
- Enabling reusability by delivering models in repeatable configurations
- Ensuring compliance as business and regulatory requirements change
Version control systems are used for ML model management, but they provide only some of the necessary functionality: a version control system only tracks changes to the system's source code over time. A practical ML model management framework must also make use of the following:
- ML model monitoring: a system that makes ML models observable and can detect data drift, unexpected bias, and data integrity problems that affect model predictions and performance.
- Explainability: the ability to understand the relationship between features in the input data and model predictions.
- Data versioning: data versioning tracks changes to the datasets used for testing, training, and deployment. Reasons for different data versions include changes in data preprocessing and in data sources. To learn more about data versioning, read our post on the best data versioning tools for MLOps.
- Experiment tracking: an experiment tracker records the results of every training or validation run together with the configuration that produced them. The recorded configuration includes hyperparameters such as learning rate, batch size, and regularization.
- Model registry: a registry of all models in deployment.
Beyond developing ML models in a research setting, building machine learning systems for production is a craft of its own. Production ML therefore needs its own set of tools and practices to deliver solutions successfully at scale. Integrating ML management from the start ensures that you use the right tools for the job.
See the hands-on tutorial on building an ML platform from scratch.

Why manage and monitor your ML models after deployment
Deployed models are exposed to changing real-world data, so ML management after deployment is essential to ensure that models keep behaving as expected. A subset of ML management is ML monitoring, a set of tools for observing the quality and performance of ML models. Having an ML management framework for deployed models helps teams track performance metrics, monitor data changes, and gain valuable insight into the causes of poor model performance, which in turn informs performance improvements. For example, visualizing production input data alongside the model's training data can reveal data drift and prompt your team to retrain the deployed model on updated data.
ML management also helps you keep track of all the models in your deployment. It includes maintaining a model registry of all deployed models and using a model versioning system. Combined with performance monitoring, the model registry and version control provide a convenient global health dashboard for ML models in production. With the help of the registry and versioning system, teams can better identify which features are causing a given model version to perform poorly in some settings, which makes improving deployed models more effective.
Finally, managing ML models after deployment helps track degrading models in production and schedule diagnostic tests to better understand poor performance.

Explainability (XAI)
What is machine learning explainability?
Making ML models explainable is about building the ability to understand the relationship between the features of the input data and the model's predictions. Machine learning models often use architectures with thousands of learnable parameters to estimate a complex function, which makes it hard to describe what happens inside the model to produce its output. This problem has earned ML models the title of "black boxes".
Making ML models explainable is complex in the real world, because:
- The interpretation of an algorithm's results depends on how much data you have available
- There are many ways in which machine learning algorithms can go wrong
We measure a model's explainability by looking at several different aspects:
- Whether the decision-making process can be explained
- How accurate the model's predictions are (i.e., accuracy)
- How reliable the classifier's decisions are
Trying to understand what went wrong with a machine learning algorithm requires a lot of investigation and can be challenging. In particular, if there is bias in the data used to train the model, we cannot tell whether that bias is the result of mistakes in training or simply due to inherent flaws in the data.
Making ML models explainable is essential for preventing model drift in production, because it removes much of the guesswork involved in troubleshooting poorly performing models (a small feature-importance sketch follows).
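One common, model-agnostic way to connect input features to predictions is permutation importance. The sketch below assumes scikit-learn and a synthetic dataset; it illustrates the general idea, not Aporia's explainability tooling.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=3, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much validation accuracy drops:
# a large drop means the model relies heavily on that feature.
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
for idx in result.importances_mean.argsort()[::-1]:
    print(f"feature_{idx}: {result.importances_mean[idx]:.3f}")
```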
For a practical guide to implementing explainable AI, see Aporia's documentation on explainability.
Machine learning experiment tracking
What is machine learning experiment tracking?
ML experiment tracking is the process of saving all experiment results and configurations so that those experiments can be reproduced.
ML researchers run many experiments to find the best model, and it is hard to keep track of all of them and their results. To find the best model, ML researchers run many trials across different datasets, hyperparameters, model architectures, and package versions.
Experiment tracking is important because it helps you and your team:
- Organize all ML experiments in one place. You might run experiments on your local machine while teammates run theirs in the cloud or on Google Colab. An ML experiment tracking system records experiment metadata and results from any system or machine.
- Compare and analyze experiment results. The experiment tracking system ensures that all experiments are recorded in the same format, so different experiment configurations and results can be compared at no extra cost.
- Collaborate better with the team. The experiment tracking system records who ran each experiment. All team members can see what others have tried; they can also pull an experiment someone else ran, copy it, and continue building from there.
- Watch your experiments in real time. The experiment tracking system makes it easy to start an experiment and watch it run remotely from a dashboard. While an experiment is running you can see metrics such as loss, epoch time, and CPU/GPU utilization. This is especially useful for experiments running in environments that are hard to inspect, for example on a remote machine in the cloud.
To track ML experiments effectively, you need to track the following (a minimal logging sketch follows this list):
- Code: the scripts and notebooks used to run the experiment
- Environment: environment configuration files
- Data: use data versioning to record which data version was used in the experiment
- Parameters: the parameter configuration, including the model's own hyperparameters, such as the learning rate, as well as any other editable experiment options, for example the number of threads used by the data loader
- Metrics: training, validation, and test losses are examples of general metrics to track. You can also track metrics specific to the model you are training; for example, you may want to track gradient norms when training deep neural networks.
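A minimal, library-free sketch of what an experiment tracker records, writing each run's configuration and metrics to a JSON lines file; dedicated tools add dashboards, collaboration, and artifact storage on top of this idea. All file names, fields, and values here are illustrative.

```python
import json
import time
from pathlib import Path

LOG_FILE = Path("experiments.jsonl")

def log_run(params: dict, metrics: dict, code_version: str, data_version: str) -> None:
    """Append one experiment record so the run can be compared and reproduced later."""
    record = {
        "timestamp": time.time(),
        "code_version": code_version,   # e.g. a git commit hash
        "data_version": data_version,   # e.g. a dataset snapshot tag
        "params": params,
        "metrics": metrics,
    }
    with LOG_FILE.open("a") as f:
        f.write(json.dumps(record) + "\n")

log_run(
    params={"learning_rate": 1e-3, "batch_size": 64, "num_workers": 4},
    metrics={"train_loss": 0.41, "val_loss": 0.47, "val_accuracy": 0.88},
    code_version="abc1234",
    data_version="housing-v3",
)
```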
ML model monitoring vs. experiment tracking
The fundamental difference between ML model monitoring and ML experiment tracking is that model monitoring is mainly performed after the model has been deployed to production, whereas tracking is most relevant before deployment.
Once a model is in production, we apply ML monitoring to maintain and improve its performance: we monitor the model to observe its performance metrics on real-world data and notice performance degradation when it happens.
ML experiment tracking, on the other hand, belongs to the research and development of an ML system before it goes into production. Tracking helps ML researchers follow the code, environment configuration, data version, parameters, and metrics of every experiment run during the model development cycle in order to find the best configuration. ML tracking is also useful in research-only environments with no deployment, for example work aimed at a research paper.
Machine learning model registry
What is a model registry?
A model registry is a repository for all models in production. It provides a central point of access to all trained and available ML models. The purpose of this approach is to provide a unified way to access, search, and manage every deployed model and so improve model reusability. ML-related ecosystems such as OpenML, ModelZoo, and MODL-Wiki are examples of community efforts to develop such model registries.
An important aspect of the model registry is that all models are stored in a central location, which means everyone is looking at the same models. Everyone collaborating on a project has a single reference for each model, so the registry sidesteps the problem of slightly different versions living on local machines.
The model registry makes collaboration on ML projects easier by (a minimal sketch follows this list):
- Connecting the experimentation and production lifecycles: the model registry provides a standardized way to capture models from the development lifecycle and stage them for production deployment. Through continuous integration, delivery, and training (CI/CD/CT) of ML models, the registry promotes interaction between researchers and MLOps engineers.
- Providing a central dashboard for the team to work with models. A centralized location makes it easy for the team to search for models and check their status, for example staged, deployed, or retired. From the central dashboard, the team can also reference training and experiment results through the experiment tracker, and view real-time model performance in production through ML monitoring.
- Exposing an interface for other systems to consume models. The model registry can provide APIs for integration with other applications or systems so that ML models can be served to third-party client applications. A client application can pull the latest version of a model and automatically pick up changes made to the model in response to model degradation.
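A minimal, library-free sketch of the registry idea, keeping a versioned record of models and their lifecycle stages in a single JSON file; real registries add artifact storage, access control, and serving APIs. All names, fields, and URIs are illustrative.

```python
import json
from pathlib import Path
from typing import Optional

REGISTRY = Path("model_registry.json")

def register_model(name: str, version: str, artifact_uri: str, stage: str = "staging") -> None:
    """Record a model version and its lifecycle stage in a shared registry file."""
    registry = json.loads(REGISTRY.read_text()) if REGISTRY.exists() else {}
    registry.setdefault(name, []).append(
        {"version": version, "artifact_uri": artifact_uri, "stage": stage}
    )
    REGISTRY.write_text(json.dumps(registry, indent=2))

def latest(name: str, stage: str = "production") -> Optional[dict]:
    """Return the most recently registered version of a model in the given stage."""
    registry = json.loads(REGISTRY.read_text()) if REGISTRY.exists() else {}
    versions = [v for v in registry.get(name, []) if v["stage"] == stage]
    return versions[-1] if versions else None

register_model("house-price", "1.3.0", "s3://models/house-price/1.3.0", stage="production")
print(latest("house-price"))
```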
Link to the original article: Machine Learning Model Monitoring 101