How Quickly Can You Update Drifting Models? Key Lessons For Model Iteration In Production
Your machine learning models reflect a snapshot of the data they were trained on, which in turn reflects the conditions under which that data was generated. If those conditions change, the model starts to drift and stops providing value. In fact, it can be worse than no model at all, because its predictions rest on obsolete assumptions. For example, if a retailer’s demand forecasting model doesn’t pick up recent changes in consumer preferences, the retailer could incur substantial costs from holding too much slow-moving inventory and too little of a profitable product.
Requirements for Agile MLOps
For AI and ML initiatives to keep generating value for the enterprise, MLOps needs to go beyond getting any one model into deployment. What do model operations look like when they are responsive to fast changes in the environment? What is required to facilitate rapid iteration in ML?
- Data scientists are alerted as soon as models start to drift so they know they need to retrain the model based on the latest data.
- The data scientists can home in on the source of drift: is this from data inputs drifting away from expectations or are model outputs drifting outside of established benchmarks?
- Data scientists are able to validate that their updated model works against production data since you can’t assume a newer model automatically means a better model.
- And finally, data scientists and ML engineers work together to quickly undeploy the obsolete model and replace it with the updated one without disrupting the business. That is, every downstream system that depends on the model’s outputs keeps receiving them during the swap.
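To make the first two requirements concrete, here is a minimal sketch of separating input drift from output drift. It uses the Population Stability Index (PSI), one common drift score, chosen here purely for illustration. The function names, the feature names in the usage note, and the 0.2 alert threshold (a widely used rule of thumb) are all assumptions, not something prescribed above.

```python
import math
from typing import Dict, Sequence


def psi(baseline: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between two samples of one variable.

    Bin edges come from the baseline range; values outside that range
    are clipped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1
        # Floor each proportion at eps so the log below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_report(baseline_inputs: Dict[str, Sequence[float]],
                 current_inputs: Dict[str, Sequence[float]],
                 baseline_outputs: Sequence[float],
                 current_outputs: Sequence[float],
                 threshold: float = 0.2) -> dict:
    """Score every input feature and the model output separately.

    `threshold` is an assumed alerting cutoff, not a universal constant.
    """
    input_scores = {name: psi(baseline_inputs[name], current_inputs[name])
                    for name in baseline_inputs}
    output_score = psi(baseline_outputs, current_outputs)
    return {
        "drifted_inputs": sorted(n for n, s in input_scores.items()
                                 if s > threshold),
        "output_drifted": output_score > threshold,
        "input_scores": input_scores,
        "output_score": output_score,
    }
```

Calling `drift_report` with training-time samples as the baseline and a recent production window as the current data yields a list of drifted input features alongside a separate flag for the outputs, which is the distinction a data scientist needs in order to decide where to look first.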
How Wallaroo Enabled a Top-10 Global Bank to Update Machine Learning Models Faster
Recently we worked with a financial services firm that uses hundreds of different machine learning models to analyze over a billion daily network events on its site and apps in order to detect malicious traffic. These models need to be retrained frequently because malicious actors continuously adjust their tactics. Unfortunately, replacing an outdated model could take weeks with the firm’s existing MLOps processes. Using Wallaroo, they gained instant model insights to detect anomalies in real time instead of in batch, as well as the ability to deploy new models in seconds, not weeks.
What was unique about the Wallaroo platform that enabled their ML program to become agile and iterative instead of slow and inflexible?
- Our model insights and observability provided immediate alerts to data scientists as soon as a model started to drift, whether in its inputs or its outputs.
- Our workspaces paradigm made it easy, while remaining secure and compliant, for data scientists and ML engineers to collaborate on the handoff, deployment, and ongoing monitoring of models in production to ensure live models were still accurate.
- Wallaroo’s support for various testing frameworks made it faster to test different versions of a model against real-world data to ensure optimal accuracy before rolling one out live and affecting the business.
- And finally, Wallaroo’s model pipeline architecture made it possible to replace an obsolete model with an updated one in seconds without affecting downstream systems.
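The last two points, validating a candidate against real-world data and then swapping it in without disturbing downstream consumers, can be sketched in generic terms. This is not Wallaroo’s API or its pipeline implementation. `ModelSlot`, `promote_if_better`, and the choice of mean absolute error as the comparison metric are hypothetical names and assumptions used only to illustrate the pattern.

```python
import threading
from typing import Callable, Sequence

# A "model" here is any callable mapping one input row to a prediction.
Model = Callable[[Sequence[float]], float]


def mean_abs_error(model: Model, rows: Sequence[Sequence[float]],
                   labels: Sequence[float]) -> float:
    """Average absolute error of `model` on labeled rows."""
    return sum(abs(model(r) - y) for r, y in zip(rows, labels)) / len(labels)


class ModelSlot:
    """Stable entry point for downstream systems.

    Callers only ever use predict(); the model behind it can be
    replaced while they keep calling.
    """

    def __init__(self, model: Model) -> None:
        self._model = model
        self._lock = threading.Lock()

    def current(self) -> Model:
        with self._lock:
            return self._model

    def predict(self, row: Sequence[float]) -> float:
        return self.current()(row)

    def swap(self, new_model: Model) -> Model:
        """Install `new_model` and return the one it replaced."""
        with self._lock:
            old, self._model = self._model, new_model
        return old


def promote_if_better(slot: ModelSlot, challenger: Model,
                      rows: Sequence[Sequence[float]],
                      labels: Sequence[float],
                      min_improvement: float = 0.0) -> bool:
    """Swap in `challenger` only if it beats the live model on recent
    labeled production data by more than `min_improvement`."""
    champion_error = mean_abs_error(slot.current(), rows, labels)
    challenger_error = mean_abs_error(challenger, rows, labels)
    if challenger_error < champion_error - min_improvement:
        slot.swap(challenger)
        return True
    return False
```

Because downstream code holds a reference to the slot rather than to a specific model, a promotion changes what `predict` returns without any caller needing to reconnect or redeploy. A retrained model that fails the comparison is simply never installed.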
Taken together, we refer to these capabilities as our ML Model Operations Center. By bringing these disparate functions into a single hub, our client was able to iterate faster and ensure their hundreds of models continued to deliver business value. Our job is never done, either: we continue to streamline more processes in model management so that the barriers to operationalizing AI are low enough for any part of the organization to adopt AI for its use cases.
You can test Wallaroo’s Operations Center for free using our Wallaroo Community Edition. You can also drop us a line to speak to any of our AI/ML specialists about the blockers you are encountering putting your machine learning to work.