How Can CPGs Combat ‘The Big Drift’ Putting Their Business At Risk?
It would not be an exaggeration to say that Retail & Consumer Packaged Goods (CPG) businesses are driven by “forecasts”. Forecasts determine the products and quantities that get put on the manufacturing floor. The products manufactured determine what gets distributed to the different stores, and that in turn determines what gets stocked on shelves. If the forecasts are wrong, consumers will not find the product they want, leading to shelves filled with slow-moving inventory and, more importantly, the potential loss of customers as they substitute the product with a competitor’s.
In my own experience as CTO, board member, and advisor at CPG companies, the delay between forecast and stocking on shelves was 90 days. In other words, human beings (aka sales teams) had to guess what a group of shoppers in a certain region would want three months into the future.
Historically, CPG forecasts were based on the previous year’s sales, layered with human intelligence on local factors that might push demand up or down. Recently, however, CPGs have embraced AI and machine learning to build more precise models that integrate additional factors like weather, social trends, and supply chain data. While these AI-based models can be much more precise than human analysts, they are also prone to hidden biases that can cause them to diverge from current customer preferences.
Recently we spoke to the head of supply chain analytics at a global CPG brand. He mentioned how, last quarter, one of their AI models produced a demand prediction that was off by more than 50% from the previous year’s. The magnitude of the “drift” caused the manufacturing floor to question the model. When they dug deeper, they saw that the model had been thrown off by the massive shift in demand during COVID. Now that consumption had started to shift back to normal, the model was failing to weight demand signals from the past few weeks as heavily as overall demand from the last two and a half years.
This gap between a model’s predictions and reality is called “drift”. CPG companies can easily have millions of such ML models, given the number of products and markets they operate in. So not only is drift a huge problem for any single model, but with millions of models in production, the problem is multiplied millions of times.
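The recency problem behind that story can be illustrated with a simple exponentially weighted average, which discounts older observations so recent weeks dominate the estimate. The weekly demand figures below are hypothetical, invented purely for illustration:

```python
# Hypothetical weekly demand (units) for one SKU: inflated pandemic-era
# demand followed by a recent return toward the pre-COVID baseline.
weekly_demand = [900, 910, 920, 905, 915, 600, 590, 610, 605]

def simple_average(series):
    """Unweighted mean: treats a 2.5-year-old week the same as last week."""
    return sum(series) / len(series)

def exp_weighted_average(series, alpha=0.5):
    """Exponentially weighted mean: each step blends the new observation
    with the running estimate, so old weeks fade out geometrically."""
    forecast = series[0]
    for x in series[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

print(round(simple_average(weekly_demand)))        # -> 773, skewed high by old data
print(round(exp_weighted_average(weekly_demand)))  # -> 623, tracks recent weeks
```

A model anchored to the unweighted history keeps forecasting near the pandemic-era level; the recency-weighted estimate follows demand back down.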
The data scientists at such CPGs are great at building very precise models (the first mile of ML operations), but what they also need is “an ML control room” that makes it easy to:
- Detect when a model is starting to drift
- Test models against the latest real-world conditions
- Replace an outdated model with a better one very quickly
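As a sketch of the first capability, one simple way to detect drift is to compare forecast error over a recent window against the error the model showed at validation time. The threshold, metric choice, and numbers here are illustrative assumptions on my part, not any particular product’s implementation:

```python
def mean_abs_pct_error(actuals, forecasts):
    """MAPE over paired actual/forecast values."""
    return sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)

def is_drifting(actuals, forecasts, baseline_mape, tolerance=2.0):
    """Flag drift when recent error exceeds the validation-time error
    by more than `tolerance` times (a simple, illustrative rule)."""
    return mean_abs_pct_error(actuals, forecasts) > tolerance * baseline_mape

# Model validated at 5% MAPE; recent weeks show far larger errors because
# the model is still predicting at COVID-era demand levels.
recent_actuals   = [600, 590, 610, 605]
recent_forecasts = [900, 910, 880, 905]
print(is_drifting(recent_actuals, recent_forecasts, baseline_mape=0.05))  # True
```

A real control room would run a check like this continuously for every live model and alert the team, rather than waiting for the manufacturing floor to notice.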
An ML Control Room Built For Scale
When do you need a control room for your machine learning? First, let us step back and ask: what is an ML control room?
An ML control room has many functions, but overall it is the single dashboard that data teams can reference to oversee all their models in development, testing, and production. Conventional wisdom is that you don’t need a control room until you reach a certain scale, as measured by the number of models in production. Before that, your ML engineers and data scientists can manage models manually using a basic model repository.
However, as we’ve started working with more enterprise teams, we see that it’s not the number of models that drives the need for a machine learning control room but rather the complexity of the models:
- First is the complexity of getting models online and offline. How are data scientists handing models over to ML engineers for testing? How do they manage versioning to deploy the latest and greatest model while undeploying an outdated one, especially when that model is part of a larger chain (for example, a segmentation model that feeds a larger dynamic pricing model)?
- Next is the complexity around all the permutations of a single model. A single demand forecasting model can have many permutations per product line and region, with additional versions for testing and optimization. How easy is it for the team to search all models to find the one they’re looking for, check its status, and move it from testing to production?
- Then there is the complexity of optimizing compute resources across data scientists and use cases. An ML control room enables different data scientists and use cases to share resources instead of standing up dedicated pipelines that can suck up resources even when they aren’t in use.
- And finally there is ongoing monitoring, testing, and troubleshooting to optimize the performance of live models. A proper ML control room provides full model observability to ensure you have the best performing model in production.
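To make the versioning and search problems above concrete, here is a minimal in-memory sketch of a model registry with lifecycle stages. The class, field, and stage names are my own illustrations, not any particular product’s API:

```python
from dataclasses import dataclass

@dataclass
class ModelRecord:
    name: str        # e.g. "demand_forecast"
    region: str      # e.g. "us-northeast"
    version: int
    stage: str = "development"  # development -> testing -> production -> retired

class ModelRegistry:
    """Tiny illustrative registry: register, search, and promote models."""
    def __init__(self):
        self._models = []

    def register(self, name, region, version):
        record = ModelRecord(name, region, version)
        self._models.append(record)
        return record

    def search(self, name=None, region=None, stage=None):
        """Filter models by any combination of name, region, and stage."""
        return [m for m in self._models
                if (name is None or m.name == name)
                and (region is None or m.region == region)
                and (stage is None or m.stage == stage)]

    def promote(self, record, stage):
        """Advance one version; retire any older production version first."""
        if stage == "production":
            for m in self.search(name=record.name, region=record.region,
                                 stage="production"):
                m.stage = "retired"
        record.stage = stage

registry = ModelRegistry()
v1 = registry.register("demand_forecast", "us-northeast", 1)
registry.promote(v1, "production")
v2 = registry.register("demand_forecast", "us-northeast", 2)
registry.promote(v2, "testing")
registry.promote(v2, "production")   # v1 is automatically retired
print([(m.version, m.stage) for m in registry.search(name="demand_forecast")])
```

Even this toy version shows why manual bookkeeping breaks down: with many permutations per product and region, promoting one version while cleanly retiring its predecessor has to be automatic, not tribal knowledge.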
What does your forecasting look like? Are you able to pivot quickly based on what your customers and supply chain are telling you in real time? Do you have a good line of sight into how your models are performing on an ongoing basis down to the regional and product level? Overall, do you feel you have a good platform for managing all your models, whether they are in the testing or production phase?
Recently, I came across a company named Wallaroo Labs. Microsoft has invested in the company to help create the ML control room for businesses that need it.
I would love to hear from you. Obviously, I see Wallaroo as an answer to these problems in the last mile of ML, but I want to make this a conversation and learn from you about how you are building a control room for your model operations.
About Manish Sinha
Manish is a special advisor to Wallaroo and an award-winning CIO with global, multi-industry experience. He is a winner of the CIO of the Year and Peer CIO awards and was nominated by peer CIOs to the Wall Street Journal CIO Forum in the USA. He has served on the senior IT leadership teams of L’Oréal, ANSYS, UBS, Yahoo, and Microsoft, with a broad range of experience spanning infrastructure, artificial intelligence (AI), cloud solutions, and business applications. He was recognized by PwC and the ANSYS board for delivering multiple years’ worth of information security enhancements in 18 months, and has used AI, Master Data Management (MDM), and enterprise monitoring to predict incidents and prevent outages. He is active in giving back to the IT community through advisory councils (ServiceNow and Microsoft), VC advisory forums, writing publications, and speaking at global conferences such as Economic Times and National CIO Review.