Working Smarter with Machine Learning Model Chains
Software developers have long sliced their code into modular, reusable parts that they can plug into any application, instead of writing everything from scratch for every new context. To achieve that level of efficiency with machine learning (ML), you'll need to do the same with your ML workflows.
Model chaining (or pipelining) is the process of splitting an ML workflow into independent parts, so you can reuse those parts in other workflows and build new ones faster and more easily. Chains don't have to be linear: the steps can be combined in any DAG (directed acyclic graph) that produces a final outcome. Model stacks are one example, where a final step combines the results of multiple sub-learners into a single result.
Consider the scenario of breaking a video down into images for image classification. Once you've extracted the images as output, you can reuse that output across different parts of the post-production process: send it to a model that applies color correction and, at the same time, to another model that steadies the camera during jumpy scenes. The results then pass through a final model that recombines them into a video.
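To make the shape of that workflow concrete, here's a minimal sketch of the fork-and-merge DAG just described. All the function names (`split_frames`, `color_correct`, `stabilize`, `recombine`) are hypothetical stand-ins for real models:

```python
from typing import List

def split_frames(video: str) -> List[str]:
    # Hypothetical stand-in: a real model would decode the video into frames.
    return [f"{video}-frame-{i}" for i in range(3)]

def color_correct(frames: List[str]) -> List[str]:
    return [f + "+color" for f in frames]

def stabilize(frames: List[str]) -> List[str]:
    return [f + "+stable" for f in frames]

def recombine(a: List[str], b: List[str]) -> str:
    return f"video({len(a)} corrected frames, {len(b)} stabilized frames)"

def run_video_chain(video: str) -> str:
    frames = split_frames(video)             # step 1: video -> images (the fork point)
    corrected = color_correct(frames)        # branch A: color correction
    stabilized = stabilize(frames)           # branch B: runs off the same output
    return recombine(corrected, stabilized)  # merge: the DAG's final node

print(run_video_chain("raw.mp4"))
```

Because the two branches depend only on `split_frames`'s output, either one can be optimized or swapped without touching the other.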
Model chaining is what makes it possible to get this done faster and at scale, while also enabling you to optimize or swap out a specific model without compromising the entire workflow.
Getting model chaining right, however, can be challenging and expensive. But with the right approach (and the right AI/ML platform), you can remove a lot of the painstaking work and instead shift that time and effort over to higher-value tasks.
To better acquaint you with model chaining, here’s an introduction to how it works, the main benefits and challenges, and how you can streamline ML model chaining at speed and at scale.
Understanding the basics of model chaining
Model chains are essentially two or more models that execute sequentially. At its simplest level, the model chaining process looks something like this:
- Load two ML models
- Bind the input to the first model
- Feed the first model's output to the second model as its input
- Take the second model's output as the final output of the whole chain
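In code, that four-step recipe is just composition. Here is a minimal, runnable sketch using two scikit-learn models purely as stand-ins for the chain's links (any two models would do):

```python
# A minimal two-model chain: model A's output becomes model B's input.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

model_a = PCA(n_components=2).fit(X)        # load (here: fit) the first model
model_b = LogisticRegression(max_iter=200)  # load the second model

intermediate = model_a.transform(X)         # bind the input to the first model
model_b.fit(intermediate, y)                # first model's output feeds the second

chain_output = model_b.predict(model_a.transform(X))  # the chain's final output
print(chain_output[:5])
```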
For example, say you're building a chain that predicts sales. You have a set of features that can serve as input for the first model (like weather, prices, shape, and color) that may influence whether a product sells. The first model encodes the elasticities: how sensitive the predicted sales are to a change in each of those inputs.
Knowing what variables will influence sales is great. But what happens when you want to take this a step further and make decisions that will increase sales, rather than simply predict them?
For this, the second model builds on the first: it takes the learned relationship between changes in features and changes in sales, and searches over the inputs you actually control (such as price) for the values that maximize the predicted outcome. When you introduce this optimization stage, you can make more informed decisions based on the data you have available.
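Here is a toy illustration of that predict-then-optimize chain on synthetic data. The feature names are hypothetical, and the second step is a simple grid search that maximizes predicted revenue rather than raw sales:

```python
# Step one learns how features drive sales; step two searches over a
# controllable feature (price) for the decision that maximizes revenue.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
price = rng.uniform(5, 15, 200)
temperature = rng.uniform(10, 30, 200)
sales = 100 - 4 * price + 1.5 * temperature + rng.normal(0, 2, 200)

X = np.column_stack([price, temperature])
predictor = LinearRegression().fit(X, sales)   # step 1: encodes the elasticities

def optimize_price(temp, candidates=np.linspace(5, 15, 101)):
    # Step 2: feed many candidate decisions through step 1, keep the best.
    X_cand = np.column_stack([candidates, np.full_like(candidates, temp)])
    revenue = candidates * predictor.predict(X_cand)
    return candidates[np.argmax(revenue)]

print(f"Best price at 20 degrees: {optimize_price(20.0):.2f}")
```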
The benefits of model chaining
In the above example, you could run the whole thing as a single large model instead of breaking it into two components. However, separating them makes it easier to optimize or swap out a step without breaking something else downstream.
This scenario highlights the advantages of chaining independent models together rather than relying on just one long, complex pipeline. To give you a better picture, here’s a deeper dive into the main benefits of model chaining:
Modularity
Having a repository of pre-built models lets you assemble different use cases from the same models, almost like Lego pieces. You can call just the parts of the pipeline you need, when you need them, and plug past models into new pipelines without repeating development work.
Another enormous advantage of this modular setup is the ability to execute the steps in an ML workflow at different speeds or times. For example, if you chain two independent models together, you can do the following (sketched in code after the list):
- Run Step One well in advance of Step Two.
- Store and reuse the output of Step One without incurring the costs of re-running it.
- Experiment with many variants of Step Two to implement the version that produces the most accurate results.
- Create a fork: the output of Step One might be useful in an entirely different context.
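A minimal sketch of the first three points, with a cheap placeholder standing in for an expensive Step One:

```python
# Run Step One well in advance, store its output, and reuse it while you
# experiment with Step Two variants. The steps here are hypothetical stand-ins.
import os
import pickle

CACHE = "step_one_output.pkl"

def step_one(raw):
    # Imagine an expensive model here (feature extraction, embeddings, ...).
    return [x * 2 for x in raw]

if os.path.exists(CACHE):
    with open(CACHE, "rb") as f:
        features = pickle.load(f)      # reuse stored output, no re-run cost
else:
    features = step_one(list(range(5)))
    with open(CACHE, "wb") as f:
        pickle.dump(features, f)       # store for later runs, and for forks

# Try several Step Two variants against the same cached Step One output.
for name, step_two in {"sum": sum, "max": max, "count": len}.items():
    print(name, step_two(features))
```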
In addition, model chaining is particularly useful when you have a number of models that each require specific inputs to work correctly, since a lightweight adapter step can transform outputs on the fly to fit each model's expectations. It also simplifies model versioning: each model is managed as a single service, and any workflow that calls it automatically picks up your updates.
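Such an adapter can be a tiny step of its own. In this sketch (shapes and field names are hypothetical), it reshapes one model's JSON-style output into the array the next model expects:

```python
# A small adapter step: transform one model's output on the fly so it fits
# the next model's expected input.
import numpy as np

def upstream_model(batch):
    # Pretend output: one dict per item, as many model services return.
    return [{"score": float(x) / 10, "label": "ok"} for x in batch]

def adapter(outputs):
    # The downstream model expects a 2-D float array, so reshape on the fly.
    return np.array([[o["score"]] for o in outputs])

def downstream_model(X):
    return (X > 0.5).astype(int).ravel()

raw = upstream_model(range(10))
print(downstream_model(adapter(raw)))
```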
Optimization
With a single monolithic model, retraining any aspect of it means retraining the entire thing. This is both time-consuming and highly error-prone, since everything is inextricably linked to everything else. With model chaining, you can optimize each segment on its own while preserving the integrity of the pipeline as a whole.
This also lets different teams safely work on different segments of the chain without stepping on each other's toes. And once you've identified the models used most across the organization, you can tune their deployment for runtime or keep them warmed up in advance to avoid cold starts.
Language and framework independence
Model chaining removes the need to use one language and framework for an entire monolithic ML pipeline. This matters because it's a big reason so much ML never reaches production: models often have to be painstakingly re-engineered to fit the production stack (frequently a Java-based one) before they can ship.
Fortunately, chaining models together through APIs means you're free to build in the language of your choice, whether that's Python or R, with the frameworks you already know. This speeds up model development and lets you simply connect the models in your ML pipeline, playing to the strengths of your usual stack.
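For instance, once each model sits behind an HTTP endpoint, the chain doesn't care what language serves it. A minimal sketch (the URLs are hypothetical, and the services would need to exist for this to run end to end):

```python
# Chaining two models over HTTP: each service can be written in any language
# (Python, R, Java, ...) as long as it speaks JSON.
import requests

def call_model(url, payload):
    resp = requests.post(url, json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()

features = {"price": 9.99, "temperature": 21.5}
prediction = call_model("http://models.internal/predict", features)   # e.g. an R service
decision = call_model("http://models.internal/optimize", prediction)  # e.g. a Python service
print(decision)
```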
The challenges of model chaining
For all its benefits, model chaining comes with a few challenges. One notable risk is introducing nested model bias: a downstream model is trained on the upstream model's outputs as they appear in the training environment, so the chain looks accurate in testing but performs poorly in production, where the upstream model's live outputs behave differently.
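One common mitigation for stacked chains (standard practice, not something specific to this article) is to train the downstream model on out-of-fold predictions rather than the upstream model's optimistic in-sample outputs:

```python
# Reducing nested model bias in a two-model stack: the downstream model is
# trained on out-of-fold predictions, which behave like production outputs,
# instead of the upstream model's (optimistic) in-sample predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, random_state=0)

upstream = RandomForestClassifier(random_state=0)
# Out-of-fold probabilities: each row is predicted by a model that never saw it.
oof = cross_val_predict(upstream, X, y, cv=5, method="predict_proba")

downstream = LogisticRegression().fit(oof, y)  # trained on realistic inputs
upstream.fit(X, y)                             # refit upstream on all data for serving
print(downstream.predict(upstream.predict_proba(X))[:10])
```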
Another major challenge of model chaining is that everything needs to fit together exactly right. Think of it as building a tunnel: every section must join seamlessly, or the entire construction becomes unusable.
An example from Microsoft illustrates this hurdle well, outlining an elaborate seven-step process just to create a model chain. Get any step even slightly wrong and the whole chain collapses, because the output of one model no longer works as the input for the next.
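A cheap way to fail fast rather than mid-chain is to check each step's output against the next step's expected interface before wiring them together. A rough sketch (the contract here, shape plus dtype, is a deliberately simple assumption):

```python
# A lightweight contract check between chain steps.
import numpy as np

def check_interface(output, expected_shape, expected_dtype=np.floating):
    arr = np.asarray(output)
    if arr.shape[1:] != expected_shape:
        raise ValueError(f"shape {arr.shape[1:]} != expected {expected_shape}")
    if not np.issubdtype(arr.dtype, expected_dtype):
        raise TypeError(f"dtype {arr.dtype} is not {expected_dtype}")
    return arr

step_one_output = np.random.rand(8, 4)              # pretend model output
validated = check_interface(step_one_output, (4,))  # OK: safe to feed step two

try:
    check_interface(np.zeros((8, 3)), (4,))         # mismatched interface
except ValueError as e:
    print("caught before the chain broke:", e)
```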
The problem is that in a fast-paced field like machine learning, teams rarely have the time (or resources) to meticulously line up models this way, and their effort could be better spent on other tasks, like guarding against nested bias. As more models join the chain, these challenges compound and can turn chaining into a costly operation that fails to deliver real business value.
Simplify model chaining with Wallaroo
The good news is that, as complicated as model chaining might seem, there are tools that can significantly streamline the entire process. One such tool is Wallaroo, an enterprise ML platform for production AI that puts speed and simplicity first so you can turn your data into business results within seconds, not weeks.
Wallaroo reduces model chaining to just a few lines of code, sparing you from dealing with servers, environment setup, and all the headaches that come with it. All you need to do is tell Wallaroo what you want the model to do, and let the platform take care of the rest.
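As a rough idea of what that looks like, here is a sketch modeled on Wallaroo's published SDK tutorials. The method names and signatures below are assumptions and may differ by version, so treat this as illustrative and check the current docs:

```python
# Hypothetical sketch based on Wallaroo's public SDK tutorials; method names
# and signatures are assumptions and may vary across SDK versions.
import wallaroo

wl = wallaroo.Client()                                   # connect to the platform
model_a = wl.upload_model("preprocess", "./prep.onnx")   # assumed upload call
model_b = wl.upload_model("predict", "./model.onnx")

pipeline = wl.build_pipeline("sales-chain")              # a chain is a pipeline of steps
pipeline.add_model_step(model_a)                         # step 1
pipeline.add_model_step(model_b)                         # step 2 receives step 1's output

pipeline.deploy()                                        # serving without server wrangling
result = pipeline.infer_from_file("./input.json")        # assumed inference helper
```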
In our next post, we’ll show you an example of how simple it is to create a 10-step model chain with Wallaroo. In the meantime, if you’re ready to build model chains without the stress, contact us today to get started.