How to Scale from One Model to Thousands
When an organization puts its first few models into production, it’s easy to manage them individually, as pets, if you will. The process of going from a trained model to usefully running the model against production data might be cumbersome, but at least it is doable.
As an organization sees financial success from its first models, expect demand to grow rapidly. Whether by expanding into adjacent business areas, finding new ways to optimize existing models, or hyper-segmenting models around fine-grained information, an unsuspecting data science team can quickly go from one or ten models in production to needing to deploy hundreds or thousands.
Wallaroo gives you all the tools you need to efficiently scale up your machine learning infrastructure as far as you need it to go.
Hand-Built Deployment Solutions
The data scientist’s primary role is to design, build, and validate the best models to solve business needs. The skills and knowledge needed to accomplish this are not necessarily the skills needed to put those models into operation, especially at large scale. To get their models into production, and to monitor them in the production stack, companies often have to dedicate substantial technical resources to designing and maintaining an ad-hoc deployment process. Resources spent on this effort are resources not being spent on the company’s core mission.
The Wallaroo platform is a ready-made, easy to install solution that streamlines the process of deploying and monitoring models in production. This allows your teams to focus their resources where they matter the most: the business.
The Wallaroo Approach
The Wallaroo platform focuses on the ML last mile: model deployment and monitoring. The platform runs in your environment: on-prem, edge, or cloud. It integrates with your data ecosystem, and it allows your data scientists to develop models with the tools that they prefer. Unlike an all-in-one MLOps platform, we fit your process — not the other way around.
The Data Scientist’s Self-Service Deployment Toolkit
With ad-hoc model deployment processes, it can take weeks or even months for a single model to go from development into production. Wallaroo’s deployment toolkit provides an easy-to-use SDK, API, and UI that let a data scientist specify their model pipeline — including any necessary data pre- or post-processing — and then deploy that pipeline into a staging or production environment in seconds, with just a line or two of Python. By reducing model deployment time, teams can get more models into production and iterate faster to improve them.
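For illustration, here is a minimal sketch of what that workflow can look like with the Wallaroo SDK. The model name, file path, and pipeline name are placeholders, and the exact method signatures may vary by SDK version, so treat this as an outline rather than copy-paste code:

```python
import wallaroo

# Connect to the Wallaroo instance (assumes credentials are already configured)
wl = wallaroo.Client()

# Upload a trained model artifact -- the name and file path are placeholders
model = wl.upload_model("ccfraud-model", "./models/ccfraud.onnx")

# Assemble a pipeline: optional pre/post-processing steps plus the model itself
pipeline = wl.build_pipeline("ccfraud-pipeline")
pipeline.add_model_step(model)

# Deploy to a staging or production environment and run a test inference
pipeline.deploy()
result = pipeline.infer_from_file("./data/sample_batch.json")

# Free up resources when finished
pipeline.undeploy()
```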
The Wallaroo Compute Engine
Wallaroo’s distributed compute engine was purpose-built for fast, resource-efficient inference. Not only does the high-performance engine let you do more inferencing with fewer resources, but its auto-scaling features automatically tailor resources to variations in load. When switching to Wallaroo, it’s not unusual for our customers to see a 5 to 12X improvement in inference speed while using as much as 80% less infrastructure than before.
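As a rough sketch of how that auto-scaling is exposed to the user, a deployment configuration along these lines can set replica bounds and per-replica resources. The import path and builder method names here are assumptions based on the public SDK and may differ in your version:

```python
from wallaroo.deployment_config import DeploymentConfigBuilder

# Hypothetical auto-scaling deployment configuration:
# scale between 1 and 5 replicas depending on load,
# with modest CPU and memory per replica.
deployment_config = (
    DeploymentConfigBuilder()
    .replica_autoscale_min_max(minimum=1, maximum=5)
    .cpus(1)
    .memory("1Gi")
    .build()
)

# The configuration is then supplied at deploy time, e.g.:
# pipeline.deploy(deployment_config=deployment_config)
```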
So not only can you more easily deploy more models than before, you can do it at a lower cost.
Model Management and Observability
Once the models are in production, Wallaroo provides comprehensive observability and tracking of those models:
- Detailed event logs and full audit logs, to support performance monitoring and compliance.
- Data validation checks to help guard your models against unexpected data issues.
- Advanced model insights to monitor model outputs and inputs for data drift or concept drift that might affect your models’ performance.
- Configurable alerting capabilities to quickly catch any acute problems in production.
- A Model Registry that tracks the models (and versions of models) that have been deployed, by whom, and where they are being used.
Data scientists and ML Engineers can keep tabs on deployed models in real-time via the Wallaroo dashboard, or by exporting log data to the tools of their choice.
When needed, models can be easily updated or rolled back, via the Wallaroo SDK/API.
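As a hypothetical sketch of that day-two workflow, continuing from the earlier deployment example (the log-retrieval and step-replacement method names are assumptions and may differ from the current API):

```python
# Pull recent inference logs into a DataFrame for ad-hoc analysis or export
logs = pipeline.logs(limit=1000)
print(logs.head())

# Swap a newer model version into a given pipeline step and redeploy.
# replace_with_model_step and the step index are assumptions for illustration.
new_model = wl.upload_model("ccfraud-model", "./models/ccfraud_v2.onnx")
pipeline.replace_with_model_step(0, new_model)
pipeline.deploy()

# Rolling back is the same operation with the previous model artifact
pipeline.replace_with_model_step(0, model)
pipeline.deploy()
```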
Streamline the ML Deployment Process
Wallaroo enables Data Scientists and ML Engineers to deploy enterprise-level AI into production more simply, faster, and with far greater efficiency. Our platform provides powerful self-service tools, a purpose-built ultrafast engine for ML workflows, observability, and an experimentation framework. Wallaroo runs in cloud, on-prem, and edge environments while reducing infrastructure costs by as much as 80 percent.
Wallaroo’s unique approach to production AI gives any organization the desired fast time-to-market, audited visibility, scalability — and ultimately measurable business value — from their AI-driven initiatives, and allows Data Scientists to focus on value creation, not low-level “plumbing.”
Want to Hear More?
If your business has been struggling with its ML deployment process, Wallaroo may be the solution for you. Email us at deployML@wallaroo.ai.
You can find more information about Wallaroo at https://www.wallaroo.ai/blog.