Edge machine learning deployment architecture on Wallaroo
The challenges of last-mile machine learning are magnified at the edge:
- How do you deploy a model to an environment that might not have consistent or any connectivity?
- How do you run that model efficiently in power- and compute-constrained environments?
- How do you monitor the ongoing accuracy of predictions in a live environment?
- How do you manage versioning to make sure all devices have the latest model?
- How do you run A/B tests or stage experiments on a subset of locations or devices to validate before rolling out to all clients?
But edge ML deployment is more than just loading some code onto a device. MLOps teams need to consider how four different environments interact with each other:
- The model development environment (for example, a notebook) that produces the model artifact
- The model registry managing which models go to which devices
- The “fat edge” environment where IoT devices are managed — not just the ML models but everything that goes with fleet management: software updates, security, and the data flowing into and out of the IoT devices
- The IoT device itself, consisting of sensors picking up external data and software (including the ML model) running on constrained compute, power, and (possibly) connectivity.
As purpose-built installed software for any environment, Wallaroo is uniquely positioned to help deploy, manage, and run ML models at the edge. Here is how Wallaroo works at each of these stages:
1. Model training:
Check out this blog post from Databricks for an in-depth discussion of IoT architectures for training ML models. Wallaroo itself is agnostic to where and how you develop your models: it takes nearly any model artifact resulting from training and converts it to ONNX so it runs efficiently.
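Conceptually, that conversion step is a dispatch on the source framework. The sketch below illustrates the idea with a hypothetical mapping from artifact type to ONNX converter; the converter names and the function are illustrative stand-ins, not Wallaroo's actual API:

```python
# Hypothetical sketch: pick an ONNX conversion path based on the model artifact.
# The CONVERTERS mapping is illustrative, not Wallaroo's internal logic.
from pathlib import Path
from typing import Optional

CONVERTERS = {
    ".pkl": "skl2onnx",    # scikit-learn pickles
    ".pt": "torch.onnx",   # PyTorch checkpoints
    ".h5": "tf2onnx",      # Keras/TensorFlow models
    ".onnx": None,         # already ONNX; no conversion needed
}

def converter_for(artifact: str) -> Optional[str]:
    """Return the converter a given model artifact would need, or None if already ONNX."""
    suffix = Path(artifact).suffix.lower()
    if suffix not in CONVERTERS:
        raise ValueError(f"unsupported artifact type: {suffix}")
    return CONVERTERS[suffix]
```

Normalizing everything to ONNX at this stage is what lets the same runtime serve models from different training stacks on constrained edge hardware.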
2. Model registry:
At this stage, the Wallaroo model registry manages pushing out the latest model to each device and helps with experimentation (A/B/… testing, shadow deployments). Developers can choose to use either the Wallaroo UI as the central hub and dashboard for automation and managing the models, pipelines, and logs, or they can use Wallaroo’s SDK to manage through their portal of choice.
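One common way an experiment like an A/B test can be split across a fleet is to hash each device ID into a stable bucket, so assignment needs no central state. This is a minimal sketch of that idea, not Wallaroo's actual rollout mechanism:

```python
# Sketch: deterministic A/B assignment of edge devices to model variants.
# Hashing the device ID gives every device a stable, repeatable bucket.
import hashlib

def assign_variant(device_id: str, treatment_pct: int = 10) -> str:
    """Stably assign a device to the 'candidate' or 'control' model by hashed ID."""
    digest = hashlib.sha256(device_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # 0-99, roughly uniform across device IDs
    return "candidate" if bucket < treatment_pct else "control"
```

Because the assignment is a pure function of the device ID, a registry can recompute it anywhere — in the cloud, on the fat edge, or on the device itself — and always agree on which model a device should run.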
3. Data and edge device aggregation:
In cases where latency matters, as in smart factories, you will usually have on-prem servers for near-field, low-latency device management and orchestration. These servers also aggregate and clean the telemetry data coming back from each client before sending it on to cloud storage. Enterprises often select vendors like EdgeX Foundry and Litmus that focus solely on device management. The Wallaroo instance (Wallaroo Standalone) is installed on this edge server to handle ML model observability, providing model insights (drift detection, experimentation results, etc.) that it feeds back to whatever you are using for model training to flag when retraining is needed. For IoT use cases like autonomous vehicles, where there is no fat-edge compute, these aggregation functions can take place in the cloud.
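To make the drift-detection idea concrete, here is a minimal sketch of one common check the aggregation layer could run on telemetry: comparing recent feature values against the training-time baseline with the Population Stability Index (PSI). The function and thresholds are illustrative assumptions, not Wallaroo's implementation:

```python
# Sketch: Population Stability Index (PSI) for one feature, comparing the
# distribution of recent telemetry against the training-time baseline.
import math

def psi(baseline: list, recent: list, bins: int = 10) -> float:
    """PSI between two samples of one feature; larger values mean more drift."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0

    def hist(values):
        counts = [0] * bins
        for v in values:
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-4) for c in counts]

    b, r = hist(baseline), hist(recent)
    return sum((ri - bi) * math.log(ri / bi) for bi, ri in zip(b, r))
```

A common rule of thumb is that PSI above roughly 0.2 signals meaningful drift worth flagging for retraining; in practice you would tune the threshold per feature.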
4. Wallaroo Edge:
You can deploy our edge ML solution over a network connection or fully air-gapped: simply load the software on a USB stick or other removable drive and send a technician to set up the Wallaroo Edge node on-site. From there, the edge node takes in sensor data from the IoT device, runs ML inference efficiently, and feeds the results to the application (e.g., passing computer vision results to the robot arm on the assembly line so it knows when something coming off the line does not meet QA).
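The sense-infer-act loop described above can be sketched as follows; the model and decision logic here are stand-in stubs for illustration, not Wallaroo's runtime API:

```python
# Sketch of the edge-node loop: score a sensor frame and tell the downstream
# application (e.g., a robot arm) what to do. The scoring function is a stub
# standing in for an ONNX inference call.
def run_defect_model(frame: list) -> float:
    """Stub for the ONNX inference call; returns a defect probability."""
    return sum(frame) / len(frame)  # placeholder scoring logic

def inspect(frame: list, threshold: float = 0.5) -> str:
    """Decide whether the assembly-line application should pass or reject this part."""
    score = run_defect_model(frame)
    return "reject" if score >= threshold else "pass"
```

The key property for constrained devices is that everything in this loop runs locally: inference and the pass/reject decision never depend on connectivity, while telemetry is shipped upstream opportunistically.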
The above is our overall vision for how to deploy machine learning at the edge with Wallaroo. Of course, each edge deployment, even more so than cloud or on-prem, will have its own unique needs and limitations. Email us at deployML@wallaroo.ai to see how we can improve how your models are managed and run at the edge.