Deploying Models in a Simulated Edge Environment

3 min read · Oct 25, 2022

Edge computing is growing in popularity and capability, opening new ML and AI business opportunities across industries of all types. But what is “the edge”? Machine learning at the edge means running ML models locally, close to the source of the data, to minimize latency and network transport requirements.

However, ML at the edge runs into the same operational challenges as traditional cloud deployments: compute and operational efficiency, scale, flexibility across different workloads, and actionability, meaning getting ahead of issues, tightening the feedback loop, and taking preventive and corrective measures in a timely manner. All of these are blockers to ML value realization.

The Wallaroo Edge stack helps overcome these last-mile issues by enabling deployment on-device, to local servers, and in cloud environments using the same engine and providing the same advanced observability capabilities.

When it comes to deploying and managing ML models on edge devices, Wallaroo provides two key capabilities:

Since the same engine is used in both environments, the model behavior can often be simulated accurately using Wallaroo in a data center for testing prior to deployment. This notebook demonstrates how.
Wallaroo makes edge deployments “observable” so the same tools used to monitor model performance can be used in both kinds of deployments.

In the tutorial below, we will step through testing an edge deployment in the same manner as a non-edge configuration. The primary difference is that, instead of giving the pipeline ample resources for high-throughput operation, we will specify a resource budget matching what is expected in the final deployment. Then we can apply the expected load to the model and observe how it behaves given the available resources.
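To make the idea of a resource budget concrete, here is a minimal sketch in plain Python. It is not the Wallaroo SDK: the `ResourceBudget` class, the `parse_memory` helper, and the figures (150Mi, 100Mi, 300Mi) are all assumptions made for illustration. It only shows how a Kubernetes-style memory quantity can be turned into bytes and compared against what a workload is expected to need.

```python
import re
from dataclasses import dataclass

# Decimal (k, M, G) and binary (Ki, Mi, Gi) suffixes of Kubernetes-style quantities.
_UNITS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
}


def parse_memory(quantity: str) -> int:
    """Convert a string such as '150Mi' or '1Gi' into a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(Ki|Mi|Gi|k|M|G)?", quantity.strip())
    if match is None:
        raise ValueError(f"unrecognized memory quantity: {quantity!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit or ""])


@dataclass(frozen=True)
class ResourceBudget:
    """Hypothetical description of what the target edge device can spare."""
    cpus: float
    memory: str

    def fits(self, required_bytes: int) -> bool:
        """Return True if the requirement is within the memory budget."""
        return required_bytes <= parse_memory(self.memory)


# Illustrative numbers only; they are not taken from the tutorial.
edge_budget = ResourceBudget(cpus=1, memory="150Mi")
print(edge_budget.fits(parse_memory("100Mi")))  # True
print(edge_budget.fits(parse_memory("300Mi")))  # False
```

In a real deployment the budget would be handed to the platform's deployment configuration rather than checked by hand; the point of the sketch is only that the budget is an explicit, testable value.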

You can try this tutorial out for yourself by downloading the free Wallaroo Community Edition, going through the ML Edge Simulation tutorial, and following along with the tutorial video.

We will be using the open source Aloha CNN LSTM model, which classifies domain names as either legitimate or used for nefarious purposes such as malware distribution. This could be deployed on a network router to detect suspicious domains in real time. Of course, it is important to monitor the behavior of the model across all of the deployments so we can see if the detection rate starts to drift over time.
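As a rough illustration of what watching for drift in the detection rate can look like, the sketch below compares the share of domains flagged as malicious in a recent window against a baseline window. This is not Wallaroo's observability tooling; the function names, the 0.05 tolerance, and the sample data are assumptions chosen for the example, and a simple absolute-difference threshold stands in for whatever statistical test a production monitor would use.

```python
from typing import Sequence


def detection_rate(predictions: Sequence[int]) -> float:
    """Fraction of inferences in a window flagged as malicious (1 = flagged)."""
    if not predictions:
        raise ValueError("cannot compute a rate for an empty window")
    return sum(predictions) / len(predictions)


def has_drifted(
    baseline: Sequence[int],
    current: Sequence[int],
    tolerance: float = 0.05,  # hypothetical threshold
) -> bool:
    """Flag drift when the rate moves more than `tolerance` from the baseline."""
    return abs(detection_rate(current) - detection_rate(baseline)) > tolerance


# Made-up data: 10% of domains flagged at baseline, 25% flagged recently.
baseline_window = [0] * 90 + [1] * 10
current_window = [0] * 75 + [1] * 25

print(detection_rate(baseline_window))               # 0.1
print(detection_rate(current_window))                # 0.25
print(has_drifted(baseline_window, current_window))  # True
```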

For our example, we will perform the following:

Create a workspace for our work.
Upload the Aloha model.
Define a resource budget for our inference pipeline.
Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
Run a sample inference through our pipeline by loading a file.
Run a batch inference through our pipeline’s URL, store the results in a file, and find that the original memory allocation is too small.
Redeploy the pipeline with a larger memory budget and attempt sending the same batch of requests through again.
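The last two steps amount to a simple loop: run the batch under a budget, and if it fails, redeploy with more memory and try again. The sketch below captures that loop in plain Python. Everything in it is a stand-in: `OutOfMemory`, `find_working_budget`, `fake_run_batch`, and the budget values are hypothetical and do not come from the Wallaroo SDK. In practice `run_batch` would deploy the pipeline with the given memory budget and send the batch of requests.

```python
from typing import Callable, Iterable, Optional


class OutOfMemory(Exception):
    """Hypothetical error for a deployment that exceeds its memory budget."""


def find_working_budget(
    run_batch: Callable[[str], int],
    candidate_budgets: Iterable[str],
) -> Optional[str]:
    """Try each memory budget in order; return the first that serves the batch."""
    for memory in candidate_budgets:
        try:
            served = run_batch(memory)
        except OutOfMemory:
            print(f"{memory}: batch failed, trying a larger budget")
            continue
        print(f"{memory}: served {served} requests")
        return memory
    return None


def fake_run_batch(memory: str) -> int:
    """Stand-in for 'deploy with this budget and send the batch'."""
    if memory == "150Mi":  # illustrative: pretend the small budget is too tight
        raise OutOfMemory(memory)
    return 1000  # number of requests served


chosen = find_working_budget(fake_run_batch, ["150Mi", "300Mi"])
print(chosen)  # 300Mi
```

Keeping the candidate budgets in an ordered list makes the experiment repeatable: the same batch is replayed against each budget until one holds up.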

All sample data and models are available through the Wallaroo Quick Start Guide Samples repository.

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Note that this example is not intended for production use; it is meant as an example of running Wallaroo in a resource-constrained environment. The environment is based on the Wallaroo AWS EC2 Setup guide.

For full details on how to configure a deployment through the SDK, see the Wallaroo SDK guides.

Operating in a Simulated Edge Environment

Step 1: Connect to Wallaroo…

The full version of this tutorial and others are available on our blog at wallaroo.ai/blog

Written by Wallaroo.AI
