The conventional approach to operationalizing machine learning involves data scientists building a model and then handing that model over to ML engineers to deploy into production.
The problem with this waterfall approach is that ML models do not lend themselves to clean handoffs from data scientist to engineer. ML models require constant monitoring and tweaking even in production, because the world continually changes. For example, an e-commerce retailer will need to continually update its recommendation engines to reflect changing consumer preferences or competitor offers. Operationalizing ML models is a cycle of feedback and optimization, requiring close collaboration among data engineers, data scientists, and ML engineers.
Ensuring that models remain relevant to real-world conditions therefore requires model operations that:
- Give transparency into the ongoing performance of models to data scientists
- Provide a clear way of communicating to data scientists and ML engineers that a model has started to drift
- Make it easy to deploy a new version of a model and undeploy an outdated version without interrupting the business
- Allow data scientists to test various models against each other to see which performs best in live conditions
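A common way to give the second of these signals, that a model has started to drift, is to compare the distribution of recent production inputs or predictions against a baseline window. The sketch below uses the population stability index; the binning and the 0.2 alert threshold are common rules of thumb, not Wallaroo specifics:

```python
import numpy as np

def psi(baseline, recent, bins=10):
    """Population Stability Index between two samples of a numeric feature."""
    # Bin edges come from the baseline distribution.
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    recent_pct = np.histogram(recent, bins=edges)[0] / len(recent)
    # Floor each bucket at a tiny probability to avoid log(0).
    base_pct = np.clip(base_pct, 1e-6, None)
    recent_pct = np.clip(recent_pct, 1e-6, None)
    return float(np.sum((recent_pct - base_pct) * np.log(recent_pct / base_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 5000)   # training-time predictions
stable   = rng.normal(0.0, 1.0, 5000)   # production window, same distribution
shifted  = rng.normal(0.8, 1.0, 5000)   # production window after the world changed

drift_detected = psi(baseline, shifted) > 0.2   # rule of thumb: PSI > 0.2 flags drift
```

A monitoring job running this check on a schedule can then notify both the data scientist and the ML engineer when the threshold is crossed, rather than either of them discovering degradation after the fact.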
This process of constant feedback and optimization requires ITOps to rethink access controls and permissions to allow for collaboration between different types of users for some facets of model operations, while still maintaining clear governance, security, and quality control over who can push models live into production.
Scope of Responsibilities for Main Model Operations Personas
To simplify, we see three main personas in model operations (Admin, Data Scientist, and ML Engineer). See Table 1 for a more in-depth breakdown of what is in scope and out of scope for each role, but overall:
- The Admin provides the infrastructure and tooling for platform users to use. Their main concern is managing resource usage, costs, and security while enabling business teams to get their jobs done.
- Data Scientists are focused on finding patterns in the data to build a predictive model that solves a specific business problem, such as: What is the best route for my delivery to take? How can I automate quality control at each step of the manufacturing process? Who are my most valuable customers?
- ML Engineers take the models built by data scientists (usually in the form of a notebook) and get them live into production for use by the business.
Wallaroo’s Workspaces & Role-Based Permissions
When we talk to enterprises, a plurality are building some form of in-house deployment solution on an ad hoc basis. For platform administrators this makes it incredibly difficult to understand who is deploying what, and where. They will often respond by limiting all access to the production environment to just a few individuals, but this comes at the cost of other users, like data scientists, losing visibility into the ongoing performance of those models.
What they need is the ability not just to easily add or remove users, but also to provide different role-based types of access: for example, allowing data scientists to monitor the output of models without necessarily giving them the ability to deploy models live. And even within that broad rubric, project owners need the ability to add or remove users from a project. That is, allow data scientists to view only the models they are assigned rather than giving them broad access to all live models, which has security and compliance implications.
We’ve built Wallaroo on a workspace-based paradigm to make it easy for administrators to give the right kind of access to the different user types. You still have your overall Platform Administrator who sets up the infrastructure, manages resources, manages which applications can be installed in the enterprise data ecosystem (whether on-prem or on the cloud), and manages adding or taking away platform users. The Admin owns platform and workspace permissions — specifically:
- Install/uninstall Wallaroo in the admin console
- User management (create/update/delete/reset users)
- Manage user entitlements
- Workspace management and administration (Create/Update/Delete workspaces)
- Can see all workspaces/models/pipelines
- Has full read/write permissions on models and pipelines in all workspaces
The Admin then grants different access and enablements to Platform Users, who cannot install/uninstall the platform or manage users (create/update/delete/reset/activate/deactivate users and entitlements) and who have access only to their assigned workspaces.
The type of role they have within their workspace is defined by whether they are a Workspace Owner or a Workspace Collaborator. The main difference between a Workspace Owner and a Collaborator is the Owner can add/remove/promote/demote collaborators. So for example, if an ML Engineer is a workspace owner, they can enable permissions for an external data scientist to do shadow deployments to test a model against a champion, but not allow the external data scientist to turn the model fully live in production.
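The shadow-deployment pattern mentioned above can be sketched independently of any particular platform: the champion model serves the response, while the challenger scores the same input in parallel and both outputs are logged for later comparison. The model objects and pipeline class below are hypothetical stand-ins for illustration, not the Wallaroo SDK:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class ShadowPipeline:
    champion: Callable[[Any], Any]
    challenger: Callable[[Any], Any]
    log: list = field(default_factory=list)

    def infer(self, features):
        live = self.champion(features)       # answer returned to the business
        shadow = self.challenger(features)   # scored silently, never served
        self.log.append({"input": features,
                         "champion": live,
                         "challenger": shadow})
        return live

# Hypothetical models: the current champion and a candidate replacement.
champion_model = lambda x: round(0.50 * x, 2)
challenger_model = lambda x: round(0.55 * x, 2)

pipeline = ShadowPipeline(champion_model, challenger_model)
result = pipeline.infer(10)   # the business sees only the champion's output
```

Because the challenger never serves live traffic, a data scientist can be granted permission to attach one without being granted permission to promote it, which is exactly the Owner/Collaborator split described above.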
A Workspace Owner has the following permissions in their Wallaroo workspace:
- Collaborator management (add/remove/promote/demote collaborators)
- Can see only workspaces assigned to them
- Can configure specific workspace connections or integrations (data connectors for data store access within a workspace or compute resource allocations for a workspace)
- Has full read/write permissions on models and pipelines in their workspaces
A Workspace Collaborator has the following permissions in their Wallaroo workspace:
- Can see other collaborators
- Cannot manage (add/remove/promote/demote) collaborators
- Cannot create other workspaces
- Can see only workspaces assigned to them
- Can manage specific models in the workspace
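The permission split described above amounts to a small role-based access-control matrix. The sketch below models it as a lookup table with a check function; the role and action names are distilled from the lists above for illustration and are not the actual Wallaroo implementation:

```python
# Permission matrix distilled from the role descriptions (illustrative only).
PERMISSIONS = {
    "admin": {
        "install_platform", "manage_users", "manage_entitlements",
        "manage_workspaces", "view_all_workspaces", "read_write_all",
    },
    "workspace_owner": {
        "manage_collaborators", "configure_connections",
        "read_write_assigned", "view_assigned_workspaces",
    },
    "workspace_collaborator": {
        "view_collaborators", "view_assigned_workspaces",
        "manage_assigned_models",
    },
}

def can(role: str, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())
```

Under this matrix, an Owner can manage collaborators while a Collaborator cannot, and only the Admin can create or delete workspaces, mirroring the bullet lists above.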
The New Class of Model Ops Stakeholders
As AI/ML has become more embedded in strategy and operations, the C-suite is looking to stay updated on KPIs. As a result, business analysts using business intelligence tools like Tableau and PowerBI are being tasked with measuring the impact of ML on the business and providing consumable reports to leadership. They are not building models (like data scientists) or managing model operations (like ML engineers). What we have instead is a new collaboration type, with business analysts working with data scientists to interpret model inference results and compare them against ground truth or extrapolate business KPIs. They may need to run datasets or subsets against a deployed model to gather early insights or test the model's viability against a business KPI.
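Running a labeled dataset through a deployed model and comparing predictions against ground truth can be as simple as a joined accuracy report. In the sketch below, the churn model and the sample rows are hypothetical stand-ins for whatever serving interface and data the analyst has access to:

```python
def score_against_ground_truth(predict, rows):
    """Compare a model's predictions on labeled rows and report simple KPIs.

    predict: callable taking a feature dict and returning a label
    rows: list of (features, true_label) pairs
    """
    hits = 0
    mismatches = []
    for features, truth in rows:
        pred = predict(features)
        if pred == truth:
            hits += 1
        else:
            mismatches.append((features, pred, truth))
    return {"accuracy": hits / len(rows), "mismatches": mismatches}

# Hypothetical churn model and a tiny labeled sample.
churn_model = lambda f: "churn" if f["days_inactive"] > 30 else "stay"
sample = [
    ({"days_inactive": 45}, "churn"),
    ({"days_inactive": 10}, "stay"),
    ({"days_inactive": 40}, "stay"),   # the model gets this one wrong
    ({"days_inactive": 5},  "stay"),
]
report = score_against_ground_truth(churn_model, sample)
```

The resulting accuracy figure and mismatch list are exactly the kind of consumable output a business analyst would surface in a BI dashboard for leadership.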
It is likely that more citizen data scientists and business users will become part of ongoing model operations. For Platform Admins, this means looking at scalable ways to manage user permissions that provide both the flexibility the business needs and the transparency and controls required by Security and Compliance. Our Workspaces paradigm goes beyond the ability to share a notebook and enables the right kind of collaboration while still being secure.