Addressing the Unique Requirements of MLOps for Healthcare & Life Sciences

May 31, 2022

Healthcare & Life Sciences (HLS) is a broad industry category encompassing enterprises with very different business models, such as healthcare providers (hospitals), benefit payers (insurers), pharmaceutical companies, biotech firms, and medical device manufacturers. Because of the huge volume of data generated from patients, experiments, devices, and even social media, as well as the industry's large share of US GDP, HLS has been one of the largest areas of AI investment over the last ten years.

Precisely because HLS deals with people’s health and wellbeing, the bar for operationalizing machine learning is much higher: models need more validation and more caution before they reach production. And while a biotech firm might have a different business model than a national hospital chain, we have found three common requirements that make AI unique across all of HLS:

Regulatory compliance: Safety and privacy regulations mean data science teams can’t introduce just any tool for analyzing confidential patient data. MLOps tools need to meet all applicable requirements (such as HIPAA), no matter the data environment.
Explainability and experimentation: There are some areas where black-box ML approaches can work, but for the most part research and regulatory approval require establishing causality through continuous and concurrent experimentation. Data scientists and researchers need to quickly identify when and why their back-tested models stop matching field conditions (for example, chest X-rays of very sick patients taken lying down threw off early diagnostic models for COVID).
Efficiently analyzing massive and unstructured data sets: As an example, the data for a single human genome sequence can take up 200 gigabytes. Additionally, much of HLS data is unstructured, like clinical notes, digital pathology slides, or X-ray images. Clinical data abstraction can benefit from complex natural language processing (NLP) models deployed on large clinical data pipelines, and imaging data can benefit from computer vision (CV) models for AI-assisted classification and segmentation. But these are computationally intensive models that can be expensive to run in production.

Where you can move faster with ML in HLS

When it comes to AI in healthcare, anything that touches patient diagnosis, therapies, or outcomes, where AI is considered a medical device, will generally face much higher regulatory hurdles. This is particularly true when biases in the training data can lead to less accurate predictions or recommendations for minorities. But AI can be used in many research use cases to help scientists get to insights faster, which drives efficiency in drug development, clinical trial design, and research studies. For example, Pfizer used ML to quickly clean post-trial test data, cutting a highly manual process that normally takes more than 30 days to less than 22 hours. This contributed to their record-breaking vaccine development.

Within each HLS sub-vertical, we see use cases where ML adoption is accelerating. For example:

For providers:

Clinical data abstraction
Diagnostics for patient reports
MRI scan segmentation with AI

For pharma/biotech:

Clinical trial matching
Biomarker discovery
Automated data cleaning

For medical device manufacturers:

Anomaly detection
Device failure prediction

How Wallaroo can help

We designed Wallaroo specifically to operationalize ML for the most demanding use cases and environments, which makes it particularly well suited for HLS:

When it comes to regulatory compliance, Wallaroo runs in your own health data infrastructure (the hardware and software used to securely aggregate, store, process, and transmit healthcare data). We do not take possession of your data, so no additional security vulnerabilities are introduced to your environment. In addition, the platform keeps full audit logs, so inferences can be traced back to specific inputs and specific models.
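To make the idea of a traceable inference concrete, here is a minimal Python sketch of an audit record that ties a prediction to a model version and a hash of its inputs. It is an illustration only and not Wallaroo's API: the `AuditLog` class, the `fingerprint` helper, the model name, and the feature values are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(payload):
    """Return a stable SHA-256 hex digest of a JSON-serializable payload."""
    # Sorting keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditLog:
    """Hypothetical append-only, in-memory log; a real system would persist it."""

    def __init__(self):
        self._records = []

    def record(self, model_name, model_version, inputs, output):
        """Store one inference together with the model and input fingerprint."""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model_name": model_name,
            "model_version": model_version,
            "input_hash": fingerprint(inputs),
            "output": output,
        }
        self._records.append(entry)
        return entry

    def find_by_input(self, inputs):
        """Return every logged inference made on exactly these inputs."""
        target = fingerprint(inputs)
        return [r for r in self._records if r["input_hash"] == target]


log = AuditLog()
features = {"age": 61, "systolic_bp": 148}  # made-up values, not patient data
log.record("readmission-risk", "1.2.0", features, {"risk": 0.31})

# The same inputs in a different key order resolve to the same record.
matches = log.find_by_input({"systolic_bp": 148, "age": 61})
```

Storing a hash rather than the raw inputs is one possible design choice: it lets a reviewer confirm which inputs produced a prediction without copying confidential data into the log itself.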

For explainability and experimentation, Wallaroo provides real-time drift monitoring and explainability for complex models with multiple clinical and genomic features. Our model explainability and troubleshooting reports run natively in the Wallaroo platform as well as in the third-party reporting tool of your choice. They also report feature effects, showing which features are contributing to a specific model prediction or to a group of predictions over a period of time. This is critical when running models on or as medical devices, which requires FDA approval. In addition, Wallaroo’s experimentation pipelines make it easy to compare the performance of multiple models on real-life data.
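As a rough illustration of what a drift check measures, the sketch below computes a population stability index (PSI) between a baseline sample of one feature and a live sample. This is a generic technique chosen for the example, not a description of how Wallaroo computes drift; the `psi` function, the bin count, and the sample data are assumptions.

```python
import math


def psi(baseline, live, bins=10, eps=1e-6):
    """Population stability index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    # Equal-width bins over the baseline range; fall back to 1.0 if it is flat.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the range
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(1000)]  # synthetic training-time values
shifted = [v + 3 for v in baseline]        # synthetic field values, shifted up

no_drift = psi(baseline, baseline)
drift = psi(baseline, shifted)
```

A PSI near zero means the live data looks like the baseline. A commonly cited rule of thumb treats values above roughly 0.25 as a significant shift, but the right threshold depends on the feature and the use case.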

And finally, the Wallaroo platform is built around a high-performance, scalable Rust inference engine that is specialized for fast, high-volume computational tasks, so it can efficiently analyze massive, unstructured data sets. Even complex NLP transformer models or computer vision models with millions or even billions of parameters can run on a standard CPU instead of a GPU.

If you are an HLS enterprise looking for a better way to apply machine learning in production, contact us to speak to a specialist.

Written by Wallaroo.AI