Any organisation that deploys machine learning models to production knows it comes with its share of business and technical challenges. Most look to solve ‘some’ of those challenges by adopting a Machine Learning Platform, complemented by MLOps processes that increase maturity and governance in the team.
Organisations running multiple models in production and looking to adopt an ML platform will typically either build an end-to-end ML platform in-house (Uber, Airbnb, Facebook Learner, Google TFX etc.) or buy one. In this article I compare some of the ML platforms you can buy.
When do you need an end-to-end ML Platform?
You should always answer another question first: “What problems are you trying to solve?” If you can’t think of any existing problems, then there is no point in implementing new technology. However, if you are running into issues such as tracking which models and versions are in production, increasing governance around your ML experiments, sharing notebooks as your data science team grows, or proactively monitoring data drift and/or feature drift, then chances are you are going down, or about to go down, the path of implementing components of an ML platform.
To assist with this process I have performed a high-level comparison of what I’d like to think are the key components of an ML platform.
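To make "monitoring data drift" a little more concrete, here is a minimal sketch of one common drift check, the Population Stability Index (PSI), written with only the Python standard library. It is not how any of the platforms below implement monitoring; the function name, the bin count and the `eps` smoothing value are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare a feature's training-time values with its production values.

    Illustrative helper (hypothetical name). Bins are equal-width and
    derived from the `expected` sample; `eps` keeps empty bins from
    producing log(0).
    """
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 if every expected value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the expected range land in the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        total = len(values)
        return [max(c / total, eps) for c in counts]

    exp_p = proportions(expected)
    act_p = proportions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_p, act_p))


if __name__ == "__main__":
    training = [i / 100 for i in range(100)]          # spread over 0.00..0.99
    production = [0.5 + i / 200 for i in range(100)]  # squeezed into 0.5..0.995
    print(population_stability_index(training, training))
    print(population_stability_index(training, production))
```

A PSI of zero means the two samples fall into the bins in identical proportions, and larger values mean a larger shift. Thresholds somewhere between 0.1 and 0.25 are often quoted as rules of thumb for raising an alert, but the right cut-off depends on the feature and should be treated as something to tune.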
Comparison
The following comparison considers only out-of-the-box functionality, i.e. it does not count deploying an open-source solution or building in-house to close a gap in functionality.
Features | AWS | GCP | Azure | Databricks |
Data pipeline | Data Pipeline | Dataflow | Data Factory | Spark |
Feature Store | — | — | — | — |
Model Monitoring | Model Monitor | — | Azure Monitor | — |
Experiment Management | SageMaker Experiments | — | Azure Machine Learning SDK | MLflow Tracking |
Model versioning | Production Variants | Versions | Model registration | MLflow Model Registry |
A/B Testing | SageMaker | — | Controlled Rollout | — |
Model Serving | SageMaker | AI Platform | Azure Machine Learning | MLflow Model Serving |
AutoML | Autopilot | Cloud AutoML | Automated ML | — |
Notebooks | SageMaker Notebooks | AI Platform Notebooks | Microsoft Azure Notebooks | Notebooks |
Feature Store?
You might have noticed that not a single solution offers a feature store. Feature stores are a fairly new component in the landscape and I suspect more organisations will be adding one to their ML platforms in the near future. There are a few open-source solutions available, and even more research material on this amazing website. Just remember to focus on what problems you are aiming to solve before investing time and money in a non-problem.
“What problems are you trying to solve?”
Conclusion
To conclude this short article, I want to highlight some other products that are worth mentioning but were not included in this comparison: DataRobot, IBM Watson Studio, H2O.ai and Cloudera. I will endeavour to keep this article up to date in the coming months, as the machine learning landscape is changing rapidly. You don’t have to take my word for it, just check out all the MLflow announcements on the Databricks blog from the Spark + AI Summit this year!
If you want your product added to this comparison please reach out to me on my LinkedIn.