Now Available: Machine Learning with Metaflow on Azure

Authors

Ever since Netflix open-sourced Metaflow in collaboration with AWS in 2019, we have heard this question frequently: Is Metaflow going to support Microsoft Azure? Organizations using Azure would like to benefit from Metaflow’s open-source, user-friendly, productivity-boosting API that allows data scientists to develop and deploy machine learning applications autonomously. As of today, this is finally possible!

To get an idea of how Metaflow works on Azure, take a look at this quick walkthrough:

Why Metaflow on Microsoft Azure?

Imagine a data scientist who is tasked to build an application powered by data science or ML, say, to forecast demand or to use state-of-the-art computer vision to recognize defects in manufacturing. Consider the infrastructure components, with the foundational layers at the bottom and the application-specific ones at the top of the stack, which are required to build such an application:

  1. The data scientist needs to determine what data to use and they need to develop pipelines to access, load, and transform data.
  2. They need a way to train models and perform inference, which requires access to compute resources. In the case of computationally demanding models, they may require a cluster of hundreds or even thousands of instances.
  3. Data pipelines, training code, and all other business logic need to be architected as a production-ready application, which is often structured as a workflow. For the fastest speed of development and frictionless deployment, it would be optimal if the data scientist could develop and test the workflow autonomously, using a robust workflow orchestrator which supports both rapid local development as well as highly-available production deployments.
  4. All the work needs to be tracked, versioned, and made easily observable, allowing the project to be improved systematically over time, potentially by multiple data scientists.
  5. The application needs to be deployed and integrated into the surrounding systems to produce the desired real-world impact.
  6. Finally, the data scientist should be able to leverage their expertise and domain knowledge to the fullest, leveraging the best off-the-shelf libraries, as they develop models powering the application.

In the past, addressing the six layers has required a fully staffed team consisting of infrastructure engineers, data engineers, and data scientists, which has made the development of ML-powered applications slow and costly. This, in turn, has curtailed the appetite of many organizations to experiment with new ML ideas, which is counterproductive in the long term.

What is Metaflow on Azure?

Metaflow provides a radically simple, user-friendly Python API that covers the full stack of data science and ML infrastructure outlined above. On the backend side, it relies on a stack of battle-hardened, enterprise-ready components for data, compute, and production orchestration, provided by the Azure and Kubernetes ecosystems. The figure below illustrates the new stack on Azure in the context of a simple Metaflow flow:

Note how the Metaflow code itself doesn’t have any hard-coded references to Azure or any other cloud. The integration with Azure, as well as any security and other policies specific to your company, lives as configuration separate from code. For more details on how this works in practice, see Metaflow resources for engineering. Or, if you don’t want to manage this infrastructure by yourself, reach out to us to learn more about our managed offering, deployed in your account.

Thanks to this architecture, Metaflow also supports bursting to the public cloud from on-prem Kubernetes-based hybrid cloud deployments, as well as migration between clouds in a straightforward manner, without requiring tedious rewrites of the business logic.

User-friendly and production-ready

Knowing that companies use Metaflow to power their business-critical ML and data science applications, we spent over a year developing and testing the Metaflow stack on Azure. We have ensured that the Azure integration is at parity with the corresponding stack on AWS, which has been used to power thousands of applications across hundreds of companies, such as 23andMe, CNN, and Realtor.com.

We are thankful to early adopters who have helped us test the stack on Azure, including Softlandia, a group of consultants with a decade of experience with Azure-based ML. Today, we are confident that Metaflow on Azure provides one of the most user-friendly data science and ML experiences on Azure, both for data scientists as well as engineers who operate the infrastructure.

Getting started with Metaflow on Azure

If you are a data scientist or engineer, check out the documentation for open-source Metaflow, which we have updated recently to reflect the multi-cloud nature of Metaflow. You may also want to take a look at a new book covering Metaflow, Effective Data Science Infrastructure, the lessons of which can be readily applied on Azure.

If you want to get a taste of Metaflow before you install anything locally, you can do so by signing up for a Metaflow Sandbox that allows you to test the full stack in the browser. No matter which path you choose, make sure to join the Metaflow support Slack for feedback, support, and to chat with 1500+ like-minded engineers and data scientists. We’d love to hear from you!