By Valay Dave, Ville Tuulos, Ciro Greco, and Jacopo Tagliabue
tl;dr In this post, we are excited to introduce DAG cards for machine learning pipelines! These cards make it extremely easy to attach custom visual reports to every workflow, without having to install any additional tooling or infrastructure. The feature, which is developed together with Metaflow users at Coveo, is motivated by the ubiquitous needs of modern, data-centric ML workflows. If you’d like to get started right away, you can go straight to the documentation. Or watch the video below for a quick tour (without sound)!
Human-readable, visual reports for every ML pipeline
Model documentation needs to be about the entire ML pipeline.
Data science and machine learning practitioners require better, more ergonomic tools to visualize and interrogate their workflows. Just focusing on model monitoring is not enough, as modern ML workflows involve a heterogeneous set of functionalities: To put models into production, you need to pay a great deal of attention to what comes before training (such as data collection, preparation, and labeling) and to what comes after (for example, testing, serving, monitoring, and data drift). Put simply, machine learning consists of building, testing, deploying, and maintaining models, each step of which requires multiple substeps.
This has become clear in a major shift we believe is particularly important: The switch of focus from modeling to data, often referred to as Data-Centric AI. Even though some prominent scholars have been talking about it for some time now, Data-Centric AI rose to the attention of the whole community only quite recently. In late 2021, NeurIPS hosted the first workshop on Data-Centric AI, where we were fortunate enough to be presenting our work on DAG cards for machine learning workflows.
As a concrete example, consider this recommendation pipeline from Chia et al., in which the end-to-end pipeline is a series of tasks with explicit dependencies. Some tasks must be executed before others, some can run in parallel (red box); some require GPUs (blue), some are light computations; some may require restarting others when they fail, some may not – and so on.
We take this last point very seriously and we believe that documentation should incorporate those aspects as well: model documentation needs to be integrated into the entire ML pipeline. And since the ML pipeline is, in practice, a Directed Acyclic Graph (DAG) we are excited to introduce DAG Cards for Metaflow, an open-source framework we started at Netflix and continue developing at Outerbounds. For the past four years, we have been systematically developing Metaflow together with data scientists at a diverse set of companies, adding features like cards that improve their productivity and the quality of projects delivered.
This idea of DAG Cards was originally developed by Jacopo Tagliabue and the team at Coveo, a B2B company providing ML-powered search and recommendations for a wide range of businesses, spanning e-commerce, customer service, and the workplace. Here’s how Jacopo describes the origin story of the idea:
We have tons of models in production, but only some of them are completely self-explanatory. The models we have do depend quite a lot on the use case, because that changes the data flow and the user behavior. In the end, to understand what many of our models do, a certain degree of domain knowledge is required. Since our team finds Metaflow to be a flexible, ergonomic tool for ML pipelines, the natural thing to do was try building a DAG Card generator that lives on top of Metaflow’s existing classes.
We were initially inspired by the concept of Model Cards, introduced in 2019 by Margaret Mitchell and colleagues and developed further by Google Cloud. Model cards are short documents accompanying trained ML models that provide benchmarked evaluation and crucial information about the model behavior in different conditions. We loved the idea right away. The beauty of the proposal lies in its inclusiveness: Technical folks can use Model Cards to build better ML systems while non-technical people can use them to ponder the strengths and weaknesses of models and make responsible use of them. We realized how useful this could be for working data scientists and loved the idea of getting a human-readable report for every workflow. But how should the reports be produced in practice?
How to produce cards?
Finding an ergonomic and effortless way to produce static, visual reports, like DAG cards, that complements existing tools.
A reason why many real-world ML pipelines remain under-documented and under-monitored is that these features don’t come for free. In particular, when a data scientist wants to monitor business metrics and attributes that are specific to a use case, they need to spend a non-trivial amount of time finding a suitable reporting tool, designing and implementing a report, and connecting it to production workflows.
The last point in particular can be painful. While modern tools make it possible to build stunning dashboards, making the dashboard present the latest (not to mention historical) results from modeling workflows is not easy, as you have to bridge two separate tools. Still, we would like to be able to see a report for every model ever produced, across both prototyping and production.
After working on many data science projects that faced this problem in various forms, we developed the following high-level understanding of the tooling landscape of today:
There’s an inherent tradeoff between the versatility of a tool – how complex and advanced the reports you can build with it can be – and the effort required to get even the simplest report produced. To be clear, this assessment is made specifically from the point of view of producing reports in ML pipelines. For a different set of requirements, the ranking would look different (and we would encourage you to perform such an assessment for your own use case).
In the lower-left corner we have the most effortless solution: Just produce textual logs from each modeling task, which is straightforward. Storing and accessing logs separately for each model isn’t hard either. A challenge is that textual logs are not versatile – in particular, one can’t use them to produce visualizations of any kind, which are key for data science.
Notebooks are a popular choice for crafting visualizations. It doesn’t take much effort to open a notebook and produce a chart. However, producing a notebook automatically based on an existing parameterized template, while possible using a tool like Papermill, requires much more tooling and infrastructure, as well as a new mental model for using notebooks in a novel way. Notebooks certainly remain a key part of the Metaflow stack, as they are indispensable for ad-hoc exploration and quick prototyping, while cards come in handy when you know what aspects of the pipeline benefit from automated reports.
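To make the Papermill approach concrete, here is a minimal sketch of what notebook-based report generation looks like. The helper name, file names, and parameter are hypothetical; the sketch assumes a template notebook with a cell tagged `parameters`, which Papermill’s `execute_notebook` fills in before executing the notebook top to bottom:

```python
def render_report(template_nb, output_nb, run_id):
    """Execute a parameterized notebook template to produce a report.

    Hypothetical helper: `template_nb` must contain a cell tagged
    `parameters`; papermill injects `run_id` there and runs the notebook.
    """
    import papermill  # third-party: pip install papermill

    papermill.execute_notebook(
        template_nb,                      # e.g. "report_template.ipynb"
        output_nb,                        # e.g. "report_run_1234.ipynb"
        parameters={"run_id": run_id},    # values injected into the template
    )

# Usage (assuming the template notebook exists on disk):
# render_report("report_template.ipynb", "report_1234.ipynb", "1234")
```

Even with such a helper, you still need to author and maintain the template notebook, schedule the rendering, and host the outputs somewhere, which is the extra tooling burden discussed above.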
Tools like Streamlit and Plotly Dash make it delightfully easy to create interactive dashboards, but connecting them to workflows is non-trivial, and hosting them reliably in production environments requires effort. These tools may even be too powerful for this use case, as our cards don’t need to be interactive. Along the same lines, towards the upper-right corner, there are other sophisticated tools like Tableau and custom web applications, which require so much upfront investment that we have seen them being used mainly for a select few, advanced use cases. Our goal is to make the cost of reporting so low that we can expect every pipeline to include at least one basic card.
To make this vision a reality, we need an exceptionally simple solution that
- Allows rich, visual output,
- Can be used as effortlessly as a simple print() statement,
- Connects to ML workflows trivially easily, and
- Can be easily shared with non-technical stakeholders.
These requirements, fulfilled by Metaflow Cards, occupy a unique spot on the chart: They make it straightforward to produce static, visual reports, like DAG cards, in every pipeline. Since they focus on a specific set of use cases, they complement rather than compete with existing tools.
Getting started with Cards
The latest version of Metaflow comes with a built-in Default Card which you can attach to any existing flow – no changes in the code required! Just run your flow with the “--with card” option, like this:
python myflow.py run --with card
After the run has finished, you can view cards produced by the run by specifying a step name. For instance, to view the card produced by the start step, type:
python myflow.py card view start
This will open the card in your browser. The default card shows all artifacts produced by a task, metadata about it, and a visualization of the flow DAG. This information can come in handy during development, as you can quickly eyeball whether the results of a task look valid.
Following our original motivation, we want to make it trivially easy to produce rich visual output from any Metaflow task. For instance, this code snippet outputs a table:
from metaflow import card, current, step
from metaflow.cards import Table, Markdown

@card(type='blank')
@step
def start(self):
    current.card.append(Markdown("# Here's a table"))
    current.card.append(Table([['first', 'second'], [1, 2]]))
The card, which you can see with “card view start” as before, looks like this:
See the documentation for Card Components for more details and examples. If you have the Metaflow GUI installed, cards are automatically visible in the task view, like this card that uses Altair for visualization:
As shown above, you can use cards to embed custom visualizations and charts in the GUI. For instance, you can have a custom benchmarking card attached to the model training step which automatically ranks the performance of a new model against existing ones. Anyone participating in the project can see the results in real-time in the GUI.
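As an illustration of such a benchmarking card, here is a small sketch of the kind of helper you might write. The function name, metric, and model names are hypothetical; the resulting rows are in the list-of-lists format that `metaflow.cards.Table` accepts, so inside a `@card`-decorated step you could append them with `current.card.append(Table(rows))`:

```python
def benchmark_rows(results):
    """Rank models by a metric, best first.

    results: {model_name: accuracy} (hypothetical metric).
    Returns a header row plus one row per model, suitable for
    passing to metaflow.cards.Table inside a @card-decorated step.
    """
    ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
    return [["model", "accuracy"]] + [[name, f"{acc:.3f}"] for name, acc in ranked]

# Hypothetical results from a training step and two earlier runs:
rows = benchmark_rows({"new_model": 0.91, "baseline": 0.87, "prod_v2": 0.89})
# rows[1] is the best-performing model: ["new_model", "0.910"]
```

Because the card is regenerated on every run, the ranking stays current without any separate dashboard to maintain.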
Thanks for reading! We’d love to hear your feedback on this new feature, as well as learn about ideas for new card templates. The easiest way to do this is to join us and over 900 data scientists and engineers on the Metaflow Community Slack 👋