Category: machine learning
-
Excited to join Metaflow and Outerbounds
I’ve devoted a large part of my career so far to helping scientists do better science, and I want to double down on that. This focus grew out of my work in basic scientific research at the intersection of biology, physics, and mathematical modeling. At both the Max Planck Institute (MPI) for Cell Biology in Dresden, Germany (2011-2013), and Yale University (2013-2016), I worked in biology labs with experimentalists who were generating increasingly large and complex datasets.
-
Data Scientists Don’t Need to Know Kubernetes with Metaflow
With Metaflow, data scientists can leverage K8s clusters for their work without having to know K8s. Metaflow gives data scientists a user-friendly interface while working nicely with production infrastructure, which engineers can appreciate. Read on to discover more about “Metaflow as the data scientist’s interface to Kubernetes” and to give the beta version a spin.
-
Notebooks In Production With Metaflow
Learn how to use notebooks in production ML pipelines with a new Metaflow feature. The feature is framework-agnostic, so it works with all types of ML models, whether they are built with scikit-learn, TensorFlow, PyTorch, or XGBoost.
-
How to Evaluate MLOps Tools (Stanford Talk Highlights)
I recently had fun giving this talk in Chip Huyen’s Machine Learning Systems class at Stanford about a subject we don’t talk about enough: (1) how to evaluate ML tooling and (2) how to spot and deal with 🔥 Tool Zealots 🔥
-
Integrating Pythonic visual reports into ML pipelines
We are excited to introduce DAG cards for machine learning pipelines! These cards make it extremely easy to attach custom visual reports to every workflow, without having to install any additional tooling or infrastructure. The feature, which was developed together with Metaflow users at Coveo, is motivated by the ubiquitous needs of modern, data-centric ML workflows.
-
MLOps vs DevOps: the Difference Data Makes
We recently wrote an essay for O’Reilly Radar about the machine learning deployment stack and why data makes software different. The questions we sought to answer were: Why does ML need special treatment in the first place? Can’t we just fold it into existing DevOps best practices? What does a modern technology stack for streamlined ML processes look like? How can you start applying the stack in practice today?
-
Machine learning pipelines: from prototype to production
As the industry works to develop shared foundations, standards, and a software stack for building and deploying production-grade machine learning software and applications, we are witnessing a growing gap between the data scientists who create machine learning models and the DevOps tools and processes used to put those models into production. This often results in data scientists building machine learning pipelines that aren’t ready to be deployed or that are challenging to experiment and iterate with. It can also result in models that are neither properly validated nor isolated from relevant business logic.