Access & process data flexibly
Access data in data lakes and warehouses, quickly and securely. Orchestrate data engineering workflows and ML/AI on a unified platform.
Integrate with existing data systems
Access data from your existing data lakes, warehouses, and Amazon S3 securely. Build reactive ML and data applications that run automatically whenever new data is available.
Orchestrate ETL, ELT, and data transformations using tools like dbt, Pandas, Polars, and DuckDB, depending on the needs of your project.
Lightning-fast data access
Use Metaflow’s built-in, high-performance Amazon S3 client to build efficient data loaders and processing pipelines that reach a throughput of over 20 Gbps.
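As a rough illustration of what a data loader on top of that client might look like, here is a minimal sketch. It assumes Metaflow is installed and AWS credentials are configured. The helper names `build_s3_urls` and `load_many` are hypothetical and exist only for this example, as is any bucket you pass in. It shows the shape of a parallel fetch with `S3.get_many` and does not demonstrate any particular throughput.

```python
from typing import Dict, Iterable, List


def build_s3_urls(bucket: str, prefix: str, names: Iterable[str]) -> List[str]:
    """Join a bucket, a key prefix, and object names into full s3:// URLs.

    Hypothetical helper for this sketch; it is not part of Metaflow.
    """
    clean_prefix = prefix.strip("/")
    base = f"s3://{bucket}/{clean_prefix}" if clean_prefix else f"s3://{bucket}"
    return [f"{base}/{name.lstrip('/')}" for name in names]


def load_many(urls: List[str]) -> Dict[str, bytes]:
    """Download many objects with Metaflow's S3 client and return their bytes.

    Hypothetical helper for this sketch; requires Metaflow and AWS access.
    """
    # Imported lazily so the URL helper above works without Metaflow installed.
    from metaflow import S3

    with S3() as s3:
        # get_many fetches the listed objects in parallel. Read each blob
        # inside the context, because the temporary local copies are
        # cleaned up when the context exits.
        return {obj.url: obj.blob for obj in s3.get_many(urls)}
```

A call such as `load_many(build_s3_urls("example-bucket", "daily", ["part-0.parquet", "part-1.parquet"]))` would return a dictionary mapping each URL to its raw bytes, which can then be handed to Pandas, Polars, or DuckDB. The bucket and file names here are placeholders.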
Process small, medium, or big data in parallel, in-memory, using GPUs, or simply in Pandas, choosing the best approach for each task. No single paradigm has to fit every workload.
Trusted by the best
MetaFlow has a really fantastic API. https://docs.metaflow.org/metaflow/basics. Most DAG systems could learn from it.
wow I didn't realize how much of the infrastructure needs Metaflow takes care of!
In six weeks, a team that hadn’t used Metaflow before was able to build an ML-based model, A/B test its performance, which handily beat the old simple approach, and roll it out to production.
By unifying our data and ML pipelines we can now automatically train models when an underlying warehouse table is updated which drastically improves efficiency and collaboration.
I really appreciate the clean syntax and the seamless transitioning from dev to dev-at-scale to prod-at-scale
Metaflow records the result of every step to S3. IMO the ability of metaflow to expose individual components of a complicated process is a big deal. Like, a really big deal.
The MLOps piece of tech I'm most excited about is @MetaflowOSS. Just from reading the docs I'm amazed at how a large, enterprise company like Netflix seems to have really, really grasped the developer experience in the way I'd only expect 'smaller' projects to do.
@MetaflowOSS rules!
Metaflow - what a great tool!
Outerbounds was like Metaflow, but with a ‘plus plus’—it made everything easier.
I've been extremely impressed by the Metaflow team. They have built something that's loved by everyone, so much so that they have Metaflow sad hours to encourage people say what can be improved. In my conversations with our users, I've often heard that Metaflow is one of the best software Netflix has ever built.
Metaflow has really been helping in speeding up our ds workflows
I find it *extremely* satisfying that the tutorials for Metaflow (which was originally developed at Netflix) are organised in seasons and episodes 🤓
Bumped into Metaflow at work, plus @HamelHusain seems excited about it, so I am excited about it 🙂 But after watching the video, wow 😲
Metaflow has made my daily life as an ML engineer so much easier, especially as we scale our efforts. I've never worked with a framework that works out of the box as well as metaflow.
❤️ metaflow! The people-first API design is fantastic.
We are saving over 90% of the @awscloud cost in our #artificialintelligence model training phase. We've just finished the automation of our AI pipelines through the implementation of @MetaflowOSS , and the migration from http://H2O.ai
Metaflow has helped us avoid the anti-pattern of needing to push code to find out if something works
I haven’t yet heard a bad thing about metaflow.
Thanks for Metaflow: it's a pleasure to use!
Been using metaflow for a while now, and love it. At its best, it hugely speeds up our development cycles
"As an informal estimate, our #datascience team believes they were able to test twice as many models in Q1 2021 as they did in all of 2020, with simple experiments that would have taken a week now taking half a day."
Metaflow is such an amazing tool!
Love metaflow!
I've been using Metaflow for years now. I can't endorse it nor the people behind it enough.💥Parallelize hyperparameters across containers trivially 💥Persist, retrieve, reproduce results and data artifacts from forever ago 💥Stand up an API that serves resulting data
This diagram is pure perfection h/t @moritzplassnig How much data scientist cares vs how much infrastructure is needed 🤣
Built with the core idea of allowing ML engineers and data scientists to use Python as native way to work, this friendly approach has tremendously boosted Carsales productivity in ML
Metaflow was recommended at work. I spent some time with it today... it's amazing. Their tagline is on-point: "Build and manage real-life data science projects with ease."
Have been using Airflow in prod for several years now & looking forward to making a switch to Metaflow
Very impressive work! Easily mix multiple Python virtual environments and compute resources within a single pipeline 🤯 using #Metaflow
Our complex, multi-stage workflows are codified and orchestrated using Metaflow.
There's a lot of good ideas made practical in Metaflow to help ML projects and developers focus on what's important.
Very impressive level of polish in the Metaflow docs/tutorials. Pretty incredible how many details have been abstracted away by Metaflow.
The team has shaved months off the time it takes to build a productionized machine learning model, and at the same time improved the overall collaboration of the data science organization with other departments
We've had a really great experience transitioning from Argo workflows to Metaflow
we’ve been able to entirely switch over our research experimentation process to rely on Metaflow with great success.
What happened to Kubeflow? So many companies are migrating away from it because it's a pain to configure and maintain. I find it hard to recommend Kubeflow anymore. Right now, Metaflow seems to be a much nicer option.
A space I'm excited about: 'open source on top of public cloud tech.' Instead of using open source libraries (which you have to run & scale, e.g. RabbitMQ), it uses infinitely-scalable off-the-shelf cloud services (e.g. SQS). Metaflow is a good example. https://docs.metaflow.org/metaflow-on-aws/metaflow-on-aws
Maybe we'll be talking about this as the "Airflow for Data Science" in a few years. Not replacing Airflow, but in the sense that it's the default standard for orchestrating data science projects like Airflow does for data engineering pipelines.
Outerbounds Platform has proven to be transformative for us. The platform has allowed us to execute and train models without the burden of worrying about infrastructure
One of the things I liked about Metaflow from Netflix is their clear documentation on "Dealing with Failures": it's important to raise awareness about "retry" logic, transient/permanent failures, idempotent operations, etc.
Thanks for a fantastic product ❤️
i love metaflow.
really love the event triggering feature in Metaflow
The end-to-end ML management in Outerbounds—event triggers, experiment tracking, managed artifacts—was a game-changer for our small team.