Access & process data flexibly

Securely access data in your existing data lakes, data warehouses, and Amazon S3. Build reactive ML and data applications that run automatically whenever new data is available.

Orchestrate ETL, ELT, and data transformations using tools like dbt, Pandas, Polars, and DuckDB, depending on the needs of your project.

Lightning-fast data access

Use Metaflow’s built-in, high-performance Amazon S3 client to develop highly efficient data loaders and processing pipelines that reach a throughput of over 20 Gbps.
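As a rough illustration of what such a data loader might look like, the sketch below downloads a set of files in parallel with the `metaflow.S3` client and its `get_many` call. The bucket name, the `part-00000.parquet` shard naming, and the helper functions `shard_urls` and `load_shards` are assumptions made up for this example; they are not part of Metaflow or Outerbounds. Actual throughput depends on instance type, object sizes, and network setup.

```python
from typing import Dict, List


def shard_urls(prefix: str, num_shards: int, suffix: str = ".parquet") -> List[str]:
    """Build S3 URLs for a dataset stored as numbered shards.

    The part-NNNNN naming is a hypothetical layout chosen for this sketch.
    """
    base = prefix.rstrip("/")
    return [f"{base}/part-{i:05d}{suffix}" for i in range(num_shards)]


def load_shards(urls: List[str]) -> Dict[str, bytes]:
    """Download all URLs in parallel and return their contents keyed by URL."""
    # Imported lazily so that shard_urls can be used without Metaflow installed.
    from metaflow import S3

    with S3() as s3:
        # get_many downloads the objects in parallel and returns one object per URL.
        # Downloaded files are temporary, so read them before the context closes.
        return {obj.url: obj.blob for obj in s3.get_many(urls)}
```

Calling `load_shards(shard_urls("s3://example-bucket/events", 8))` inside a flow step would fetch eight shards in a single parallel request, assuming the bucket exists and the task has credentials to read it. From there the bytes can be handed to Pandas, Polars, or DuckDB, whichever suits the task.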

Process small, medium, or big data in parallel, in memory, on GPUs, or simply in Pandas, choosing the best approach for each task. One paradigm doesn’t need to fit every workload.

Unify data and ML pipelines to improve efficiency and collaboration

Outerbounds empowers ML scientists to quickly scale our model training and less technical users to easily solve “mundane” analytical tasks in a clean and repeatable manner. By unifying our data and ML pipelines we can now automatically train models when an underlying warehouse table is updated as opposed to relying on manually scheduled independent pipelines, which drastically improves efficiency and collaboration.

Marc Millstone, Principal Engineering Manager, Convoy

See Success Stories

Learn more about data access

Loading data in Metaflow

Triggering flows based on updating data

Patterns for fast data

Test data access in the Metaflow sandbox

Try in a Sandbox