
Metaflow on Kubernetes

We support a Terraform module that automates the deployment of infrastructure to scale your Metaflow flows beyond your laptop. Here you can find an example of using this module to deploy Metaflow and Argo Workflows in a Kubernetes cluster managed by Amazon EKS.
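
Once the infrastructure is deployed, moving a flow from your laptop to the cluster is mostly a matter of annotation. Below is a minimal sketch of a flow that asks Metaflow to run one step on Kubernetes via the @kubernetes decorator; the flow name, file name, and resource requests are illustrative and not part of the deployment example.

    from metaflow import FlowSpec, kubernetes, step

    class HelloK8sFlow(FlowSpec):
        """A toy flow; names and resource requests are placeholders."""

        @step
        def start(self):
            # Runs wherever the run is launched from (e.g. your laptop).
            self.message = "hello from the laptop"
            self.next(self.on_cluster)

        @kubernetes(cpu=1, memory=4096)  # request 1 CPU and 4 GB of memory on the cluster
        @step
        def on_cluster(self):
            # Runs as a pod on the Kubernetes cluster provisioned by the Terraform module.
            print(self.message, "-> now running on Kubernetes")
            self.next(self.end)

        @step
        def end(self):
            pass

    if __name__ == "__main__":
        HelloK8sFlow()

With the deployment in place, 'python hello_k8s.py run' executes the flow and ships the decorated step to the cluster, 'python hello_k8s.py run --with kubernetes' pushes every step there, and 'python hello_k8s.py argo-workflows create' deploys the flow to Argo Workflows for scheduling.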

The example includes configuration for the following components (a sketch of how Metaflow is pointed at them follows the list):

  • Amazon S3 - A dedicated private bucket to serve as a centralized storage backend.
  • Kubernetes - A dedicated Kubernetes cluster to extend Metaflow's compute capabilities to the cloud.
  • Argo Workflows - To allow scheduling Metaflow flows on Argo Workflows within the Kubernetes cluster.
  • AWS Fargate and Amazon Relational Database Service - The Metadata service running on AWS Fargate, backed by a PostgreSQL database on Amazon Relational Database Service to log flow execution metadata, as well as an optional UI backend and front-end app.
  • Amazon API Gateway - A dedicated TLS termination point with optional key-based API authentication, providing secure, encrypted access to the Metadata service.
  • AWS Identity and Access Management - Dedicated roles that follow the principle of least privilege when accessing resources such as AWS Batch.
  • AWS Lambda - An AWS Lambda function that automates database migrations for the Metadata service.
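
Once these resources exist, a local Metaflow installation has to be pointed at them. Normally you would put the values emitted by the Terraform module into a configuration file under ~/.metaflowconfig as described in the README; the sketch below sets the equivalent standard METAFLOW_* environment variables programmatically instead, purely for illustration. Every concrete value is a placeholder, as the real bucket name, service URL, and API key come from the module's outputs.

    import os

    # Placeholder values -- substitute the outputs of the Terraform module.
    os.environ["METAFLOW_DEFAULT_DATASTORE"] = "s3"
    os.environ["METAFLOW_DATASTORE_SYSROOT_S3"] = "s3://<metaflow-bucket>/metaflow"
    os.environ["METAFLOW_DEFAULT_METADATA"] = "service"
    # The Metadata service is reached through the API Gateway endpoint and
    # authenticated with the (optional) API key.
    os.environ["METAFLOW_SERVICE_URL"] = "https://<gateway-id>.execute-api.<region>.amazonaws.com/api"
    os.environ["METAFLOW_SERVICE_AUTH_KEY"] = "<api-key>"
    os.environ["METAFLOW_KUBERNETES_NAMESPACE"] = "default"

    # Import after the variables are set so Metaflow picks them up, then list
    # known flows as a quick connectivity check against the Metadata service.
    from metaflow import Metaflow
    print(Metaflow().flows)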

You can see the details in the README in the repository.

For help or feedback, please join Metaflow Slack. To suggest an article, open an issue on GitHub.
