Metaflow on Kubernetes

We support a Terraform module that automates the deployment of the infrastructure needed to scale your Metaflow flows beyond your laptop. Here you can find an example of using this module to deploy Metaflow with a Kubernetes cluster on Amazon EKS.
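
Once the stack is up and your Metaflow configuration points at it, individual steps can be pushed to the cluster with the @kubernetes decorator. The sketch below is illustrative only; the flow name, file name, and resource sizes are placeholders and not part of the Terraform example.

```python
# hello_k8s.py -- a minimal sketch; names and resource sizes are placeholders
from metaflow import FlowSpec, step, kubernetes


class HelloKubernetesFlow(FlowSpec):
    """Toy flow whose heavy step runs as a pod on the EKS cluster."""

    @step
    def start(self):
        # Runs wherever the flow is launched from (e.g. your laptop).
        self.numbers = list(range(5))
        self.next(self.square)

    @kubernetes(cpu=2, memory=4096)  # request 2 CPUs / 4 GB from the cluster
    @step
    def square(self):
        # Runs inside a Kubernetes pod provisioned on EKS.
        self.squares = [n * n for n in self.numbers]
        self.next(self.end)

    @step
    def end(self):
        print("squares:", self.squares)


if __name__ == "__main__":
    HelloKubernetesFlow()
```

Running `python hello_k8s.py run` executes the decorated step remotely, while `python hello_k8s.py run --with kubernetes` pushes every step to the cluster.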

The example includes configuration for:

  • Amazon S3 - A dedicated private bucket to serve as a centralized storage backend.
  • Kubernetes - A dedicated Kubernetes cluster to extend Metaflow's compute capabilities to the cloud.
  • Argo Workflows - To allow scheduling Metaflow flows on Argo Workflows within the Kubernetes cluster (a short scheduling sketch follows this list).
  • AWS Fargate and Amazon Relational Database Service - A Metadata service running on AWS Fargate, backed by a PostgreSQL database on Amazon Relational Database Service, to log flow execution metadata, as well as an optional UI backend and frontend app (a sketch of accessing the service from the Metaflow client follows this list).
  • Amazon API Gateway - A dedicated TLS termination point with optional key-based API authentication, providing secure, encrypted access to the Metadata service.
  • AWS Identity and Access Management - Dedicated roles that follow the principle of least privilege for access to resources such as AWS Batch.
  • AWS Lambda - An AWS Lambda function that automates database migrations for the Metadata service.
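
As a companion to the Argo Workflows item above, here is a hedged sketch of what scheduling looks like from the flow author's side; the flow name and schedule are made up for illustration.

```python
# nightly_flow.py -- illustrative only; the flow name and schedule are made up
from metaflow import FlowSpec, step, schedule, kubernetes


@schedule(daily=True)  # ask Argo Workflows to trigger this flow once a day
class NightlyFlow(FlowSpec):

    @kubernetes(cpu=1, memory=2048)
    @step
    def start(self):
        print("running on the cluster via Argo Workflows")
        self.next(self.end)

    @step
    def end(self):
        print("done")


if __name__ == "__main__":
    NightlyFlow()
```

Deploying it to the cluster's Argo Workflows installation is a single command, `python nightly_flow.py argo-workflows create`, after which Argo owns the schedule.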

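To make the Metadata service and API Gateway items above concrete, the sketch below points the Metaflow client at the deployed service and lists recorded runs. The URL and key are placeholders; in practice `metaflow configure` writes these values to ~/.metaflowconfig/config.json rather than setting them in code.

```python
# inspect_runs.py -- a sketch; the URL and key below are placeholders
import os

# These values normally come from `metaflow configure` / ~/.metaflowconfig;
# they are set inline here purely for illustration, and must be in place
# before metaflow is imported.
os.environ["METAFLOW_DEFAULT_METADATA"] = "service"
os.environ["METAFLOW_SERVICE_URL"] = "https://<api-gateway-id>.execute-api.<region>.amazonaws.com/api"
os.environ["METAFLOW_SERVICE_AUTH_KEY"] = "<api-key>"

from metaflow import Flow

# List executions recorded by the Metadata service, newest first.
for run in Flow("HelloKubernetesFlow"):
    print(run.id, run.created_at, "finished" if run.finished else "running")
```
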
You can see the details in the README in the repository.