Here are key technical details about the Metaflow deployment on AWS Kubernetes.
AWS Services List
The terraform template will deploy these services in your AWS account:
- Amazon S3 - A dedicated private bucket to serve as a centralized storage backend.
- AWS Fargate and Amazon Relational Database Service - A Metadata service running on AWS Fargate with a PostgresSQL DB on Amazon Relational Database Service to log flow execution metadata
- AWS EKS cluster - an AWS-managed Kubernetes cluster to execute Metaflow tasks
In addition to this, you'll want to install Argo Workflows for scheduling of production runs. The quickstart Kubernetes manifest published by Argo Workflows spins up the following services inside the EKS cluster:
kubectl get services -n argo
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argo-server ClusterIP 10.0.26.126 <none> 2746/TCP 32m
httpbin ClusterIP 10.0.66.229 <none> 9100/TCP 32m
minio ClusterIP 10.0.173.242 <none> 9000/TCP,9001/TCP 32m
postgres ClusterIP 10.0.51.199 <none> 5432/TCP 32m
workflow-controller-metrics ClusterIP 10.0.139.237 <none> 9090/TCP 32m