
Deploying to AWS with Kubernetes

This page shows how to deploy a complete Metaflow stack powered by Kubernetes on AWS. For more information about the deployment, see deployment details, advanced options and FAQ.

1. Preparation

Install Terraform

Terraform is a popular infrastructure-as-code tool for managing cloud resources. We have published a set of Terraform templates here for setting up Metaflow on AWS. Terraform must be installed on your system to use these templates.

  1. Install Terraform by following these instructions.
  2. Download the Metaflow on AWS Terraform templates:
git clone git@github.com:outerbounds/terraform-aws-metaflow.git

Install aws CLI

This is the official CLI tool ("aws") published by Amazon for working with AWS. Terraform uses it when applying our templates (e.g. for authenticating with AWS). Please install it by following these instructions.

Install kubectl

kubectl is a standard CLI tool for working with Kubernetes clusters. It will be used by Terraform when applying our templates (e.g. for deploying some services to your Amazon Elastic Kubernetes Service (EKS) cluster). Please install it by following these instructions.
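
Before moving on, it can help to confirm that all of the tools above are on your PATH. The following is a minimal sketch, not part of the official templates; the function name missing_tools is hypothetical, and git is included only because the clone step above needs it.

```shell
#!/usr/bin/env bash
# Print the name of every given command-line tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v exits non-zero when the tool cannot be found
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

missing=$(missing_tools terraform aws kubectl git)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line
  echo "Missing tools:" $missing
else
  echo "All required tools are installed."
fi
```

If anything is reported as missing, install it using the instructions linked above before continuing.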

2. Provision AWS Resources

See here for the exact set of resources to be provisioned.

Login to AWS

You must be logged in to AWS with an account that has sufficient permissions to provision the required resources. Use the AWS CLI (aws) to configure your credentials:

aws configure
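
To check that the credentials you configured actually work, one option is to ask AWS which identity the CLI is using. This is a sketch under the assumption that your account is allowed to call aws sts get-caller-identity (it normally is); the wrapper name check_aws_login is hypothetical. Note that it only confirms that you are authenticated, not that the account has the permissions needed to provision the resources.

```shell
#!/usr/bin/env bash
# Print the ARN of the identity the AWS CLI is authenticated as.
# Returns non-zero if credentials are missing or invalid.
check_aws_login() {
  local arn
  if arn=$(aws sts get-caller-identity --query Arn --output text 2>/dev/null); then
    echo "Logged in as: $arn"
  else
    echo "Not logged in; run 'aws configure' first." >&2
    return 1
  fi
}
```

Run check_aws_login after aws configure and before any Terraform command.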

Initialize your Terraform Workspace

First, choose which workflow orchestrator you want to use for Metaflow's production deployments:

  1. cd terraform-aws-metaflow/examples/eks_argo to use Argo Workflows or
  2. cd terraform-aws-metaflow/examples/eks_airflow to use Apache Airflow

If in doubt, choose eks_argo as it comes with fewer limitations. In particular, the eks_argo option comes with Metaflow's event triggering enabled automatically (more technical details here).

Next, in your chosen directory run

terraform init

Apply the Terraform Template to Provision AWS Infrastructure

In the same directory, run

terraform apply

A plan of action will be printed to the terminal. You should review it before accepting. See details for what to expect. This command typically takes ~20 minutes to execute.
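
If you would rather guarantee that what gets applied is exactly the plan you reviewed, Terraform can save the plan to a file and apply that file later. The sketch below is one possible way to wrap this; the function name review_and_apply and the file name tfplan are hypothetical choices, not something the templates require. Because terraform apply does not ask for confirmation when given a saved plan, the wrapper asks for it explicitly.

```shell
#!/usr/bin/env bash
# Save the plan to a file, print it for review, and apply exactly that
# plan only after explicit confirmation.
review_and_apply() {
  local plan_file="${1:-tfplan}"   # hypothetical default file name
  terraform plan -out="$plan_file" || return 1
  printf 'Apply the plan saved in %s? [y/N] ' "$plan_file"
  local answer
  read -r answer
  if [ "$answer" = "y" ]; then
    # Applying a saved plan runs without a further prompt
    terraform apply "$plan_file"
  else
    echo "Aborted; nothing was applied."
  fi
}
```

Call review_and_apply from the same example directory in which you ran terraform init.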

3. End User Setup Instructions

When the command above completes, make a note of the EKS cluster name (a short string that starts with mf-). Use the AWS CLI (aws) to generate the cluster configuration:

aws eks update-kubeconfig --name <cluster name>

Create the file ~/.metaflowconfig/config.json with the contents of config.json. If this file already exists, back it up by moving it aside first.
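
The backup-then-create step can be sketched as a small shell function. This is an illustration only: the function name install_metaflow_config and the .bak.<timestamp> suffix are hypothetical, and the path you pass in should be wherever your new config.json is located.

```shell
#!/usr/bin/env bash
# Install a new Metaflow config, moving any existing one aside first.
install_metaflow_config() {
  local new_config="$1"
  local target_dir="$HOME/.metaflowconfig"
  local target="$target_dir/config.json"

  mkdir -p "$target_dir"
  if [ -e "$target" ]; then
    # Timestamped suffix keeps backups from different runs distinct
    mv "$target" "$target.bak.$(date +%Y%m%d%H%M%S)"
  fi
  cp "$new_config" "$target"
}
```

For example, install_metaflow_config ./config.json copies the file into place and leaves any previous configuration next to it under a .bak name.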