Metaflow ships with an AWS CloudFormation template that automates the deployment of all the AWS resources needed to enable cloud-scaling in Metaflow. The Metaflow Sandbox uses a version of this template to automatically vend out sandboxes for evaluating Metaflow.
If you are looking for assistance with deploying Metaflow resources on Kubernetes instead, please get in touch with us.
The major components of the template are:
- Amazon S3 - A dedicated private bucket and all appropriate permissions to serve as a centralized storage backend.
- AWS Batch - A dedicated AWS Batch Compute Environment and Job Queue to extend Metaflow's compute capabilities to the cloud.
- Amazon CloudWatch - Configuration to store and manage AWS Batch job execution logs.
- AWS Step Functions - A dedicated role to allow scheduling Metaflow flows on AWS Step Functions.
- Amazon EventBridge - A dedicated role to allow time-based triggers for Metaflow flows configured on AWS Step Functions.
- Amazon DynamoDB - A dedicated Amazon DynamoDB table for tracking certain step executions on AWS Step Functions.
- Amazon SageMaker - An Amazon SageMaker Notebook instance for interfacing with Metaflow flows.
- AWS Fargate and Amazon Relational Database Service - A Metadata service running on AWS Fargate with a PostgreSQL database on Amazon Relational Database Service to log flow execution metadata, as well as an (optional) UI backend and front-end app.
- Amazon Cognito - Authentication for the UI (if using the UI).
- Amazon API Gateway - A dedicated TLS termination point and an optional point of basic API authentication via key to provide secure, encrypted access to the Metadata service.
- Amazon VPC Networking - A VPC with two customizable subnets and Internet connectivity.
- AWS Identity and Access Management - Dedicated roles that follow the principle of least privilege for access to resources such as AWS Batch and Amazon SageMaker Notebook instances.
- AWS Lambda - An AWS Lambda function that automates any migrations needed for the Metadata service.
Steps for AWS CloudFormation Deployment
- Navigate to Services and select CloudFormation under the Management and Governance heading (or search for it in the search bar) in your AWS console.
- Click Create stack and select With new resources (standard).
- Download the template from this location and save it locally.
- Ensure Template is ready remains selected, choose Upload a template file, click Choose file, and upload the file saved in the previous step.
- Name your stack, select your parameters, and click Next, noting that if you enable APIBasicAuth and/or CustomRole, further configuration will be required after deployment.
- If desired, feel free to tag your stack in whatever way best fits your organization. When finished, click Next.
- Ensure you select the check box next to I acknowledge that AWS CloudFormation might create IAM resources. and click Create stack.
- Wait roughly 10-15 minutes for deployment to complete. The Stack status will eventually change to CREATE_COMPLETE.
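The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are configured. The helper names `to_cfn_parameters` and `deploy_stack` are hypothetical, and the stack name, template path, and parameter values are supplied by the caller. Passing capabilities plays the role of the IAM acknowledgement check box; whether this template needs `CAPABILITY_NAMED_IAM` in addition to `CAPABILITY_IAM` is an assumption to check against the template itself.

```python
def to_cfn_parameters(params):
    """Convert a plain dict into the Parameters list CloudFormation expects."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(params.items())
    ]


def deploy_stack(stack_name, template_path, params):
    """Create the stack and block until creation finishes (hypothetical helper)."""
    import boto3  # imported lazily so to_cfn_parameters works without boto3

    with open(template_path) as handle:
        template_body = handle.read()

    cfn = boto3.client("cloudformation")
    cfn.create_stack(
        StackName=stack_name,
        # TemplateBody is limited to 51,200 bytes; a larger template has to be
        # uploaded to S3 and referenced with TemplateURL instead.
        TemplateBody=template_body,
        Parameters=to_cfn_parameters(params),
        # Equivalent of the "might create IAM resources" acknowledgement.
        # Including CAPABILITY_NAMED_IAM is an assumption; drop it if unneeded.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    )
    # Raises a WaiterError if creation fails or takes too long.
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
```

Parameter values are passed as strings, so a boolean-style parameter such as EnableUI would be given as "true" or "false" rather than as a Python boolean.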
Once complete, you'll find an Outputs tab that contains values for the components generated by this CloudFormation template. Those values correlate to respective environment variables (listed next to the outputs) you'll set to enable cloud features within Metaflow.
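The same values shown in the Outputs tab can be read programmatically. This is a minimal sketch, assuming boto3 is installed and credentials are configured; `outputs_to_dict` and `fetch_stack_outputs` are hypothetical helper names. It only collects the raw output keys and values and does not attempt to map them to Metaflow environment variables, since that mapping is the one listed next to each output.

```python
def outputs_to_dict(stack_description):
    """Flatten the Outputs list of one stack description into a dict."""
    return {
        item["OutputKey"]: item["OutputValue"]
        for item in stack_description.get("Outputs", [])
    }


def fetch_stack_outputs(stack_name):
    """Look up a deployed stack and return its outputs (hypothetical helper)."""
    import boto3  # imported lazily so outputs_to_dict works without boto3

    cfn = boto3.client("cloudformation")
    response = cfn.describe_stacks(StackName=stack_name)
    # describe_stacks returns a list; a unique stack name yields one entry.
    return outputs_to_dict(response["Stacks"][0])
```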
Did you choose to enable APIBasicAuth and/or CustomRole and are wondering how they work? Below are some details on what happens when those features are enabled and how to make use of them.
- APIBasicAuth - In addition to TLS termination, Amazon API Gateway can generate an API key and restrict access to requests that pass that key in the 'x-api-key' HTTP header. This keeps flow information off the general Internet while still allowing remote connectivity for authenticated clients. Because exposing a credential as a CloudFormation output is a potential security problem, the key itself is not included in the stack outputs, and you'll need to request it from Amazon API Gateway. CloudFormation does, however, output the ID of the API key that belongs to your stack, making it easy to get the key and pass it to Metaflow. Use one of the two methods below to retrieve the key:
- From the AWS CLI, run the following:
aws apigateway get-api-key --api-key <YOUR_KEY_ID_FROM_CFN> --include-value | grep value
- From the AWS Console, navigate to Services and select API Gateway from Networking & Content Delivery (or search for it in the search bar). Click on your API, select API Keys from the left side, select the API key that corresponds to your Stack name, and click show next to API Key.
- CustomRole - This template can create an optional role, assumable by users or applications, with permissions limited to the resources Metaflow requires: the Amazon S3 bucket, AWS Batch Compute Environment, and Amazon SageMaker Notebook Instance created by this template. You will, however, need to modify the role's trust policy to grant access to the principals (users/roles/accounts) who will assume it, and your users will need to configure an appropriate role-assumption profile. The ARN of the Custom Role can be found in the Outputs tab of the CloudFormation stack under MetaflowUserRoleArn. To modify the trust policy to allow new principals, follow the directions here. Once you've granted access to the principals of your choice, have your users create a new Profile for the AWS CLI that assumes the role ARN by following the directions here.
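For the CustomRole option, the trust policy change can be sketched in code. This is a minimal illustration, assuming boto3 is installed and the caller is allowed to edit the role; `build_trust_policy` and `apply_trust_policy` are hypothetical helper names, and the principal ARNs are placeholders for your own users, roles, or accounts.

```python
import json


def build_trust_policy(principal_arns):
    """Return a trust policy that lets the given principals assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": sorted(principal_arns)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def apply_trust_policy(role_arn, principal_arns):
    """Replace the trust policy on the custom role (hypothetical helper)."""
    import boto3  # imported lazily so build_trust_policy works without boto3

    # The IAM API takes the role name, which is the last segment of the ARN.
    role_name = role_arn.split("/")[-1]
    iam = boto3.client("iam")
    # This call overwrites the existing trust policy, so include every
    # principal that should keep access, not only the new ones.
    iam.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(build_trust_policy(principal_arns)),
    )
```

Because the update replaces the whole policy document, it is worth reading the current trust policy first and merging any principals that already have access.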
Once you have followed all these steps, you can configure your Metaflow installation using the outputs from the CloudFormation stack.
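If APIBasicAuth is enabled, it can be useful to confirm that the key works before configuring Metaflow. The sketch below is one way to do that, assuming boto3 is installed for the key lookup; `fetch_api_key_value` and `build_authenticated_request` are hypothetical helper names, and the service URL is whatever endpoint your stack reports. The lookup mirrors the AWS CLI command shown earlier.

```python
import urllib.request


def build_authenticated_request(service_url, api_key):
    """Build a GET request that carries the key in the x-api-key header."""
    # urllib normalizes header capitalization; HTTP header names are
    # case-insensitive, so API Gateway still sees x-api-key.
    return urllib.request.Request(service_url, headers={"x-api-key": api_key})


def fetch_api_key_value(key_id):
    """Retrieve the key value for the key ID output by CloudFormation."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("apigateway")
    return client.get_api_key(apiKey=key_id, includeValue=True)["value"]
```

A request built this way can be sent with `urllib.request.urlopen`; a request without the header should be rejected by API Gateway when key authentication is enabled.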
Option to deploy Metaflow User Interface (
Please note: This section can be ignored if the EnableUI parameter is set to false (this is the default value).
This template deploys the UI with authentication using Amazon Cognito. For Cognito to work, you'll need to provide a DNS name and SSL certificate from AWS ACM. That means you'll need a few additional steps if using the UI:
- Choose a DNS name that you control. You can either register a new domain name or create a subdomain of one you already own.
- Generate and verify an SSL certificate valid for that name using AWS ACM. Follow the instructions from AWS for this.
- Deploy this CloudFormation template. You'll need to set EnableUI to "true", and in addition set:
  - PublicDomainName to the domain name you chose
  - CertificateArn to the certificate ARN from step 2 above
- After the CloudFormation template is deployed, make note of the LoadBalancerUIDNSName output value. You'll need to modify DNS settings to point your domain name to that name.
- If you're using Route53, create an A record that is an Alias and choose the load balancer from the drop down.
- If using a different DNS management tool/registrar, create a CNAME record that points to the LoadBalancerUIDNSName value.
- After DNS changes propagate, you should be able to navigate to the DNS name in your browser and see a login prompt. To create a user, go to AWS Console -> Cognito -> User Pools, find the pool that corresponds to this stack and create a new user under "Users and Groups".
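For Route53 users, the Alias record from the DNS step above can also be created through the API. This is a minimal sketch, assuming boto3 is installed; `build_alias_change` and `apply_alias_change` are hypothetical helper names. An alias record needs the load balancer's own hosted zone ID in addition to its DNS name; that ID is not mentioned in this guide and is assumed here to be looked up separately.

```python
def build_alias_change(domain_name, load_balancer_dns, load_balancer_zone_id):
    """Build a Route53 change batch for an Alias A record."""
    return {
        "Comment": "Point the Metaflow UI domain at the load balancer",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": domain_name,
                    "Type": "A",
                    # Alias records carry no TTL or ResourceRecords entries.
                    "AliasTarget": {
                        # Hosted zone of the load balancer, not of your domain.
                        "HostedZoneId": load_balancer_zone_id,
                        "DNSName": load_balancer_dns,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


def apply_alias_change(domain_zone_id, domain_name, lb_dns, lb_zone_id):
    """Submit the change to the hosted zone of your domain (hypothetical helper)."""
    import boto3  # imported lazily so build_alias_change works without boto3

    route53 = boto3.client("route53")
    route53.change_resource_record_sets(
        HostedZoneId=domain_zone_id,
        ChangeBatch=build_alias_change(domain_name, lb_dns, lb_zone_id),
    )
```

Here `lb_dns` would be the LoadBalancerUIDNSName output value and `domain_name` the value given for PublicDomainName.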