Use a Custom Docker Image

Question

How can I use a custom Docker image to run a Metaflow step?

Solution

Metaflow provides decorators, such as @batch and @kubernetes, that run steps on remote compute environments. These environments run each job in a container created from a Docker image.

1. Select an Image

You can either build your own image or choose an existing one. If you choose an existing image, make sure that Python can be invoked from the container.

You can tell Metaflow which image you want to use in several ways:

  • pass the image argument to a decorator, for example @batch(image="my_image:latest")
  • set it in the Metaflow config files
    • METAFLOW_DEFAULT_CONTAINER_REGISTRY controls which registry Metaflow pulls the image from. It defaults to Docker Hub but can also be a URL to a public or private ECR repository on AWS.
    • METAFLOW_DEFAULT_CONTAINER_IMAGE sets the default container image that Metaflow should use.
  • don't specify an image and let Metaflow default to the official Python image
    • in this case, the default image tag matches the major.minor version of the Python interpreter used to launch the flow
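
The order of precedence in the list above can be sketched in plain Python. This is only an illustration of the described behavior, not Metaflow's actual implementation: the helper names resolve_image and names_registry are hypothetical, the config is modeled as a plain dict, and the rule that the registry is prepended only when the image does not already name one is an assumption based on the common Docker naming convention.

```python
import platform


def names_registry(image):
    """Return True if the image reference already starts with a registry host.

    Assumption: the part before the first "/" is treated as a registry only
    if it contains "." or ":" or is "localhost" (the usual Docker convention).
    """
    if "/" not in image:
        return False
    first = image.split("/", 1)[0]
    return "." in first or ":" in first or first == "localhost"


def resolve_image(decorator_image=None, config=None):
    """Pick an image: decorator argument, then config default, then python:X.Y."""
    config = config or {}
    image = (
        decorator_image
        or config.get("METAFLOW_DEFAULT_CONTAINER_IMAGE")
        # Fall back to the official Python image matching the local interpreter.
        or "python:{}.{}".format(*platform.python_version_tuple()[:2])
    )
    registry = config.get("METAFLOW_DEFAULT_CONTAINER_REGISTRY")
    # Assumed rule: only prefix the registry if the image does not name one.
    if registry and not names_registry(image):
        image = "{}/{}".format(registry.rstrip("/"), image)
    return image


cfg = {"METAFLOW_DEFAULT_CONTAINER_IMAGE": "my_team/base:1.0"}
print(resolve_image("my_image:latest", cfg))  # the decorator argument wins
print(resolve_image(None, cfg))               # falls back to the config default
print(resolve_image())                        # python:<major>.<minor> of this interpreter
```

The image my_team/base:1.0 and the dict-based config are placeholders; in a real deployment these values come from your Metaflow configuration.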

2. Run Flow

For example, the flow in use_image_flow.py uses the official Python image in the run_in_container step. Here only the image name and tag are specified, but you can also pass the full URL of the image to @batch or @kubernetes, for example @batch(image="url-to-docker-repo/docker-image:version").

Note about GPU images

These decorators accept resource arguments such as cpu=1. If your Metaflow deployment has access to compute instances with GPUs, you can also set gpu=N, and Metaflow will automatically prepare your image so that it works with GPUs. For the example here, having access means that the AWS Batch compute environment must be able to launch EC2 instances with GPUs.

use_image_flow.py
from metaflow import FlowSpec, step, batch

class UseImageFlow(FlowSpec):

    @step
    def start(self):
        self.next(self.run_in_container)

    # Run this step on AWS Batch inside the official python:3.10 image.
    @batch(image="python:3.10", cpu=1)
    @step
    def run_in_container(self):
        # Stored as an artifact so it can be read back after the run.
        self.artifact_from_container = 7
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    UseImageFlow()

python use_image_flow.py run
     Workflow starting (run-id 685):
[685/start/3463 (pid 33227)] Task is starting.
[685/start/3463 (pid 33227)] Task finished successfully.
[685/run_in_container/3464 (pid 33231)] Task is starting.
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Task is starting (status SUBMITTED)...
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Task is starting (status STARTING)...
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Task is starting (status RUNNING)...
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Setting up task environment.
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Downloading code package...
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Code package downloaded.
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Task is starting.
[685/run_in_container/3464 (pid 33231)] [98f8111b-76f1-4e04-97d4-7d7bc2c58abd] Task finished with exit code 0.
[685/run_in_container/3464 (pid 33231)] Task finished successfully.
[685/end/3465 (pid 33245)] Task is starting.
[685/end/3465 (pid 33245)] Task finished successfully.
Done!

3. Access Artifacts Outside of Flow

The following can be run in a Python script or notebook to access the artifact produced in the container step:

from metaflow import Flow
run = Flow("UseImageFlow").latest_run
assert run.successful

# get data produced in containerized step
artifact = run.data.artifact_from_container
assert artifact == 7

Further Reading