Fast, Automatic Containerization of ML and AI Projects with Fast Bakery

A new feature in Outerbounds, Fast Bakery, builds containers for your ML/AI projects quickly and automatically. If you are in a hurry, scroll down to the benchmark section to see how Fast Bakery performs orders of magnitude faster than typical alternatives.

Machine learning, and AI in particular, requires significant compute power. Fortunately, modern cloud platforms provide scalable access to high-performance GPUs and other resources, so developers don’t have to be constrained by the limitations of local workstations.

To make cloud adoption seamless and effective, two key features are needed:

  1. Secure, cost-efficient access to on-demand cloud instances,
  2. Streamlined deployment of code and dependencies to remote environments.

Both capabilities are core features of Outerbounds. We have previously covered multi-cloud compute. Today, we focus on automated containerization of software dependencies with a new core feature in the platform: Fast Bakery.

Shipping container images - been there, done that

For the past ten years, the standard solution for packaging code and dependencies has been container images, Docker images in particular. Hence it’s safe to assume that most teams have an existing solution for building images, either on their laptops or through a CI/CD pipeline.

A challenge is that many container pipelines are designed for the needs of traditional software and DevOps teams, not for fast-moving ML/AI projects with a number of special needs:

  • ML/AI projects are experimental in nature, often requiring rapid testing at large scale. Developers need to iterate on their code constantly and try out new libraries easily, so the image may need to change frequently.
  • Yet, the image build process can take anything from tens of minutes to hours.
  • Hours can turn into days or worse if end users can’t alter the image themselves and have to request changes from a separate team (which tends to be constantly overburdened).
  • Editing and maintaining build recipes like Dockerfiles requires expertise outside core ML/AI skills. Any bugs in recipes can lead to hard-to-debug errors.
  • Images are not reproducible by default: Even a simple line of pip install can produce different results over time, as dependencies evolve, leading to more surprises unless one carefully freezes the complete dependency graph.
  • Managing multiple images and their versions manually can become untenable. On the other hand, forcing everyone to use a single image leads to unnecessarily large images, dependency conflicts, and arbitrary limitations.
  • ML/AI packages rely on specialized hardware like GPUs, TPUs, and Trainium, making it crucial to match packages with the target hardware and drivers like CUDA.
  • Images with modern ML/AI packages tend to be large, in the range of 5-10GB, adding overhead to all operations.

Moreover, using the resulting images is less straightforward than in the case of traditional software: It is not uncommon to run hundreds or thousands of tasks concurrently, so the image registry needs to be able to serve a massive number of concurrent downloads with a high throughput. Most existing image registries are not optimized for this, resulting in tasks taking minutes to start.

Introducing Fast Bakery

Fast Bakery is an efficient containerization backend within Outerbounds that packages ML/AI dependencies into reproducible images automatically, enabling rapid deployment and efficient scaling for demanding workloads. 

Since its inception, Metaflow has packaged user code automatically to simplify dependency management. To support third-party libraries, we first introduced isolated execution environments specified with @conda, now powered by mamba and Conda Forge, and later followed up with support for @pypi. Besides developer productivity, managing the software supply chain carefully is critical for compliance and security.

With Fast Bakery, we are supercharging the existing @pypi and @conda decorators by turning the environments they specify into container images automatically, as shown in the video below:

Fast Bakery addresses most of the issues we have observed with existing containerization setups:

  • Users can create images as a part of their existing development workflow. There is no need to learn new tools, Dockerfiles, or CI/CD systems.
  • The images are built blazing fast: A typical image is ready in seconds.
  • The images are maintained automatically as a part of the workflow, making it feasible to create small, purpose-built images even at the level of individual steps, which would be too complicated to manage manually.
  • The images are stored in a special registry that is optimized for highly parallelized ML/AI workloads.

Conveniently, Fast Bakery works with existing Metaflow flows that use @pypi or @conda, so no code changes are required.
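For illustration, here is a minimal sketch of such a flow. The package names, versions, and resource settings are placeholders chosen for the example, not recommendations:

```python
# trainflow.py - a minimal, illustrative flow; packages and versions are examples only.
from metaflow import FlowSpec, step, pypi, kubernetes

class TrainFlow(FlowSpec):

    @pypi(python="3.11", packages={"pandas": "2.2.2"})
    @step
    def start(self):
        import pandas as pd
        # A lightweight environment for data preparation.
        self.data = pd.DataFrame({"x": [1, 2, 3]})
        self.next(self.train)

    # A separate, purpose-built environment for the GPU step only.
    @kubernetes(gpu=1)
    @pypi(python="3.11", packages={"torch": "2.3.0"})
    @step
    def train(self):
        import torch
        print("CUDA available:", torch.cuda.is_available())
        self.next(self.end)

    @step
    def end(self):
        print("Done.")

if __name__ == "__main__":
    TrainFlow()
```

Running a flow like this with the --environment=fast-bakery option, described in the next section, is enough to have each step’s environment baked into its own image, with no Dockerfiles involved.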

Fast Bakery under the hood

Fast Bakery is like the Tesla Plaid: it might look unassuming on the surface, but once you're behind the wheel the difference is palpable, and in both cases a surprising amount of sophisticated engineering works under the hood.

Let’s take a quick tour of how Fast Bakery works:

  1. Fast Bakery is activated when you run or deploy a flow with @pypi or @conda using the --environment=fast-bakery option.
  2. Dependency resolution is heavily optimized and happens outside your workstation, so the full dependency graph can be resolved in a few hundred milliseconds - see the benchmarks below.
  3. With all the dependencies resolved, we can build the layers that make up an image in parallel. We also cache layers actively, allowing us to amortize build times across many images. The result is much faster than a typical sequential build process - see the illustrative sketch after this list.
  4. Instead of using an ordinary container registry, like ECR or Docker Hub, we use an object store like S3 as a registry to maximize throughput. Crucially, the layers are stored securely in your cloud account.
  5. Once an image is available, we can use it to launch tasks as Kubernetes pods.
  6. Behind the scenes, Metaflow takes care of using the right image, alleviating the need to manage images and tags manually.
  7. Thanks to the optimized image registry, even thousands of concurrent tasks can pull the image quickly without throttling.
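As an illustration of steps 2 and 3 above, here is a deliberately simplified sketch of parallel, content-addressed layer building. This is only a conceptual illustration, not Fast Bakery’s actual implementation, and all names in it are hypothetical:

```python
# Illustrative sketch only: parallel layer builds with content-addressed caching.
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cache, keyed by a hash of the layer's resolved contents.
LAYER_CACHE: dict[str, bytes] = {}

def layer_key(packages: dict[str, str]) -> str:
    # A layer is identified by the exact set of resolved package versions,
    # so identical layers are built once and reused across many images.
    return hashlib.sha256(json.dumps(packages, sort_keys=True).encode()).hexdigest()

def build_layer(packages: dict[str, str]) -> bytes:
    key = layer_key(packages)
    if key in LAYER_CACHE:
        return LAYER_CACHE[key]  # cache hit: the build cost is amortized
    # Placeholder for the real work of fetching packages and producing a layer.
    blob = b"layer tarball for " + key.encode()
    LAYER_CACHE[key] = blob
    return blob

def build_image(layers: list[dict[str, str]]) -> list[bytes]:
    # Layers are independent of each other, so they can be built concurrently
    # instead of sequentially as in a typical Dockerfile build.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(build_layer, layers))
```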

Fast Bakery is fast

How do these features translate into real-world performance? We ran a benchmark with five representative ML/AI environments:

We compared three approaches for creating the above execution environments:

  1. Fast Bakery,
  2. the existing @pypi and @conda decorators in Metaflow running on a large workstation, 
  3. the standard way of building and publishing images on GitHub Actions using Docker’s build-push-action. We distinguish between the image build time and the time to push to Docker Hub, to highlight the importance of a fast registry.

Here are the results:

Fast Bakery is significantly faster than the other approaches in all cases, but the difference is especially pronounced with large packages. Remarkably, Fast Bakery never took more than 40 seconds to produce an image, whereas GitHub Actions took over 5 minutes in the largest case. This is especially notable since the timings for Fast Bakery include the total wait time, whereas the timings for GitHub Actions exclude the overhead of launching and waiting for the actions to start.

Notice how the dark blue bars for GitHub Actions, representing dependency graph resolution time, are longer than the total time taken by Fast Bakery, showcasing the advantages of rapid dependency resolution. The light blue bars show that it can take even longer to publish the images to Docker Hub. There are certainly other CI/CD setups and registries that can perform faster than vanilla GitHub Actions and Docker Hub, but even with 50% speedups Fast Bakery would beat them handily.

These benchmarks measure build times for single-purpose images with a small set of libraries. In real-world scenarios, images often contain a much broader library set to support multiple use cases, leading to significantly larger image sizes. Such cases would benefit even more from Fast Bakery.

Fast Bakery vs. open-source Metaflow

Comparing the @pypi and @conda decorators available in open-source Metaflow to Fast Bakery is somewhat apples-to-oranges.

The former were specifically designed to work without centralized infrastructure or a dependency on any container tooling, making them easy to adopt in a self-service manner. Instead of baking images, they resolve and persist environments on top of any existing image. From the point of view of the user, the end result is exactly the same: an isolated execution environment that includes the desired packages.

Compared to @pypi and @conda running on a large workstation or a modern laptop with a fast internet connection, Fast Bakery can be 7x faster. The speedups can be arbitrarily large when comparing against a small workstation or a slow internet connection, as the open-source decorators perform all the work locally, whereas Fast Bakery can leverage centralized infrastructure, independent of the workstation.

It is worth noting that @pypi and @conda perform faster than plain pip or conda run manually, thanks to optimizations included in Metaflow, so you can expect Fast Bakery to perform much faster than your usual command line tools.

Another massive benefit of Fast Bakery is faster startup time for tasks. The @pypi and @conda decorators download and hydrate the environment for every task independently, whereas Fast Bakery benefits from layers cached on instances, as well as higher throughput to pull down the image.

Stay tuned for another set of benchmarks highlighting the effect on the task startup latencies.

Fast and featureful bakery

In addition to packaging open-source libraries in a pristine container image, Fast Bakery includes several advanced features:

  • Install from a private package registry - many companies host a private PyPI registry or a Conda channel. Fast Bakery integrates with the authentication infrastructure in Outerbounds, allowing you to install private packages securely without separate secrets management.
  • Base image management - Outerbounds allows you to allow-list a set of base images to be used with Fast Bakery, which users can choose from on the fly.
  • Augment packages in the base image - By default, @pypi and @conda create an isolated execution environment, independent of the base image. Optionally, you can choose to avoid creating a separate environment and augment packages in the base image instead, which makes it easy for users to customize images for their specific needs.
  • Secure, multi-cloud images - Outerbounds supports multi-cloud compute natively: You can use images created with Fast Bakery across clouds. For instance, you can create an image with PyTorch and CUDA and use it in a GPU cluster on CoreWeave, or use the same images across AWS, Azure, and GCP.
  • Task-level audit trail - The full dependency graph of each task is persisted and versioned, allowing you to inspect execution environments for debugging and auditing purposes.
  • HIPAA and SOC2 compliant - Fast Bakery is covered by our HIPAA and SOC2 compliance so it is a perfect match for security-conscious environments which need to manage their software supply chains carefully. 

The combination of speed and versatility allows teams to rethink their approach to images and dependency management in general. For more practical examples, take a look at Managing Dependencies in the Outerbounds documentation. 

Try it yourself

Fast Bakery is available on Outerbounds today. Sign up for a free trial and take it for a spin!

Start building today

Join our office hours for a live demo! Whether you're curious about Outerbounds or have specific questions - nothing is off limits.

