Use Artifacts in Metaflow Join Step


How can I pass data artifacts of a Metaflow flow through a join step? What are my options for merging artifacts?


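You can call merge_artifacts in the join step to carry artifacts from the incoming branches forward. Its exclude argument lets you leave specific upstream artifacts out of the merge. Watch for collisions between upstream artifact names: when branches set the same name to different values, assign that artifact yourself in the join step before calling merge_artifacts, or exclude it; otherwise the merge fails.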

This flow shows how to:

  • Access upstream values after branches are joined.
  • Select a value from a specific branch because there is a naming collision.
  • Exclude an upstream value from the merge.

from metaflow import FlowSpec, step

class JoinArtifacts(FlowSpec):

    @step
    def start(self):
        self.pre_branch_data = 0
        # fan out into two parallel branches
        self.next(self.branch_a, self.branch_b)

    @step
    def branch_a(self):
        self.x = 1 # define x
        self.a = "a"
        self.next(self.join)

    @step
    def branch_b(self):
        self.x = 2 # define another x!
        self.b = "b"
        self.next(self.join)

    @step
    def join(self, inputs):
        # pick which x to propagate
        self.x = inputs.branch_a.x
        # merge everything else, leaving `a` out
        self.merge_artifacts(inputs, exclude=["a"])
        self.next(self.end)

    @step
    def end(self):
        print("`pre_branch_data` " + \
              f"value is: {self.pre_branch_data}.")
        print(f"`x` value is: {self.x}.")
        print(f"`b` value is: {self.b}.")
        try:
            print(f"`a` value is: {self.a}.")
        except AttributeError:
            # `a` was excluded in the join step, so it is not set here
            print("`a` was excluded! \U0001F632")

if __name__ == "__main__":
    JoinArtifacts()

python <flow_file>.py run
     Workflow starting (run-id 1654221288038724):
[1654221288038724/start/1 (pid 71304)] Task is starting.
[1654221288038724/start/1 (pid 71304)] Task finished successfully.
[1654221288038724/branch_a/2 (pid 71314)] Task is starting.
[1654221288038724/branch_b/3 (pid 71315)] Task is starting.
[1654221288038724/branch_a/2 (pid 71314)] Task finished successfully.
[1654221288038724/branch_b/3 (pid 71315)] Task finished successfully.
[1654221288038724/join/4 (pid 71337)] Task is starting.
[1654221288038724/join/4 (pid 71337)] Task finished successfully.
[1654221288038724/end/5 (pid 71375)] Task is starting.
[1654221288038724/end/5 (pid 71375)] `pre_branch_data` value is: 0.
[1654221288038724/end/5 (pid 71375)] `x` value is: 1.
[1654221288038724/end/5 (pid 71375)] `b` value is: b.
[1654221288038724/end/5 (pid 71375)] `a` was excluded! 😲
[1654221288038724/end/5 (pid 71375)] Task finished successfully.
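
To make the merge rule behind this output easier to reason about, here is a minimal plain-Python sketch of it. It is a toy model, not Metaflow's implementation: the function name merge_artifacts_sketch and its already_set parameter are hypothetical, branches are modelled as plain dicts, values are compared with ==, and a collision raises a plain ValueError rather than whatever Metaflow raises. It assumes the behavior described above: values assigned in the join step win, excluded names are dropped, and any remaining name with differing values is an error.

```python
def merge_artifacts_sketch(branches, already_set=None, exclude=()):
    """Toy model of a join-step merge (hypothetical helper, not Metaflow code).

    branches: one dict of artifacts per incoming branch.
    already_set: artifacts assigned in the join step before merging.
    exclude: artifact names to leave out of the merge.
    """
    merged = dict(already_set or {})
    conflicts = []
    # Every artifact name that appears in at least one branch.
    names = sorted({name for branch in branches for name in branch})
    for name in names:
        if name in merged or name in exclude:
            continue  # resolved by hand in the join step, or deliberately dropped
        values = [branch[name] for branch in branches if name in branch]
        if any(value != values[0] for value in values[1:]):
            conflicts.append(name)  # same name, different values
        else:
            merged[name] = values[0]
    if conflicts:
        raise ValueError(f"unresolved artifact collisions: {conflicts}")
    return merged


# The same artifacts the two branches of the flow carry into the join.
branch_a = {"pre_branch_data": 0, "x": 1, "a": "a"}
branch_b = {"pre_branch_data": 0, "x": 2, "b": "b"}

merged = merge_artifacts_sketch(
    [branch_a, branch_b],
    already_set={"x": branch_a["x"]},  # mirrors self.x = inputs.branch_a.x
    exclude=["a"],                     # mirrors exclude=["a"]
)
print(merged)  # {'x': 1, 'b': 'b', 'pre_branch_data': 0}
```

Calling the sketch without already_set raises the ValueError for x, which is the situation the explicit assignment in the join step avoids.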

Further Reading