Access Parent Directories from a Flow
Question
How do I import common packages from a parent directory in a flow?
Solution
From your Metaflow code, you can access higher-level directories by adding a symlink. The symlinks in this example are represented by arrows pointing to the common package defined in the parent directory, ../common_utils:
.
    flow_type_1
        flow1.py
        common_utils -> ../common_utils
    flow_type_2
        flow2.py
        common_utils -> ../common_utils
    common_utils
        __init__.py
        some_module.py
Since Metaflow version 2.5.2, you can add symlinks in the directories containing your flow script, and Metaflow will dereference the symlinks and include their contents (the files in ../common_utils in this case) with your code. This allows you to import from common_utils inside flow steps, like those in the flow1.py script, whether they are run locally or remotely.
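To make "dereference" concrete, the following is a minimal sketch that rebuilds the layout above in a temporary directory and lists the files found under flow_type_1 when symlinked directories are followed. It only illustrates the idea with the standard library; it is not Metaflow's packaging code, and the helper names build_layout and files_with_links_followed are hypothetical. It assumes a platform where creating symlinks is permitted (for example Linux or macOS).

```python
import os
import tempfile


def build_layout(root):
    """Create the example layout under `root` (hypothetical helper)."""
    pkg = os.path.join(root, "common_utils")
    flow_dir = os.path.join(root, "flow_type_1")
    os.makedirs(pkg)
    os.makedirs(flow_dir)
    for name in ("__init__.py", "some_module.py"):
        open(os.path.join(pkg, name), "w").close()
    open(os.path.join(flow_dir, "flow1.py"), "w").close()
    # Relative target, resolved from the link's own directory (flow_type_1).
    os.symlink("../common_utils", os.path.join(flow_dir, "common_utils"))
    return flow_dir


def files_with_links_followed(flow_dir):
    """List files under `flow_dir`, descending into symlinked directories."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(flow_dir, followlinks=True):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), flow_dir))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    listing = files_with_links_followed(build_layout(root))
    print(listing)
```

The listing contains the files of common_utils as if they lived inside flow_type_1, which is the view of the directory that a dereferencing copy would produce.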
The rest of this page goes through an example that uses symlinks in this way.
1. Define Common Functionality
You can define generic functionality in a custom Python package and reuse it across flows in the directory structure shown above. Here is the definition, in common_utils/some_module.py, of a function that will be used in flow_type_1/flow1.py:
import os


def general_function():
    return os.getcwd()
Then you can import the function in the initialization module of the common_utils package, common_utils/__init__.py, so that it is available directly from the package:
from .some_module import general_function
2. Create Symlink
In order to import general_function from the common_utils package in a flow, you need to add a symlink from the directory containing your flow script to the higher-level package you want to import in the flow.
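The ln command is a standard Unix way to do this. Its -s option creates a symbolic link instead of a hard link. Run from the parent directory, the command below creates the link flow_type_1/common_utils. The target ../common_utils is stored as written and resolved relative to the link's own directory, so it points to the top-level common_utils package: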
ln -s ../common_utils flow_type_1/common_utils
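If you prefer to create the link from Python, for instance in a project setup script, the standard library offers os.symlink. The sketch below runs in a temporary directory so it does not touch your project, and it assumes a platform where creating symlinks is permitted. The variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "common_utils"))
    os.makedirs(os.path.join(root, "flow_type_1"))

    link = os.path.join(root, "flow_type_1", "common_utils")
    # Same effect as: ln -s ../common_utils flow_type_1/common_utils
    os.symlink("../common_utils", link, target_is_directory=True)

    # The literal text stored in the link (a relative path).
    stored_target = os.readlink(link)
    is_link = os.path.islink(link)
    # samefile follows the link, so this checks where it really points.
    resolves_to_package = os.path.samefile(
        link, os.path.join(root, "common_utils")
    )
    print(stored_target, is_link, resolves_to_package)
```

Because the stored target is relative, the link keeps working if the whole project directory is moved or cloned elsewhere, as long as flow_type_1 and common_utils stay siblings.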
3. Write a Flow that Imports the Function
This flow imports the common_utils package in the start step. This works because of the symlink to the common_utils package from the last section.
from metaflow import FlowSpec, step


class Flow1(FlowSpec):

    @step
    def start(self):
        # Resolved through the flow_type_1/common_utils symlink.
        from common_utils import general_function
        self.result = general_function()
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    Flow1()
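The import succeeds locally because Python puts the directory of the script being run at the front of sys.path, and the symlink makes common_utils appear in that directory. The sketch below checks this without Metaflow: it rebuilds the layout in a temporary directory and runs a small stand-in script, check_import.py (a hypothetical name, not part of the example project), that performs the same import as the start step. It assumes a platform where creating symlinks is permitted.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "common_utils")
    flow_dir = os.path.join(root, "flow_type_1")
    os.makedirs(pkg)
    os.makedirs(flow_dir)

    # Recreate the package from step 1.
    with open(os.path.join(pkg, "some_module.py"), "w") as f:
        f.write("import os\n\ndef general_function():\n    return os.getcwd()\n")
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("from .some_module import general_function\n")

    # Recreate the symlink from step 2.
    os.symlink("../common_utils", os.path.join(flow_dir, "common_utils"))

    # Hypothetical stand-in for flow1.py: same import, no Metaflow.
    script = os.path.join(flow_dir, "check_import.py")
    with open(script, "w") as f:
        f.write(
            "from common_utils import general_function\n"
            "print(general_function())\n"
        )

    # Run from the parent directory, as the flow itself is run.
    completed = subprocess.run(
        [sys.executable, script],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,
    )
    printed_cwd = completed.stdout.strip()
    same_dir = os.path.samefile(printed_cwd, root)
    print(same_dir)
```

Since general_function returns the current working directory, the stand-in script prints the directory it was launched from, which is the parent directory here.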
4. Run the Flow
python 'flow_type_1/flow1.py' run