Is there an example of a Python Dataflow Flex Template with more than one file, where the script imports other files from the same folder?
My project structure is like this:
├── pipeline
│   ├── __init__.py
│   ├── main.py
│   ├── setup.py
│   ├── custom.py
I'm trying to import custom.py from main.py in a Dataflow Flex Template.
I receive the following error in the pipeline execution:
ModuleNotFoundError: No module named 'custom'
The pipeline works fine if I include all of the code in a single file and don't make any imports.
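As far as I can tell, this is just standard Python import resolution: `import custom` only succeeds if the directory containing custom.py is on `sys.path`, or if the package has been installed (e.g. via setup.py). A minimal reproduction of the error outside Dataflow (file names and values are illustrative):

```python
import os
import sys
import tempfile

# Create a throwaway directory containing a custom.py module.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "custom.py"), "w") as f:
    f.write("VALUE = 42\n")

try:
    import custom  # the directory is not on sys.path yet
except ModuleNotFoundError as err:
    print("import failed:", err)

# Putting the directory on sys.path (which installing the package
# effectively does) makes the same import succeed.
sys.path.insert(0, workdir)
import custom
print("import succeeded:", custom.VALUE)
```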
Example Dockerfile:
FROM gcr.io/dataflow-templates-base/python3-template-launcher-base

ARG WORKDIR=/dataflow/template/pipeline
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}

COPY pipeline /dataflow/template/pipeline
COPY spec/python_command_spec.json /dataflow/template/

ENV DATAFLOW_PYTHON_COMMAND_SPEC /dataflow/template/python_command_spec.json

RUN pip install avro-python3 pyarrow==0.11.1 apache-beam[gcp]==2.24.0

ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="${WORKDIR}/setup.py"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/main.py"
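For reference, the setup.py pointed to by FLEX_TEMPLATE_PYTHON_SETUP_FILE is a minimal setuptools stub along these lines (exact name/version metadata are illustrative):

```python
# setup.py -- minimal packaging stub so the launcher can install
# the pipeline code (and its local modules) before running it.
import setuptools

setuptools.setup(
    name="pipeline",       # illustrative package name
    version="0.0.1",       # illustrative version
    packages=setuptools.find_packages(),
)
```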
Python spec file:
{"pyFile":"/dataflow/template/pipeline/main.py"}
I am deploying the template with the following command:
gcloud builds submit --project=${PROJECT} --tag ${TARGET_GCR_IMAGE} .