Channel: Active questions tagged python - Stack Overflow

How to serialize Jinja2 template in PySpark?

I want to use a Jinja2 template to create a column in a DataFrame using PySpark. For example, given a column called name, I want to render the following template for each row to create another column called new_name.

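    from jinja2 import Template
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    TEMPLATE = """Hello {{ customize(name) }}!"""

    def customize(name):
        return name + "san"

    # The Template object is created on the driver and captured by the UDF closure.
    template = Template(source=TEMPLATE)
    template.globals["customize"] = customize

    def udf_foo(name):
        # render() takes the context as keyword arguments and returns the string.
        return template.render(name=name)

    convertUDF = udf(lambda z: udf_foo(z), StringType())

    df = df.select(df.name)
    df1 = df.withColumn("new_name", convertUDF(col("name")))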

When I run the code, I get the following error, which I think occurs because the template cannot be serialized successfully.

    An exception was thrown from the Python worker. Please see the stack trace below.
    'pyspark.serializers.SerializationError: Caused by Traceback (most recent call last):
      File "/databricks/spark/python/pyspark/serializers.py", line 189, in _read_with_length
        return self.loads(obj)
      File "/databricks/spark/python/pyspark/serializers.py", line 541, in loads
        return cloudpickle.loads(obj, encoding=encoding)
    TypeError: Template.__new__() missing 1 required positional argument: 'source''. Full traceback below:
    Traceback (most recent call last):
      File "/databricks/spark/python/pyspark/serializers.py", line 189, in _read_with_length
        return self.loads(obj)
      File "/databricks/spark/python/pyspark/serializers.py", line 541, in loads
        return cloudpickle.loads(obj, encoding=encoding)
    TypeError: Template.__new__() missing 1 required positional argument: 'source'

I have tried using other serializers such as Pickle and Kryo, but the error persists.

  1. Could this be something other than a serialization error?
  2. Do you know how to fix this so that Jinja2 can be used with PySpark?

Thanks in advance!
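
One possible workaround, sketched under the assumption that the failure comes from the Template object being pickled along with the UDF closure: create the template inside the function that runs on the worker, so that only the source string and the plain customize function are shipped from the driver. The names render_name and build_udf are hypothetical helpers introduced for this sketch and are not part of the original code.

```python
from jinja2 import Template

TEMPLATE = "Hello {{ customize(name) }}!"


def customize(name):
    return name + "san"


def render_name(name):
    # Pass nulls through unchanged so customize() is never called with None.
    if name is None:
        return None
    # Build the Template here, on the worker, instead of capturing a
    # driver-side Template instance in the closure. Only the TEMPLATE
    # string and the customize function need to be serialized.
    template = Template(TEMPLATE)
    template.globals["customize"] = customize
    return template.render(name=name)


def build_udf():
    # Hypothetical helper: PySpark is imported lazily so that the
    # rendering logic above can be checked without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    return udf(render_name, StringType())
```

With a DataFrame that has a name column, this would be applied as df.withColumn("new_name", build_udf()(col("name"))). Building the template on every call keeps the sketch simple but recompiles the template for each row, so on large data it may be worth constructing it once per partition or batch instead.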

