
How to convert a dictionary to dataframe in PySpark?

I am trying to convert a dictionary:

data_dict = {'t1': '1', 't2': '2', 't3': '3'}

into a DataFrame that looks like this:

key | value
----|------
t1  | 1
t2  | 2
t3  | 3

To do that, I tried:

from pyspark.sql.types import StructType, StructField, StringType

schema = StructType([
    StructField("key", StringType(), True),
    StructField("value", StringType(), True),
])
ddf = spark.createDataFrame(data_dict, schema)

But I got the below error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/apache-spark/2.4.5/libexec/python/pyspark/sql/session.py", line 748, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "/usr/local/Cellar/apache-spark/2.4.5/libexec/python/pyspark/sql/session.py", line 413, in _createFromLocal
    data = list(data)
  File "/usr/local/Cellar/apache-spark/2.4.5/libexec/python/pyspark/sql/session.py", line 730, in prepare
    verify_func(obj)
  File "/usr/local/Cellar/apache-spark/2.4.5/libexec/python/pyspark/sql/types.py", line 1389, in verify
    verify_value(obj)
  File "/usr/local/Cellar/apache-spark/2.4.5/libexec/python/pyspark/sql/types.py", line 1377, in verify_struct
    % (obj, type(obj))))
TypeError: StructType can not accept object 't1' in type <class 'str'>

So I tried again without specifying a schema, passing only the column data types:

ddf = spark.createDataFrame(data_dict, StringType())

and

ddf = spark.createDataFrame(data_dict, StringType(), StringType())

But both result in a DataFrame with a single column that contains only the keys of the dictionary:

+-----+
|value|
+-----+
|t1   |
|t2   |
|t3   |
+-----+
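
I suspect this happens because iterating over a dict only yields its keys, so createDataFrame never sees the values at all. A quick check in plain Python (no Spark needed) seems to confirm this:

data_dict = {'t1': '1', 't2': '2', 't3': '3'}

# Iterating a dict yields only its keys; .items() yields (key, value) pairs
print(list(data_dict))          # ['t1', 't2', 't3']
print(list(data_dict.items()))  # [('t1', '1'), ('t2', '2'), ('t3', '3')]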

Could anyone let me know how to convert a dictionary into a Spark DataFrame in PySpark?
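
Based on the above, the following seems to work for me (a minimal sketch, assuming Spark 2.4 and the same schema as before), but I'd like to know whether this is the recommended approach or whether there is a more idiomatic one:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.getOrCreate()

data_dict = {'t1': '1', 't2': '2', 't3': '3'}

schema = StructType([
    StructField("key", StringType(), True),
    StructField("value", StringType(), True),
])

# Pass the (key, value) pairs instead of the dict itself, so each tuple
# becomes one row matching the two-field schema
ddf = spark.createDataFrame(list(data_dict.items()), schema)
ddf.show()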

