Channel: Active questions tagged python - Stack Overflow

Pandas read_parquet() with filters raises ArrowNotImplementedError for partitioned column with int64 dtype

I am encountering an issue while trying to read a parquet file with Pandas read_parquet() function using the filters argument. One of the partitioned columns in the parquet file has an int64 dtype. However, when applying filters on this column, I'm getting the following error:

pyarrow.lib.ArrowNotImplementedError: Function 'equal' has no kernel matching input types (string, int64)

It seems that the partitioned column is being read back as a string rather than int64, which would explain the (string, int64) mismatch in the error. I've verified that the filter value is an integer, so the problem appears to lie in how the dtype of the partitioned column is inferred when the file is read.

How can I resolve this issue and correctly read the parquet file with filters applied to the partitioned column?

Here's the code I'm using; it's just this:

import pandas as pd

# Read the parquet file with filters
df = pd.read_parquet('path_to_file.parquet', filters=[('partition_column_name', '==', 123)])

P.S. I am sure the data is correct.

Thank you for your help!

