I was writing partitioned Parquet data to S3, and it worked fine with AWS Wrangler (awswrangler):
```python
import asyncio

import awswrangler as wr

basename_template = 'part.'
partitioning = ['cust_id', 'file_name', 'added_year', 'added_month', 'added_date']
s3_path = "s3://customer-data-lake/main/parquet_data"

# Runs inside an async function; `batch` is a pyarrow RecordBatch, and
# MAX_ROWS_PER_FILE and the boto3 `s3_session` are defined elsewhere.
loop = asyncio.get_event_loop()
await loop.run_in_executor(
    None,
    lambda: wr.s3.to_parquet(
        df=batch.to_pandas(),
        path=s3_path,
        dataset=True,
        max_rows_by_file=MAX_ROWS_PER_FILE,
        use_threads=True,
        partition_cols=partitioning,
        mode='append',
        boto3_session=s3_session,
        filename_prefix=basename_template,
    ),
)
```
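With `dataset=True` and `partition_cols`, awswrangler writes Hive-style `key=value` directories, so the keys come out looking roughly like this (the partition values and file suffix here are made up for illustration):

```
s3://customer-data-lake/main/parquet_data/cust_id=42/file_name=csv_1/added_year=2024/added_month=6/added_date=2024-06-01/part.<generated-suffix>.parquet
```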
Then I tried to move this to lakeFS by pointing the S3 endpoint at lakeFS:
```python
wr.config.s3_endpoint_url = lakefsEndPoint
```
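For completeness, the rest of my lakeFS setup looks roughly like this (the endpoint URL and credentials below are placeholders; the path follows lakeFS's `s3://<repository>/<branch>/<key>` addressing through its S3 gateway):

```python
import awswrangler as wr
import boto3

# Placeholder endpoint -- substitute the real lakeFS installation URL.
lakefsEndPoint = "https://lakefs.example.com"
wr.config.s3_endpoint_url = lakefsEndPoint

# The lakeFS S3 gateway authenticates with lakeFS credentials
# (placeholders below), not AWS ones.
s3_session = boto3.Session(
    aws_access_key_id="AKIA...",       # lakeFS access key ID (placeholder)
    aws_secret_access_key="********",  # lakeFS secret access key (placeholder)
)

# Through the lakeFS gateway this addresses repository "customer-data-lake",
# branch "main", prefix "parquet_data".
s3_path = "s3://customer-data-lake/main/parquet_data"
```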
After that, partitioning suddenly stopped working: every write just appends to the same partition.
This is the original S3 layout:

[screenshot: partitioned output on S3]

And this is after switching to lakeFS:

[screenshot: output via the lakeFS endpoint]

It just appends everything under csv_1. What am I doing wrong here?
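In case it helps to compare the two layouts directly, this is a minimal sketch of how I can list what actually lands under the prefix on each endpoint (it assumes the `s3_session` from the setup above, swapped between the AWS and lakeFS configurations):

```python
import awswrangler as wr

# Diagnostic sketch: list the object keys under the dataset prefix so the
# partition layout (cust_id=.../file_name=.../...) can be compared between
# plain S3 and the lakeFS endpoint.
for key in wr.s3.list_objects(
    "s3://customer-data-lake/main/parquet_data/",
    boto3_session=s3_session,
):
    print(key)
```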