Channel: Active questions tagged python - Stack Overflow

Concatenating one thousand DataFrames (30 MB each) takes half an hour with rechunk

I have a dataset partitioned by date, consisting of one thousand Arrow files with 300 columns each. I read the files with pl.read_ipc and concatenated them with rechunk=True. The rechunk step took more than half an hour on my server with 1 TB of RAM. Is there a way to accelerate this step, or any suggestions for processing or storing data in this scenario? Many thanks in advance for any help.
