Process killed when reading big feather file with pandas due to OOM

I'm trying to read a big feather file and my process gets killed:

gerardo@hal9000:~/Projects/Morriello$ du -h EtOH.feather
5.3G    EtOH.feather

I'm using pandas and pyarrow; here are the versions:

gerardo@hal9000:~/Projects/Morriello$ pip freeze | grep "pandas\|pyarrow"
pandas==2.2.1
pyarrow==15.0.0

When I try to load the dataset into a dataframe, the process just gets killed:

In [1]: import pandas as pd

In [2]: df = pd.read_feather("EtOH.feather", dtype_backend='pyarrow')
Killed

I'm on Linux, using Python 3.12, on a machine with 16 GB of RAM.

I can see the process gets killed due to an out-of-memory (OOM) error:

Out of memory: Killed process 918058 (ipython) total-vm:24890996kB, anon-rss:8455548kB, file-rss:640kB, shmem-rss:0kB, UID:1000 pgtables:17228kB oom_score_adj:100

How do I read the file in this case? And if I manage to read it, would there be noticeable advantages in converting it to Parquet format?
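A minimal sketch of the kind of approach I have in mind, but haven't been able to verify myself (assuming the file is a standard Arrow IPC / feather V2 file; the column names "col_a"/"col_b" and the output name "EtOH.parquet" are placeholders):

import pandas as pd
import pyarrow as pa
import pyarrow.ipc as ipc
import pyarrow.parquet as pq

# Option 1: only load the columns actually needed
# ("col_a"/"col_b" are placeholder names).
df = pd.read_feather("EtOH.feather", columns=["col_a", "col_b"],
                     dtype_backend="pyarrow")

# Option 2: stream the feather (Arrow IPC) file batch by batch and
# write it out as Parquet without holding the full table in RAM.
with pa.memory_map("EtOH.feather", "r") as source:
    reader = ipc.open_file(source)
    with pq.ParquetWriter("EtOH.parquet", reader.schema) as writer:
        for i in range(reader.num_record_batches):
            # Each record batch is read (and decompressed) on its own,
            # so peak memory is roughly one batch, not the whole file.
            writer.write_batch(reader.get_batch(i))

My understanding is that once the data is in Parquet, pq.ParquetFile("EtOH.parquet").iter_batches() can read it back incrementally with column selection, which is why I'm asking whether converting is worth it. One caveat I'm aware of: if the feather file was written as a single record batch, the loop above still has to materialize that one batch.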

