
Python app keeps OOM crashing on Pandas merge


I have a light Python app that should perform a very simple task, but it keeps crashing due to OOM.

What the app should do

  1. Load data from .parquet into a dataframe
  2. Calculate an indicator using the stockstats package
  3. Merge the freshly calculated data into the original dataframe -> here it crashes
  4. Store the dataframe as .parquet

Where it crashes

df = pd.merge(df, st, on=['datetime'])
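A common cause of a sudden OOM on this exact line is duplicate `datetime` keys in either frame: `pd.merge` then performs a many-to-many join whose row count grows multiplicatively. A minimal sketch with made-up data (not your real frames) showing the blow-up, and how the `validate` parameter catches it before it happens:

```python
import pandas as pd

# Two tiny frames that each repeat the same key 1000 times.
left = pd.DataFrame({"datetime": ["2023-11-14 11:15:00"] * 1000, "close": 2.187})
right = pd.DataFrame({"datetime": ["2023-11-14 11:15:00"] * 1000, "supertrend": 0.21495})

# A many-to-many merge multiplies the duplicates: 1000 x 1000 rows.
merged = pd.merge(left, right, on=["datetime"])
print(len(merged))  # 1000000

# validate="one_to_one" raises MergeError instead of silently exploding.
try:
    pd.merge(left, right, on=["datetime"], validate="one_to_one")
except pd.errors.MergeError as exc:
    print("duplicate merge keys detected:", exc)
```

If your real data can legitimately contain duplicate timestamps, de-duplicate (`df.drop_duplicates(subset=["datetime"])`) before merging.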

Using

  • Python 3.10
  • pandas~=2.1.4
  • stockstats~=0.4.1
  • Kubernetes 1.28.2-do.0 (running in Digital Ocean)

Here is the strange thing: the dataframe is very small (df.size is 208446, file size is 1.00337 MB, memory usage is 1.85537 MB).

Measured

import os

file_stats = os.stat(filename)
file_size = file_stats.st_size / (1024 * 1024)  # 1.00337 MB
df_mem_usage = dataframe.memory_usage(deep=True)
df_mem_usage_print = round(df_mem_usage.sum() / (1024 * 1024), 6)  # 1.85537 MB
df_size = dataframe.size  # 208446

Deployment info

App is deployed into Kubernetes using Helm with following resources set

resources:
  limits:
    cpu: 1000m
    memory: 6000Mi
  requests:
    cpu: 1000m
    memory: 4000Mi

I am using nodes with 4 vCPU + 8 GB memory, and the node is not under performance pressure.

kubectl top node node-xxx
NAME              CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-xxx          750m         19%    1693Mi          25%

Pod info

kubectl describe pod xxx
...
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Sun, 24 Mar 2024 16:08:56 +0000
      Finished:     Sun, 24 Mar 2024 16:09:06 +0000
...

Here is the CPU and memory consumption from Grafana. I am aware that very short memory or CPU spikes will be hard to see, but from a long-term perspective the app does not consume a lot of RAM. On the other hand, in my experience we run the same pandas operations on containers with less RAM and with much bigger dataframes, with no problems.

Grafana stats

How should I fix this? What else should I debug in order to prevent the OOM?
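One way to narrow this down (a debugging sketch, not part of the original app; the list-comprehension allocation stands in for the suspect merge) is to log both the Python-level heap peak and the process RSS around the operation, so a short-lived spike that Grafana misses still shows up:

```python
import resource
import tracemalloc

tracemalloc.start()

# Stand-in for the suspect operation (replace with the pd.merge call).
data = [list(range(1000)) for _ in range(100)]

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"python heap peak: {peak / 1024 ** 2:.2f} MiB")
# Note: on Linux ru_maxrss is reported in KiB (on macOS it is bytes).
max_rss_mib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
print(f"process max RSS: {max_rss_mib:.2f} MiB")
```

Logging these two numbers immediately before and after the merge in the pod would show whether the merge itself allocates far more than the 1.85 MB the dataframe suggests.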

Data and code example

Original dataframe (named df)

              datetime   open   high    low  close        volume
0  2023-11-14 11:15:00  2.185  2.187  2.171  2.187  19897.847314
1  2023-11-14 11:20:00  2.186  2.191  2.183  2.184   8884.634728
2  2023-11-14 11:25:00  2.184  2.185  2.171  2.176  12106.153954
3  2023-11-14 11:30:00  2.176  2.176  2.158  2.171  22904.354082
4  2023-11-14 11:35:00  2.171  2.173  2.167  2.171   1691.211455

New dataframe (named st).
Note: If trend_orientation = 1 => st_lower = NaN, if -1 => st_upper = NaN

              datetime   supertrend_ub  supertrend_lb    trend_orientation    st_trend_segment
0  2023-11-14 11:15:00   0.21495        NaN              -1                   1
1  2023-11-14 11:20:00   0.21495        NaN              -10                  1
2  2023-11-14 11:25:00   0.21495        NaN              -11                  1
3  2023-11-14 11:30:00   0.21495        NaN              -12                  1
4  2023-11-14 11:35:00   0.21495        NaN              -13                  1
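Before the merge it is worth asserting that both frames have unique `datetime` keys of the same dtype; a silent `object` vs `datetime64[ns]` mismatch or a few duplicated timestamps are the usual suspects when a merge on small frames misbehaves. A hypothetical pre-merge check (the toy `df`/`st` here are stand-ins for your frames):

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime(["2023-11-14 11:15:00", "2023-11-14 11:20:00"]),
    "close": [2.187, 2.184],
})
st = pd.DataFrame({
    "datetime": pd.to_datetime(["2023-11-14 11:15:00", "2023-11-14 11:20:00"]),
    "trend_orientation": [-1, -1],
})

# Report key dtype and duplicate count for each frame before merging.
for name, frame in {"df": df, "st": st}.items():
    dupes = int(frame["datetime"].duplicated().sum())
    print(f"{name}: dtype={frame['datetime'].dtype}, duplicates={dupes}")

# Mismatched key dtypes make merges slow and error-prone.
assert df["datetime"].dtype == st["datetime"].dtype
```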

Code example

import pandas as pd
import multiprocessing
import numpy as np
import stockstats


def add_supertrend(market):
    try:
        # Read data from file
        df = pd.read_parquet(market, engine="fastparquet")
        # Extract date column
        date_column = df['datetime']
        # Convert to stockstats object
        st_a = stockstats.wrap(df.copy())
        # Generate supertrend
        st_a = st_a[['supertrend', 'supertrend_ub', 'supertrend_lb']]
        # Add back datetime column
        st_a.insert(0, "datetime", date_column)
        # Add trend orientation using conditional columns
        conditions = [
            st_a['supertrend_ub'] == st_a['supertrend'],
            st_a['supertrend_lb'] == st_a['supertrend']
        ]
        values = [-1, 1]
        st_a['trend_orientation'] = np.select(conditions, values)
        # Remove not required supertrend values
        st_a.loc[st_a['trend_orientation'] < 0, 'st_lower'] = np.NaN
        st_a.loc[st_a['trend_orientation'] > 0, 'st_upper'] = np.NaN
        # Unwrap back to dataframe
        st = stockstats.unwrap(st_a)
        # Ensure correct data types are used
        st = st.astype({
            'supertrend': 'float32',
            'supertrend_ub': 'float32',
            'supertrend_lb': 'float32',
            'trend_orientation': 'int8'
        })
        # Add trend segments
        st_to = st[['trend_orientation']]
        st['st_trend_segment'] = st_to.ne(st_to.shift()).cumsum()
        # Remove trend value
        st.drop(columns=['supertrend'], inplace=True)
        # Merge ST with DF
        df = pd.merge(df, st, on=['datetime'])
        # Write back to parquet
        df.to_parquet(market, compression=None)
    except Exception as e:
        # Using proper logger in real code
        print(e)


def main():
    # Using fixed market as example, in real code market is fetched
    market = "BTCUSDT"
    # Using multiprocessing to free up memory after each iteration
    p = multiprocessing.Process(target=add_supertrend, args=(market,))
    p.start()
    p.join()


if __name__ == "__main__":
    main()
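Since `st` is derived from `df` row for row, one option worth considering is to skip the key-based merge entirely and attach the derived columns positionally. A sketch under that assumption (toy data; only valid if neither frame is reordered or filtered between the two steps):

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": ["2023-11-14 11:15:00", "2023-11-14 11:20:00"],
    "close": [2.187, 2.184],
})
st = pd.DataFrame({
    "datetime": ["2023-11-14 11:15:00", "2023-11-14 11:20:00"],
    "trend_orientation": [-1, -1],
    "st_trend_segment": [1, 1],
})

# Attach derived columns by position instead of merging on 'datetime';
# with no key matching there is no chance of a many-to-many explosion.
out = df.assign(**{c: st[c].to_numpy() for c in st.columns if c != "datetime"})
print(out)
```

This avoids the hash join that `pd.merge` builds, at the cost of relying on row alignment, so it only applies if the row order is guaranteed.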

Dockerfile

FROM python:3.10

ENV PYTHONFAULTHANDLER=1 \
    PYTHONHASHSEED=random \
    PYTHONUNBUFFERED=1 \
    PYTHONPATH=.

# Adding vim
RUN ["apt-get", "update"]

# Get dependencies
COPY requirements.txt .
RUN pip3 install -r requirements.txt

# Copy main app
ADD . .

CMD main.py
