This is my DataFrame:
import pandas as pd

df = pd.DataFrame({
    'a': [98, 97, 100, 135, 103, 100, 105, 109, 130],
    'b': [100, 103, 101, 105, 110, 120, 101, 150, 160]
})
And this is the desired output. I want to create column c:
     a    b    c
0   98  100  100
1   97  103  100
2  100  101  100
3  135  105  100
4  103  110  110
5  100  120  110
6  105  101  101
7  109  150  150
8  130  160  150
It is not easy for me to describe the issue in plain English, since it is a little bit complicated. c is df.b.cummin(), but under certain conditions it changes. I describe it row by row:
The process starts with:
df['c'] = df.b.cummin()
The condition that changes c is:
cond = df.a.shift(1) > df.c.shift(1)
Now the rows that matter are the ones where cond == True. For these rows df.c = df.b, and the cummin() of b RESETS, so it starts again from that row.
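For example, the first row where cond is True is row 4 (because a in row 3 is 135, which is greater than c in row 3, which is 100). So c changes to 110 (in other words, whatever b is in that row). And for row 5 it is the cummin() of b starting from row 4. The logic is the same to the end.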
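To make the row-by-row rule concrete, here is a minimal sketch of it as a plain Python loop. It is only meant to pin down the logic I am describing, not the vectorized solution I am after. It assumes the DataFrame is not empty and that row 0 simply takes b as its starting value; the local names a, b and c are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [98, 97, 100, 135, 103, 100, 105, 109, 130],
    'b': [100, 103, 101, 105, 110, 120, 101, 150, 160],
})

a = df['a'].tolist()
b = df['b'].tolist()

# Assumption: the first row has no previous row, so c starts at b[0].
c = [b[0]]
for i in range(1, len(b)):
    if a[i - 1] > c[i - 1]:
        # Condition met on the previous row: c takes b and the running min restarts here.
        c.append(b[i])
    else:
        # Otherwise continue the running minimum from the last reset point.
        c.append(min(c[i - 1], b[i]))

df['c'] = c
print(df)
```

The important detail is that the condition looks at the previous value of c that was actually produced by this rule, not at the plain cummin() of b.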
This is one of my attempts. But it does not work where the cond kicks in:
df['c'] = df.b.cummin()
df.loc[df.a.shift(1) > df.c.shift(1), 'c'] = df.b
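As far as I can tell, the attempt fails because the condition is evaluated once against the plain cummin() (which is 100 in every row here), so it never sees the updated values of c, and the running minimum never restarts. Here is a small sketch that compares the attempt with the desired column; expected is just the desired output typed in by hand, and mismatch is a name I made up for the comparison.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [98, 97, 100, 135, 103, 100, 105, 109, 130],
    'b': [100, 103, 101, 105, 110, 120, 101, 150, 160],
})

# Desired column c, copied from the table in the question.
expected = pd.Series([100, 100, 100, 100, 110, 110, 101, 150, 150], index=df.index)

# The attempt: a plain cummin, then overwrite rows where the condition holds.
df['c'] = df.b.cummin()
cond = df.a.shift(1) > df.c.shift(1)  # compared against the un-reset cummin
df.loc[cond, 'c'] = df.b

# Rows where the attempt differs from the desired output.
mismatch = df.index[df['c'] != expected].tolist()
print(df['c'].tolist())  # [100, 100, 100, 100, 110, 120, 100, 150, 160]
print(mismatch)          # [5, 6, 8]
```

Row 4 comes out right, but rows 5, 6 and 8 are wrong because they depend on the value of c that was reset in an earlier row.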