This is my DataFrame:
import pandas as pddf = pd.DataFrame( {'a': [10, 20, 30, 400, 50, 60],'b': [897, 9, 33, 4, 55, 65] })
And this is the output that I want. I want to create column c
.
a b c0 10 897 NaN1 20 9 897.02 30 33 NaN3 400 4 400.04 50 55 NaN5 60 65 NaN
These are the steps needed:
a) Find rows that df.a > df.b
b) From the above rows compare the value from a
to its previous value from b
. If it was more than previous b
value, put a
in column c
otherwise put the previous b
.
For example:
a) Rows 1
and 3
met df.a > df.b
b) From row 1
, 20 is less than 897 so 897 is chosen. However in row 3
, 400 is greater than 33 so it is selected.
This image clarifies the point:
Image may be NSFW.
Clik here to view.
This is what I have tried but it does not work:
df.loc[df.a > df.b, 'c'] = max(df.a, df.b.shift(1))