Channel: Active questions tagged python - Stack Overflow

Finding the first row that meets conditions of a mask and selecting one row after it that meets a condition


This is an extension to this post.

My dataframe is:

import pandas as pd

df = pd.DataFrame(
    {
        'a': [100, 1123, 123, 100, 1, 0, 1],
        'b': [1000, 11123, 1123, 0, 55, 0, 1],
        'c': [100, 1123, 123, 999, 11, 50, 1],
        'd': [100, 1123, 123, 190, 1, 105, 1],
        'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    }
)

And this is the output that I want. I need to create column x:

      a      b     c     d  e    x
0   100   1000   100   100  a  NaN
1  1123  11123  1123  1123  b  NaN
2   123   1123   123   123  c  NaN
3   100      0   999   190  d  NaN
4     1     55    11     1  e  NaN
5     0      0    50   105  f    f
6     1      1     1     1  g  NaN

My mask is:

mask = (df.a > df.b)

And these are the steps needed:

a) Find the first row that meets the condition of the mask.

b) Get the value of column a in that row.

c) Among the rows after it, find the first row where that value lies between columns c and d. Being equal to one of them is also OK.

d) Get the value of column e in that row and put it in column x.
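The four steps could be sketched as a small function. This is only one possible reading of the question, not a confirmed answer: `add_x` is a hypothetical name, "between" is assumed to be inclusive and independent of which of c and d is larger, and only rows strictly after the mask row are searched (as in the example with rows 4, 5, ...).

```python
import numpy as np
import pandas as pd


def add_x(df: pd.DataFrame, mask: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with column x built per steps a) to d)."""
    out = df.copy()
    hit = np.zeros(len(out), dtype=bool)   # marks the single row that receives a value
    m = mask.to_numpy(dtype=bool)
    if m.any():
        first = int(m.argmax())            # a) position of the first True in the mask
        val = out['a'].iloc[first]         # b) value of column a in that row
        c = out['c'].to_numpy()
        d = out['d'].to_numpy()
        # Assumption: c and d may come in either order, so sort them per row.
        lo, hi = np.minimum(c, d), np.maximum(c, d)
        # c) rows strictly after the mask row whose [lo, hi] range contains val
        cand = (np.arange(len(out)) > first) & (lo <= val) & (val <= hi)
        if cand.any():
            hit[int(cand.argmax())] = True
    out['x'] = out['e'].where(hit)         # d) copy e on the hit row, NaN elsewhere
    return out


df = pd.DataFrame(
    {
        'a': [100, 1123, 123, 100, 1, 0, 1],
        'b': [1000, 11123, 1123, 0, 55, 0, 1],
        'c': [100, 1123, 123, 999, 11, 50, 1],
        'd': [100, 1123, 123, 190, 1, 105, 1],
        'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    }
)
result = add_x(df, df.a > df.b)
print(result)
```

Because the mask is passed in as an argument, any other boolean Series with the same index can be used. If c is always meant to be the lower bound, the `np.minimum`/`np.maximum` pair can be replaced by c and d directly.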

For example for the above dataframe:

a) The first row that meets the mask is row 3.

b) The value of column a in that row is 100.

c) Among the rows after row 3 (4, 5, ...), the first row where 100 lies between columns c and d is row 5.

d) The value of column e in row 5 is 'f', so 'f' is chosen for column x.
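The intermediate values of this example can be checked one step at a time. The snippet below is a sketch using label-based lookups; the variable names are illustrative, and `idxmax` on a boolean Series is only meaningful here because at least one value is True at each step.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [100, 1123, 123, 100, 1, 0, 1],
        'b': [1000, 11123, 1123, 0, 55, 0, 1],
        'c': [100, 1123, 123, 999, 11, 50, 1],
        'd': [100, 1123, 123, 190, 1, 105, 1],
        'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    }
)
mask = df.a > df.b

# a) label of the first True in the mask (guard with mask.any() in general)
first_label = mask.idxmax()

# b) value of column a in that row
val = df.loc[first_label, 'a']

# c) rows strictly after the mask row, tested against the c/d range
later = df.loc[first_label:].iloc[1:]
bounds = later[['c', 'd']]
between = bounds.min(axis=1).le(val) & bounds.max(axis=1).ge(val)
target_label = between.idxmax()   # guard with between.any() in general

# d) value of column e in the selected row
print(first_label, val, target_label, df.loc[target_label, 'e'])
# prints: 3 100 5 f
```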

This image clarifies the above steps:

(image not available)

This is what I have tried:

mask = (df.a > df.b)
val = df.loc[mask.cumsum().eq(1) & mask, 'a']
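For reference, this attempt returns a one-element Series rather than a scalar, and an empty Series when no row meets the mask. The sketch below shows both cases; `scalar`, `df_none`, and `val_none` are illustrative names, and `df_none` reuses the first of the additional dataframes given further down.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [100, 1123, 123, 100, 1, 0, 1],
        'b': [1000, 11123, 1123, 0, 55, 0, 1],
        'c': [100, 1123, 123, 999, 11, 50, 1],
        'd': [100, 1123, 123, 190, 1, 105, 1],
        'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    }
)
mask = df.a > df.b
val = df.loc[mask.cumsum().eq(1) & mask, 'a']
print(val)   # a Series with one entry: index 3, value 100

# Pull out the scalar only if the selection is not empty.
scalar = val.iloc[0] if not val.empty else None

# Same expression when no row meets the mask (a is never greater than b).
df_none = df.assign(a=[100, 1123, 123, -1, 1, 0, 1])
mask_none = df_none.a > df_none.b
val_none = df_none.loc[mask_none.cumsum().eq(1) & mask_none, 'a']
print(val_none.empty)   # True
```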

I prefer the solution to be generic like this answer.

I have provided some additional dataframes in case you need to test the code under subtly different conditions. For instance, what if no row meets the condition of the mask? In that case column x is all NaN. The column names are the same as in the df above.

df = pd.DataFrame({'a': [100, 1123, 123, -1, 1, 0, 1], 'b': [1000, 11123, 1123, 0, 55, 0, 1],
                   'c': [100, 1123, 123, 999, 11, 50, 1], 'd': [100, 1123, 123, 190, 1, 105, 1],
                   'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g']})

df = pd.DataFrame({'a': [100, 1123, 123, 100, 1, 0, 1], 'b': [1000, 11123, 1123, 0, 55, 0, 1],
                   'c': [100, 1123, 123, 999, 11, -1, 1], 'd': [100, 1123, 123, 190, 1, 10, 1],
                   'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g']})

df = pd.DataFrame({'a': [100, 1123, 123, 1, 1, 0, 100], 'b': [1000, 11123, 1123, 0, 55, 0, 1],
                   'c': [100, 1123, 123, 999, 11, -1, 50], 'd': [100, 1123, 123, 190, 1, 10, 101],
                   'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g']})

df = pd.DataFrame({'a': [100, 1123, 123, 100, 1, 1000, 1], 'b': [1000, 11123, 1123, 0, 55, 0, 1],
                   'c': [100, 1123, 123, 999, 11, 50, 500], 'd': [100, 1123, 123, 190, 1, 105, 2000],
                   'e': ['a', 'b', 'c', 'd', 'e', 'f', 'g']})
