Channel: Active questions tagged python - Stack Overflow

Polars (Python): replace DataFrame values using a mask that is itself another Polars DataFrame


How can I change the values of (or recreate) a DataFrame using another Polars DataFrame as a boolean mask? That is, not just single-column vectors (Series): both the data and the mask are DataFrames.

For example, set every value to 1000 where amount > 270. In the input below, the value in the bottom row (299.91544) would become 1000.

Input:

            apples[0].amount  apples[1].amount  ...  apples[3].amount  apples[4].amount
    0                    NaN         321.68012  ...               NaN               NaN
    1                    NaN               NaN  ...               NaN         259.70487
    2                    NaN               NaN  ...               NaN         259.70487
    3                    NaN               NaN  ...               NaN         259.70487
    4                    NaN               NaN  ...               NaN         259.70487
    ...                  ...               ...  ...               ...               ...
    440582          79.57273               NaN  ...               NaN               NaN
    440583               NaN               NaN  ...               NaN               NaN
    440584               NaN               NaN  ...               NaN               NaN
    440585               NaN               NaN  ...               NaN               NaN
    440586               NaN               NaN  ...         299.91544               NaN

    [440587 rows x 5 columns]

Expected Output:

            apples[0].amount  apples[1].amount  ...  apples[3].amount  apples[4].amount
    0                    NaN        1000.00000  ...               NaN               NaN
    1                    NaN               NaN  ...               NaN         259.70487
    2                    NaN               NaN  ...               NaN         259.70487
    3                    NaN               NaN  ...               NaN         259.70487
    4                    NaN               NaN  ...               NaN         259.70487
    ...                  ...               ...  ...               ...               ...
    440582          79.57273               NaN  ...               NaN               NaN
    440583               NaN               NaN  ...               NaN               NaN
    440584               NaN               NaN  ...               NaN               NaN
    440585               NaN               NaN  ...               NaN               NaN
    440586               NaN               NaN  ...        1000.00000               NaN

    [440587 rows x 5 columns]

Another example: cum_sum_volume_apples input:

            apples[0].amount  apples[1].amount  ...  apples[3].amount  apples[4].amount
    0              321.66164        1322.18012  ...        1581.98712        1683.34388
    1              321.66164         574.39164  ...         849.15207        1260.20487
    2              321.66164         574.39164  ...         849.15207        1260.20487
    3              321.66164         574.39164  ...         849.15207        1260.20487
    4              321.66164         574.39164  ...         849.15207        1260.20487
    ...                  ...               ...  ...               ...               ...
    440582        1080.07273        1089.38273  ...        3248.32543        3266.94847
    440583           9.06278          26.69990  ...        1107.99783        1117.30783
    440584         346.34516         363.98228  ...        1445.28021        1454.59021
    440585         346.34516         363.98228  ...         882.09418         891.40418
    440586         426.89556         773.24072  ...        1300.41544        1308.98974

    [440587 rows x 5 columns]

at_or_above_threshold_mask (threshold ~1000):

            apples[0].amount  apples[1].amount  ...  apples[3].amount  apples[4].amount
    0                  False              True  ...             False             False
    1                  False             False  ...             False              True
    2                  False             False  ...             False              True
    3                  False             False  ...             False              True
    4                  False             False  ...             False              True
    ...                  ...               ...  ...               ...               ...
    440582              True             False  ...             False             False
    440583             False             False  ...             False             False
    440584             False             False  ...             False             False
    440585             False             False  ...             False             False
    440586             False             False  ...              True             False

    [440587 rows x 5 columns]

How can I select just the True positions of at_threshold_mask in another DataFrame with the same dimensions (same number of rows and columns)? (A sample could apply a mask to the existing cum_sum_volume_apples above.)

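    import polars as pl

    # cum_sum_volume_apples and volume_threshold are defined earlier (not shown here)
    cum_sum_all = pl.cum_sum_horizontal("*")

    # Running count, left to right, of columns at or above the threshold
    at_or_above_threshold_boolean_cum_sum = (
        (cum_sum_volume_apples >= volume_threshold)
        .select(cum_sum_all)
        .unnest("cum_sum")
    )

    # True once at least one column so far has reached the threshold
    at_or_above_threshold_mask = at_or_above_threshold_boolean_cum_sum >= 1
    # True where exactly one column so far has reached the threshold
    # (the first crossing when the sums are non-decreasing)
    at_threshold_mask = at_or_above_threshold_boolean_cum_sum == 1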
