Channel: Active questions tagged python - Stack Overflow

Is there a sum(min_count=1) that is as fast as sum in DataFrameGroupBy.agg?


I have a DataFrame with a few million rows that I group and then sum over various columns via

df.groupby(…).agg({"foo": "sum","bar": "sum",…})

That is rather fast and takes around 10 seconds on my machine.

Now I need a variant of sum that keeps NaN values, i.e. one that returns NaN instead of 0 for a group whose values are all NaN, like

sum(min_count=1)
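To make the difference concrete, here is a minimal sketch on made-up toy data (the column names `key` and `foo` and the values are assumptions for illustration, not my real data). It compares the plain "sum" aggregation with the lambda variant:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "a" contains only NaN, group "b" has one real value.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "foo": [np.nan, np.nan, 1.0, np.nan],
})

# Plain sum skips NaN, so an all-NaN group collapses to 0.
plain = df.groupby("key").agg({"foo": "sum"})

# With min_count=1, a group needs at least one non-NaN value,
# otherwise the result for that group is NaN.
strict = df.groupby("key").agg({"foo": lambda s: s.sum(min_count=1)})

print(plain)
print(strict)
```

For group "a" the plain sum gives 0.0 while the min_count=1 variant gives NaN; for group "b" both give 1.0.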

The only way this seems to work with the groupby/agg from above is via

... agg({"foo": (lambda s: s.sum(min_count=1))})

But that is up to 10x slower.

I tried passing keyword arguments, but they do not seem to be propagated when a dict is passed to agg:

... agg({"foo": "sum","bar": "sum",}, min_count=1)

Is there a way to group/agg with the semantics of min_count=1 and the speed of a bare sum?
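One direction that may be worth trying, sketched here on hypothetical toy data: when every column needs the same aggregation, the groupby object's own sum method accepts min_count directly, so no Python lambda is involved. This is only a sketch under that assumption. It does not cover mixing sum with other aggregations in one dict, and it has not been timed on the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "key", "foo" and "bar" are placeholder names.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "foo": [np.nan, np.nan, 1.0, np.nan],
    "bar": [1.0, 2.0, np.nan, np.nan],
})

# Select the columns to sum, then call sum on the groupby object itself,
# passing min_count as a keyword argument instead of going through agg.
result = df.groupby("key")[["foo", "bar"]].sum(min_count=1)

print(result)
```

Here all-NaN groups stay NaN ("foo" in group "a", "bar" in group "b"), while the other cells hold the ordinary sums.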

