I have a DataFrame with a few million rows, for which I do a groupby and sum various columns via
df.groupby(…).agg({"foo": "sum", "bar": "sum", …})
That is rather fast and takes around 10 seconds on my machine.
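For concreteness, here is a minimal stand-in for that setup (the key and column names are placeholders; the real frame has more columns):

```python
import numpy as np
import pandas as pd

# Stand-in data: a few million rows, one grouping key, two float columns with some NaNs.
rng = np.random.default_rng(0)
n = 3_000_000
df = pd.DataFrame({
    "key": rng.integers(0, 100_000, size=n),
    "foo": rng.normal(size=n),
    "bar": rng.normal(size=n),
})
df.loc[rng.random(n) < 0.1, ["foo", "bar"]] = np.nan

# The fast version: the string "sum" uses pandas' built-in (cythonized) aggregation.
result = df.groupby("key").agg({"foo": "sum", "bar": "sum"})
```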
Now I need a variant of sum that keeps NaN for groups whose values are all NaN (instead of turning them into 0), i.e. the semantics of
sum(min_count=1)
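The difference only shows up when there is nothing but NaN to sum: the default sum turns that into 0, while min_count=1 keeps the NaN. For example:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan])
print(s.sum())             # 0.0 -- an all-NaN sum collapses to zero by default
print(s.sum(min_count=1))  # nan -- fewer than 1 non-NA value, so the NaN is kept
```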
The only way I have found to make this work with the groupby/agg from above is via
... agg({"foo": (lambda s: s.sum(min_count=1))})
But that is up to 10x slower.
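A rough sketch of the comparison behind that figure, reusing the stand-in df from the first snippet (exact numbers obviously depend on the data and machine):

```python
import time

def timed(label, fn):
    # Crude one-shot wall-clock timer; enough to show an order-of-magnitude gap.
    t0 = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - t0:.1f}s")

timed("builtin 'sum'",
      lambda: df.groupby("key").agg({"foo": "sum", "bar": "sum"}))
timed("lambda sum(min_count=1)",
      lambda: df.groupby("key").agg({"foo": lambda s: s.sum(min_count=1),
                                     "bar": lambda s: s.sum(min_count=1)}))
```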
I also tried passing the keyword argument directly, but it does not seem to be propagated when agg is given a dict:
... agg({"foo": "sum", "bar": "sum"}, min_count=1)
Is there a way to group/agg with the semantics of min_count=1 and the speed of a bare sum?