I have two dataframes: df1 with shape (34, 1151649) and df2 with shape (76, 3467). I would like to perform a pandas where followed by a groupby on them, but the operation is quite slow. I'm wondering if there's any way to speed up the code. Sample code is as follows.
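
    import numpy as np
    import pandas as pd

    # df1 has two-level columns (n1, n2); df2 has single-level columns named n1
    df1 = pd.DataFrame(np.arange(6).reshape(2, 3), index=[1, 2],
                       columns=pd.MultiIndex.from_tuples((('a', 1), ('a', 2), ('b', 3)), names=['n1', 'n2']))
    df2 = pd.DataFrame(np.arange(6).reshape(3, 2), index=[0, 1, 2],
                       columns=pd.Index(['a', 'c'], name='n1'))
    df1
    df2
    # keep df1 values where df2 == 2, then sum over the second column level
    df1.where(df2 == 2).groupby(level=1, axis=1).sum()
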
The output is as follows: