I have a dataframe of the form:

item_id  name   attribute    value
1        Name1  curator      "Aaron"
1        Name1  description  "Some description 1"
2        Name2  curator      "Bboyd"
2        Name2  curator      "Biyd"
2        Name2  description  "Some description 2"
etc
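For reproducibility, here is a minimal sketch that builds the example frame. It assumes every entry in value is a plain string (the quotes in the table are only delimiters) and that item_id is an integer; neither is stated explicitly above.

```python
import pandas as pd

# Example data in long format: one row per (item, attribute, value).
# Assumption: all values are plain strings and item_id is an integer.
df = pd.DataFrame(
    {
        "item_id": [1, 1, 2, 2, 2],
        "name": ["Name1", "Name1", "Name2", "Name2", "Name2"],
        "attribute": ["curator", "description", "curator", "curator", "description"],
        "value": [
            "Aaron",
            "Some description 1",
            "Bboyd",
            "Biyd",
            "Some description 2",
        ],
    }
)

print(df)
```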
I want to pivot this dataframe on the attribute column, concatenating multiple values for the same attribute, to arrive at:

item_id  name   curator        description
1        Name1  "Aaron"        "Some description 1"
2        Name2  "Bboyd//Biyd"  "Some description 2"
etc

I would prefer a solution that works when the unique values in the attribute column are not known in advance, since they may change over time in my application. For example, a new attribute price may appear.
Following a related question (whose solution is not quite what I want), I tried pivot_table like this:
df = df.pivot_table(index=['item_id', 'name'], columns='attribute', values='value', aggfunc=lambda x: '///'.join(x.to_list()))
and, as suggested in the answers there, I also tried:
df = df.groupby(['item_id', 'name', 'attribute'])['value'].aggregate(lambda x: x).unstack().reset_index()
Neither of these solves my problem, because with both I get:

item_id  name   curator  description
1        Name1  "Aaron"  NaN
1        Name1  NaN      "Some description 1"
2        Name2  "Bboyd"  NaN
2        Name2  "Biyd"   NaN
2        Name2  NaN      "Some description 2"
etc

Looking at the second attempt, it's clear to me why this happens: we group by the attribute. In the first, it's less clear what pivot_table does with aggfunc, that is, what exactly it gets applied to, so maybe I could get the desired result with a different aggfunc? Either way, what solution do you suggest? It would be equally fine if I could somehow convert the resulting table from pivot_table/groupby into my desired form, but honestly I don't see how to do that, at least not with standard pandas functions.
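To make the aggfunc question concrete: as far as I understand, pivot_table calls aggfunc once per (index, column) group and passes it a Series holding that group's values. If that is right, something like the sketch below should give the shape I am after. The example data is rebuilt so the snippet runs on its own; the "//" separator and the name wide are just my choices, and it assumes all values are strings.

```python
import pandas as pd

# Example data in long format (assumption: all values are plain strings).
df = pd.DataFrame(
    {
        "item_id": [1, 1, 2, 2, 2],
        "name": ["Name1", "Name1", "Name2", "Name2", "Name2"],
        "attribute": ["curator", "description", "curator", "curator", "description"],
        "value": ["Aaron", "Some description 1", "Bboyd", "Biyd", "Some description 2"],
    }
)

wide = (
    df.pivot_table(
        index=["item_id", "name"],
        columns="attribute",
        values="value",
        # s is the Series of values for one (item_id, name, attribute) group;
        # join them into a single string.
        aggfunc=lambda s: "//".join(s),
    )
    .reset_index()
    # Drop the leftover "attribute" label on the column axis.
    .rename_axis(columns=None)
)

print(wide)
```

Nothing here names the attribute values explicitly, so a new attribute such as price would simply show up as another column.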