Channel: Active questions tagged python - Stack Overflow

Pivoting a Pandas dataframe with strings as values where desired output has all columns populated

I have a dataframe of the form

item_id  name   attribute    value
1        Name1  curator      "Aaron"
1        Name1  description  "Some description 1"
2        Name2  curator      "Bboyd"
2        Name2  curator      "Biyd"
2        Name2  description  "Some description 2"
etc

I want to pivot this dataframe on the attribute column, concatenating multiple values for the same attribute, to arrive at:

item_id  name   curator        description
1        Name1  "Aaron"        "Some description 1"
2        Name2  "Bboyd//Biyd"  "Some description 2"
etc

I would prefer a solution that works when the unique values in the attribute column are not known in advance, since they may change over time in my application; for example, a new attribute price may appear.

As suggested in a related question (whose solution is different from what I want), I tried using pivot_table like this:

df = df.pivot_table(index=['item_id', 'name'],
                    columns='attribute',
                    values='value',
                    aggfunc=lambda x: '///'.join(x.to_list()))

As suggested in the answers there, I also tried groupby:

df = df.groupby(['item_id', 'name', 'attribute'])['value'].aggregate(
    lambda x: x).unstack().reset_index()

Neither of these attempts solves my problem, because with both I get:

item_id  name   curator  description
1        Name1  "Aaron"  NaN
1        Name1  NaN      "Some description 1"
2        Name2  "Bboyd"  NaN
2        Name2  "Biyd"   NaN
2        Name2  NaN      "Some description 2"
etc

Looking at the second attempt, it is clear to me why this happens: we group by the attribute. For the first, it is less clear what pivot_table passes to aggfunc, so maybe I could get the desired result with a different aggfunc? Either way, what solution do you suggest? It would be equally fine if I could somehow convert the table produced by pivot_table/groupby into my desired form, but honestly I don't see how to do that, at least not with standard pandas functions.
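For reference, here is a minimal sketch of what aggfunc receives and how a join-based aggfunc behaves on a small reconstruction of the sample data. As far as I understand it, pivot_table calls aggfunc once per (index, column) group with a Series of that group's values, so a function that joins the strings should return one cell per group. The helper name widen, the "//" separator, the unquoted string values, and the extra price row are assumptions made for illustration.

```python
import pandas as pd

# Reconstruction of the sample data (assumed: values are plain strings).
df = pd.DataFrame({
    "item_id": [1, 1, 2, 2, 2],
    "name": ["Name1", "Name1", "Name2", "Name2", "Name2"],
    "attribute": ["curator", "description", "curator", "curator", "description"],
    "value": ["Aaron", "Some description 1", "Bboyd", "Biyd", "Some description 2"],
})


def widen(frame, sep="//"):
    """Hypothetical helper: one row per (item_id, name), one column per attribute."""
    wide = frame.pivot_table(
        index=["item_id", "name"],
        columns="attribute",
        values="value",
        # aggfunc gets a Series holding all values of one (index, column) group.
        aggfunc=lambda s: sep.join(s),
    ).reset_index()
    wide.columns.name = None  # drop the leftover "attribute" label on the columns
    return wide


wide = widen(df)
print(wide)

# The column set comes from the data, so a new attribute needs no code change.
extra = pd.DataFrame(
    [{"item_id": 1, "name": "Name1", "attribute": "price", "value": "10"}]
)
wide_with_price = widen(pd.concat([df, extra], ignore_index=True))
print(wide_with_price.columns.tolist())
```

Items that lack an attribute (here, Name2 has no price) end up with NaN in that column. A groupby on item_id, name and attribute followed by an aggregate that joins the strings and then unstack should give an equivalent table; the key difference from the lambda x: x attempt is that the aggregate has to reduce each group to a single string.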

