Let's say I have a cartesian product dataframe like this:
import pandas as pd

soup = pd.DataFrame(data={
    'Beets':    [1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
    'Carrots':  [0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2],
    'Potatoes': [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
})
I need a new string column [Recipe] that lists all non-zero ingredients: their number followed by the letter code.
For example, the row with 1 beet, 0 carrots and 2 potatoes must show 1B, 2P. Carrots must be absent from that string, because their number in that row is 0. A row with all three ingredients must show, for example, 2B, 2C, 1P.
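To pin down the expected output, here is a minimal sketch of the rule for a single row in plain Python, without pandas. The helper name expected_recipe is only an illustration, not something from my actual code.

```python
def expected_recipe(beets, carrots, potatoes):
    """Return the recipe string for one row; ingredients with a zero count are skipped."""
    pairs = [(beets, "B"), (carrots, "C"), (potatoes, "P")]
    # Keep only non-zero counts and format each as <count><letter code>.
    return ", ".join(f"{count}{code}" for count, code in pairs if count > 0)


print(expected_recipe(1, 0, 2))  # 1B, 2P
print(expected_recipe(2, 2, 1))  # 2B, 2C, 1P
```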
It would seem .apply is the most straightforward solution, but I didn't get very far.
My plan was to pass columns into the function like this:
def recipe_namer(b, c, p):
    name = []
    if b > 0:
        name.append[b.astype(str) +'B, ']
    if c > 0:
        name.append[c.astype(str) +'C, ']
    if p > 0:
        name.append[p.astype(str) +'P, ']
    return name

soup['Recipe'] = soup.apply(recipe_namer(soup['Beets'], soup['Carrots'], soup['Potatoes']))
I was met with ValueError: The truth value of a Series is ambiguous. Which, fair enough.
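As far as I can tell, the error comes from the call itself: recipe_namer(...) runs once, before .apply ever gets involved, and receives whole columns, so b > 0 is a boolean Series rather than a single True or False. A minimal sketch that reproduces it, using a made-up three-element Series:

```python
import pandas as pd

b = pd.Series([1, 2, 0])  # stand-in for soup['Beets']
mask = b > 0              # element-wise comparison: a boolean Series, not one bool

try:
    if mask:              # Python asks the Series for a single truth value
        pass
except ValueError as err:
    message = str(err)
    print(message)
```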
After googling some more, I tried passing the whole dataframe:
def recipe_namer(s):
    name = []
    if s['Beets'] > 0:
        name.append[s['Beets'].astype(str) +'B, ']
    if s['Carrots'] > 0:
        name.append[s['Carrots'].astype(str) +'C, ']
    if s['Potatoes'] > 0:
        name.append[s['Potatoes'].astype(str) +'P, ']
    return name

soup['Recipe'] = soup.apply(recipe_namer)
However, now I'm seeing KeyError: 'Beets', as if it can't find the column I'm telling it to look at.
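A likely cause: DataFrame.apply defaults to axis=0, so the function is handed one column at a time as a Series indexed by row labels, and 'Beets' is not a row label. Passing axis=1 hands it one row at a time instead. The sketch below assumes that is the intent; it uses a three-row stand-in for soup and collects the pieces in a list (called with parentheses, not square brackets) before joining them.

```python
import pandas as pd

# Three-row stand-in for the full soup frame.
soup = pd.DataFrame({"Beets": [1, 2, 0], "Carrots": [0, 2, 1], "Potatoes": [2, 1, 0]})

CODES = {"Beets": "B", "Carrots": "C", "Potatoes": "P"}


def recipe_namer(row):
    """Build the recipe string for one row (a Series indexed by column name)."""
    parts = []
    for column, code in CODES.items():
        count = int(row[column])  # scalar value, so the comparison below is a plain bool
        if count > 0:
            parts.append(f"{count}{code}")
    return ", ".join(parts)


# axis=1 passes each row to the function; the default axis=0 passes each column.
soup["Recipe"] = soup.apply(recipe_namer, axis=1)
print(soup["Recipe"].tolist())  # ['1B, 2P', '2B, 2C, 1P', '1C']
```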
I'd appreciate any pointers as to how to move forward.
I'm aware that I'll have hanging commas at the end of the string even if this approach works, but that bridge should be relatively easy to cross once I get to it.
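For the hanging commas, one option I can see is Series.str.rstrip, which strips any trailing run of the given characters. A minimal sketch on made-up strings shaped like the ones my approach would produce:

```python
import pandas as pd

# Hypothetical strings built with a trailing ", " after every ingredient.
raw = pd.Series(["1B, 2P, ", "2B, 2C, 1P, ", "1C, "])

# Remove trailing commas and spaces; separators inside the string are untouched.
cleaned = raw.str.rstrip(", ")
print(cleaned.tolist())  # ['1B, 2P', '2B, 2C, 1P', '1C']
```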