Channel: Active questions tagged python - Stack Overflow

How can I append a variable number of blank rows to a dataframe by group?


I have a dataframe that looks like the following, with many person IDs (more than 1000 persons), a varying number of transactions per person, and many variables (more than 1000 variables):

Person_ID  transaction_ID  variable_1  variable_2  variable_3  variable_X
person1    transaction1    123         0           1           abc
person1    transaction2    456         1           0           def
person1    transaction3    123         0           1           abc
personx    transaction1    123         0           1           abc
personx    transaction2    456         0           1           def

I want to pad it with rows containing -10 at the beginning of every person id group so that the total number of rows per person id group is 6, like the following:

Person_ID  transaction_ID  variable_1  variable_2  variable_3  variable_X
person1    -10             -10         -10         -10         -10
person1    -10             -10         -10         -10         -10
person1    -10             -10         -10         -10         -10
person1    transaction1    123         0           1           abc
person1    transaction2    456         1           0           def
person1    transaction3    123         0           1           abc
personx    -10             -10         -10         -10         -10
personx    -10             -10         -10         -10         -10
personx    -10             -10         -10         -10         -10
personx    -10             -10         -10         -10         -10
personx    transaction1    123         0           1           abc
personx    transaction2    456         0           1           def

Here is the code I tried (updated with concat) and the error below it.

df2 = pd.DataFrame([[''] * len(newdf.columns)], columns=newdf.columns)
df2

for row in newdf.groupby('person_id')['transaction_id']:
    x=newdf.groupby('person_id')['person_id'].nunique()
    if x.any() < 6:
        newdf=pd.concat([newdf, df2*(6-x)], ignore_index=True)

RuntimeWarning: '<' not supported between instances of 'int' and 'tuple', sort order is undefined for incomparable objects.
  newdf=pd.concat([newdf, df2*(6-x)], ignore_index=True)

It appended several NaN rows to the bottom of the dataframe, but not in between the groups as needed. Thank you in advance, as I am a beginner.
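
For reference, one possible way to get the padded layout is sketched below. It is only a sketch under some assumptions: the key column is taken to be `Person_ID` as in the table header (the attempted code uses `person_id`), the helper name `pad_groups` is made up for illustration, and groups that already have 6 or more rows are left unchanged. The idea is to build the filler rows per group and concatenate them in front of that group, instead of appending to the end of the whole dataframe, which is why the rows in the attempt above ended up at the bottom.

```python
import pandas as pd

TARGET_ROWS = 6   # desired rows per person (from the question)
PAD_VALUE = -10   # filler value (from the question)


def pad_groups(df, key="Person_ID", target=TARGET_ROWS, fill=PAD_VALUE):
    """Prepend filler rows so each `key` group has at least `target` rows.

    `pad_groups` is a hypothetical helper name, not a pandas function.
    """
    pieces = []
    # sort=False keeps the groups in order of first appearance
    for person, group in df.groupby(key, sort=False):
        n_missing = target - len(group)
        if n_missing > 0:
            # Every column holds `fill`, then the key column is restored
            filler = pd.DataFrame(fill, index=range(n_missing), columns=df.columns)
            filler[key] = person
            pieces.append(filler)
        pieces.append(group)
    return pd.concat(pieces, ignore_index=True)


# Small example mirroring the rows shown in the question
newdf = pd.DataFrame({
    "Person_ID": ["person1"] * 3 + ["personx"] * 2,
    "transaction_ID": ["transaction1", "transaction2", "transaction3",
                       "transaction1", "transaction2"],
    "variable_1": [123, 456, 123, 123, 456],
    "variable_2": [0, 1, 0, 0, 0],
    "variable_3": [1, 0, 1, 1, 1],
    "variable_X": ["abc", "def", "abc", "abc", "def"],
})

padded = pad_groups(newdf)
print(padded)
```

With more than 1000 persons the Python-level loop may be slow. Building all filler rows in one step (for example from `6 - df.groupby("Person_ID").size()`) and sorting afterwards is probably faster, but that variant is not shown here.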

