I have a dataset that looks like this:

    StID  SubClassID  PhOrder  PhAmt  LbOrder  LbAmt
    4326  200572288   Anti1    23.3   Asprin   13.7
    4326  200572288   Anti2    39.3   Morphin  2.2
    4326  200572288   NULL     NULL   Medine   30.5
    4326  200572288   Anti3    13.5   Kabomin  20.3
    4326  200572288   NULL     NULL   Zorifin  0.2
    0993  200348299   Anti1    8.4    Zorifin  9.7
    0993  200348299   Anti2    10.9   Zorifin  95.6
    0993  200348299   Anti4    48.9   NULL     NULL
I want to flatten this table using one-hot encoding, with one row per StID/SubClassID pair and one column per order name,
so it looks like this:

    StID  SubClassID  Anti1  Anti2  Anti3  Anti4  Asprin  Morphin  Medine  Kabomin  Zorifin
    4326  200572288   23.3   39.3   13.5   NULL   13.7    2.2      30.5    20.3     0.2
    0993  200348299   8.4    10.9   NULL   48.9   NULL    NULL     NULL    NULL     52.65
Here is the code I tried, but I get many duplicated columns with wrong values:
    import pandas as pd

    data = {
        'StId': [4326, 4326, 4326, 4326, 4326, 993, 993, 993],
        'SubClassID': [200572288, 200572288, 200572288, 200572288, 200572288, 200348299, 200348299, 200348299],
        'PhOrder': ['Anti1', 'Anti2', 'NULL', 'Anti3', 'NULL', 'Anti1', 'Anti2', 'Anti4'],
        'PhAmt': [23.3, 39.3, None, 13.5, None, 8.4, 10.9, 48.9],
        'LbOrder': ['Asprin', 'Morphin', 'Medine', 'Kabomin', 'Zorifin', 'Zorifin', 'Zorifin', None],
        'LbAmt': [13.7, 2.2, 30.5, 20.3, 0.2, 9.7, 95.6, None],
    }
    df = pd.DataFrame(data)

    # Attempt: pivot on both order columns at once
    df_pivot = df.pivot_table(
        index=['StId', 'SubClassID'],
        columns=['PhOrder', 'LbOrder'],
        values=['PhAmt', 'LbAmt'],
        aggfunc='first',
    )
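
For comparison, here is a minimal sketch of one approach that appears to give the layout I am after. It pivots each order/amount pair separately and then joins the two results side by side, instead of passing both order columns to a single `pivot_table` call (which creates one column per PhOrder/LbOrder combination). It assumes that repeated orders within a group should be averaged, since the 52.65 for Zorifin in the expected output is the mean of 9.7 and 95.6. The helper name `pivot_pair` is just something I made up for illustration, and the resulting columns come out sorted alphabetically within each pair rather than in the exact order shown above.

```python
import pandas as pd

data = {
    'StId': [4326, 4326, 4326, 4326, 4326, 993, 993, 993],
    'SubClassID': [200572288, 200572288, 200572288, 200572288, 200572288, 200348299, 200348299, 200348299],
    'PhOrder': ['Anti1', 'Anti2', 'NULL', 'Anti3', 'NULL', 'Anti1', 'Anti2', 'Anti4'],
    'PhAmt': [23.3, 39.3, None, 13.5, None, 8.4, 10.9, 48.9],
    'LbOrder': ['Asprin', 'Morphin', 'Medine', 'Kabomin', 'Zorifin', 'Zorifin', 'Zorifin', None],
    'LbAmt': [13.7, 2.2, 30.5, 20.3, 0.2, 9.7, 95.6, None],
}
df = pd.DataFrame(data)

keys = ['StId', 'SubClassID']


def pivot_pair(frame, order_col, amt_col):
    """Hypothetical helper: spread one order/amount pair into wide columns."""
    # Keep only rows that actually carry an amount; this also discards the
    # 'NULL' / None placeholder orders, whose amounts are missing.
    present = frame.dropna(subset=[amt_col])
    # Assumption: duplicate orders within a group are averaged.
    return present.pivot_table(index=keys, columns=order_col,
                               values=amt_col, aggfunc='mean')


ph_wide = pivot_pair(df, 'PhOrder', 'PhAmt')
lb_wide = pivot_pair(df, 'LbOrder', 'LbAmt')

# Align the two wide tables on (StId, SubClassID) and restore the keys as columns.
wide = pd.concat([ph_wide, lb_wide], axis=1).reset_index()
print(wide)
```

If a different rule for duplicates is wanted (for example the sum or the first value), only the `aggfunc` argument should need to change.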