I have the following data:

    import pandas as pd, numpy as np

    data = [['id', 'date', 'b_field', 'a_field'],
            ['XYX', '01/01/2024', 100, 101],
            ['XYX', '01/02/2024', 200, 201],
            ['ABC', '01/01/2024', 300, 301],
            ['ABC', '01/02/2024', 400, 401]]
    id_sort = ['ABC', 'XYZ']
    field_sort = ['a_field', 'b_field']
    df = pd.DataFrame(data[1:], columns=data[:1])

[![enter image description here][1]][1]

I would like to transform this so that, if I were to serialize it to a CSV or JSON array, it would look like this:
[![enter image description here][2]][2]
where:
- the ID is grouped at the top (not repeated for each column)
- the columns are sorted by ID and then by field
- the DATE label is on the same header row as the field names
So far I've tried a few things. The closest I have come is the following:

    df.set_index('date').reindex(pd.MultiIndex.from_product([id_sort, field_sort]), axis=1).reset_index()

Of course, as you can see, there are several issues:
- no values
- the 'date' label is at the top level, not the second
- the dates look like tuples, and I'm not sure why
[![enter image description here][3]][3]
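
The three issues above seem to share two causes, though this is a guess rather than something confirmed: `columns=data[:1]` passes a list containing a list, which likely builds a one-level MultiIndex header (hence the tuple-like labels), and `reindex` only relabels columns, so labels that do not already exist come back empty instead of being pivoted out of the `id` column. Below is a minimal sketch of one alternative. It assumes the header is taken from `data[0]` and that `'XYX'` in the sample data was meant to be `'XYZ'`; both are assumptions, and the names `flat`, `wide`, `target`, and `out` are made up for illustration.

```python
import pandas as pd

# Sample data from above. Assumption: 'XYX' was a typo for 'XYZ',
# so that the ids match id_sort.
data = [
    ['id', 'date', 'b_field', 'a_field'],
    ['XYZ', '01/01/2024', 100, 101],
    ['XYZ', '01/02/2024', 200, 201],
    ['ABC', '01/01/2024', 300, 301],
    ['ABC', '01/02/2024', 400, 401],
]
id_sort = ['ABC', 'XYZ']
field_sort = ['a_field', 'b_field']

# Assumption: use data[0] (a flat list) as the header instead of data[:1].
flat = pd.DataFrame(data[1:], columns=data[0])

# Pivot so each (field, id) pair becomes its own column, indexed by date.
wide = flat.pivot(index='date', columns='id', values=field_sort)

# Put the id on the top header level and the field name on the second.
wide = wide.swaplevel(axis=1)

# Order the columns by id first, then by field.
target = pd.MultiIndex.from_product([id_sort, field_sort])
wide = wide.reindex(columns=target)

# Move 'date' back into the columns, labelled on the second header level.
out = wide.reset_index(col_level=1, col_fill='')

print(out)
```

Writing `out.to_csv(index=False)` should then emit one header row per column level, with `date` on the second row, but this has not been checked against the target image.
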
EDIT: For #2 (shifting the date label down), it looks like I can do this (not sure if it's the best way):

    df = df.rename(columns={'date': '', '': 'date'})

  [1]: https://i.sstatic.net/fmdYCJ6t.png
  [2]: https://i.sstatic.net/fzlAY1W6.png
  [3]: https://i.sstatic.net/z9ZNx5nm.png