Please consider the list of dicts below as an example:
d2 = [{'event_id': 't1','display_name': 't1','form_count': 0,'repetition_id': None,'children': [{'event_id': 't_01','display_name': 't(1)','form_count': 1,'repetition_id': 't1','children': [],'forms': [{'form_id': 'f1','form_repetition_id': '1','form_name': 'fff1','is_active': True,'is_submitted': False}]}],'forms': []}, {'event_id': 't2','display_name': 't2','form_count': 0,'repetition_id': None,'children': [{'event_id': 't_02','display_name': 't(2)','form_count': 1,'repetition_id': 't2','children': [{'event_id': 't_03','display_name': 't(3)','form_count': 1,'repetition_id': 't3','children': [],'forms': [{'form_id': 'f3','form_repetition_id': '1','form_name': 'fff3','is_active': True,'is_submitted': False}]}],'forms': [{'form_id': 'f2','form_repetition_id': '1','form_name': 'fff2','is_active': True,'is_submitted': False}]}],'forms': []}]
Above, d2 is a list of dicts, where children is a nested list of dicts with the same keys as the parent. Also, children can be nested to an arbitrary number of levels, which cannot be known upfront. So in short, I don't know how many times to keep exploding it.
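To make the unknown depth concrete, here is a minimal sketch of how the nesting depth could be measured with a recursive helper. The name max_depth and the small sample list are made up for illustration, and the sketch assumes every event dict carries a children list (a missing key is treated as empty).

```python
def max_depth(events):
    """Return the number of nesting levels in a list of event dicts.

    Hypothetical helper: an empty list counts as depth 0, a list of
    events without children as depth 1, and so on.
    """
    if not events:
        return 0
    # One level for this list, plus the deepest level found below it.
    return 1 + max(max_depth(e.get('children', [])) for e in events)


# Small made-up sample with the same parent/children shape as d2.
sample = [
    {'event_id': 'a', 'children': [
        {'event_id': 'b', 'children': [
            {'event_id': 'c', 'children': []},
        ]},
    ]},
    {'event_id': 'd', 'children': []},
]

print(max_depth(sample))  # 3
```

Under the same assumption, calling max_depth(d2) would report how many levels there are to unpack.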
Current df:
In [54]: df11 = pd.DataFrame(d2)

In [55]: df11
Out[55]:
   event_id  display_name  form_count  repetition_id                                           children  forms
0        t1            t1           0           None  [{'event_id': 't_01', 'display_name': 't(1)', ...     []
1        t2            t2           0           None  [{'event_id': 't_02', 'display_name': 't(2)', ...     []
I want to flatten it as shown below.
Expected output:
   event_id  display_name  form_count  repetition_id                                           children                                              forms
0        t1            t1           0           None  {'event_id': 't_01', 'display_name': 't(1)', '...                                                 []
1        t2            t2           0           None  {'event_id': 't_02', 'display_name': 't(2)', '...                                                 []
0      t_01          t(1)           1             t1                                                 []  [{'form_id': 'f1', 'form_repetition_id': '1', ...
1      t_02          t(2)           1             t2  [{'event_id': 't_03', 'display_name': 't(3)', ...  [{'form_id': 'f2', 'form_repetition_id': '1', ...
0      t_03          t(3)           0             t3                                                 []     [{'form_id': 'f2', 'form_repetition_id': '1'}]
How can I find out how many levels of nested children there are?
My attempt:
In [58]: df12 = df11.explode('children')

In [64]: final = pd.concat([df12, pd.json_normalize(df12.children)])

In [72]: final
Out[72]:
   event_id  display_name  form_count  repetition_id                                           children                                              forms
0        t1            t1           0           None  {'event_id': 't_01', 'display_name': 't(1)', '...                                                 []
1        t2            t2           0           None  {'event_id': 't_02', 'display_name': 't(2)', '...                                                 []
0      t_01          t(1)           1             t1                                                 []  [{'form_id': 'f1', 'form_repetition_id': '1', ...
1      t_02          t(2)           1             t2  [{'event_id': 't_03', 'display_name': 't(3)', ...  [{'form_id': 'f2', 'form_repetition_id': '1', ...
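The attempt above only unpacks one level, so t_03 never shows up. As a rough sketch of the level-by-level traversal that seems to be needed, written in plain Python rather than pandas: the name flatten_events and the sample data are hypothetical, and the sketch assumes each event has a children list (a missing key is treated as empty).

```python
from collections import deque


def flatten_events(events):
    """Return every event dict, at any depth, in level order.

    Hypothetical helper: parents come first, then their children,
    then grandchildren, without knowing the depth in advance.
    """
    rows = []
    queue = deque(events)
    while queue:
        event = queue.popleft()
        rows.append(event)  # the row keeps its own 'children' value
        queue.extend(event.get('children', []))
    return rows


# Small made-up sample with the same parent/children shape as d2.
sample = [
    {'event_id': 'a', 'children': [
        {'event_id': 'b', 'children': [
            {'event_id': 'c', 'children': []},
        ]},
    ]},
    {'event_id': 'd', 'children': []},
]

print([row['event_id'] for row in flatten_events(sample)])
# ['a', 'd', 'b', 'c']
```

Passing the returned list to pd.DataFrame should then give one row per event with the original columns, although with a fresh 0..n-1 index rather than the repeated index shown in the expected output.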
Any help will be appreciated.