Apologies as I'm quite inexperienced with Python and looking for assistance here 🙏
I am trying to flatten a list of dicts into new columns in an existing dataframe, with using key as column names and values as cell content. Now json normalize is successfull in unwrapping the data, but fails to place dictvalue1 as column header, and dictvalue2 as cell content. In the examples I saw in other posts, it did work but somehow not for me 🥲
The column originally looks like this :
View of the column (1D array) before unwrapping
What I obtain is something like this :
View after unwrapping with json_normalize
Obviously what I'd like to achieve is something like this :
I have tried pd.DataFrame(form_records(df["fields"]), and also tried pd.json_normalize(df["fields"]) which both came to the same output. The returned dataframe has a dict in each cell which is not what I am trying to get. What I'd like is "id" as column name, and "value" as cell content for every new column created.
Eventually I tried after json_normalize to extract the "ids" to rename my columns, like so :
#extracting row 1my_list = fields_values.iloc[:1].values.flatten().tolist()# getting all values under the "id" key ids = [d['id'] for d in my_list]#extracting all current column names col_names = fields_values.columns.tolist()#zipping the extract "ids" with the current col names new_col_names = dict(zip(col_names, ids))#renaming the columns with the ids fields_values.rename(columns=new_col_names, inplace=True)
This eventually gave me the desired col names, but I still have the cell content with the dicts instead of only the value under "value" ... So I'm not sure if this is the best practice :
correct col names but incorrect cell content
Does any one has any advice on this ? IS there something I'm missing in the json_normalize function arguments, or shall I continue with extracting per row/col the dict values and replace in each cell (I find this to be such a pain!) ? Isn't there a more efficient way to do this ?
Thank you 🙏