I have multiple basic dataframes that I need to merge using multiple fields.
I've seen posts on merging multiple dataframes, but not on multiple fields.
I would appreciate any help, as I'm totally stuck. Eventually I will have to merge 30 dataframes, so I'm trying to come up with a better solution than merging them individually.
First I use a function to rename the fields ... But the final dataframe (reduce_output) only has 2 reports merged, and they are duplicated multiple times. Plus, all the columns have the same name, except for one.
    from functools import reduce

    import pandas as pd

    def merge_helper(df1, df2):
        name = df2['Report Name'][0]
        print(f"Rpt Name of DF2{name}")
        name1 = df1['Report Name'][0]
        print(f"Rpt Name of DF1{name1}")
        out = pd.merge(df1, df2,
                       on=('Date', 'Entity ID', 'Field', 'Price Type',
                           'Return Type', 'Linkage', 'Benchmark #'),
                       how='outer', suffixes=('', '_remove'))
        #print('check1', out.columns)
        try:
            out.rename(columns={'Data_Value': f'Data_Value{name1}'}, inplace=True)
        except:
            print('no_such_column')
        out.rename(columns={'Data_Value_remove': f'Data_Value{name}'}, inplace=True)
        out.drop([i for i in out.columns if 'remove' in i], axis=1, inplace=True)
        #print('check2', out.columns)
        print(out.columns)
        return out

    reduce_output = reduce(lambda x, y: merge_helper(x, y),
                           [reports[0], reports[1], reports[2], reports[3], reports[4]])
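One direction that might avoid the renaming problem, as a rough sketch rather than a verified fix: give each report's `Data_Value` column its final name before merging, so the merge step never has to guess which column came from which report. The helper names `prepare` and `merge_reports` and the toy data are hypothetical. The sketch assumes every report has a distinct `Report Name` and at most one row per key combination (duplicate keys would multiply rows in an outer merge, so they are dropped here).

```python
from functools import reduce

import pandas as pd

# Key columns taken from the merge call in the question.
KEYS = ['Date', 'Entity ID', 'Field', 'Price Type',
        'Return Type', 'Linkage', 'Benchmark #']


def prepare(df):
    """Keep the keys plus one value column named after the report (hypothetical helper)."""
    name = df['Report Name'].iloc[0]
    return (df[KEYS + ['Data_Value']]
            .drop_duplicates(subset=KEYS)  # assumption: one row per key per report
            .rename(columns={'Data_Value': f'Data_Value_{name}'}))


def merge_reports(reports):
    """Outer-merge any number of reports on the shared key columns."""
    prepared = [prepare(df) for df in reports]
    return reduce(lambda left, right: pd.merge(left, right, on=KEYS, how='outer'),
                  prepared)


def make_report(name, values):
    """Build a toy report; the data here is invented purely for illustration."""
    return pd.DataFrame({
        'Report Name': [name] * len(values),
        'Date': ['2020-01-01', '2020-01-02'][:len(values)],
        'Entity ID': 1, 'Field': 'F', 'Price Type': 'P',
        'Return Type': 'R', 'Linkage': 'L', 'Benchmark #': 0,
        'Data_Value': values,
    })


toy_reports = [make_report('A', [1, 2]),
               make_report('B', [3, 4]),
               make_report('C', [5, 6])]
merged = merge_reports(toy_reports)
print(merged.columns.tolist())
```

Because the value columns are already unique when the merge happens, no suffixes are needed, and the same call should work for a list of 30 reports as well as for 5.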