Each row's places_within_catchment column holds a list of place_ids.
Based on the example below: the first row's places_within_catchment is [2, 3]. Since place_id 3 has the highest sales of those, remove the row where place_id = 2.
Then move straight on to the third row (since the second row has been deleted), whose places_within_catchment is [1, 2, 5]: here, remove the rows where place_id = 2 and place_id = 5.
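To make the rule concrete, here is a rough sketch of one way to read it. It rests on assumptions that are not confirmed: rows are visited in order, rows that were already removed are skipped, the current row never removes itself, and only places that still exist compete within a catchment. The function name `prune_catchments` is hypothetical.

```python
import pandas as pd


def prune_catchments(df):
    """Within each surviving row's catchment, keep only the best-selling place."""
    # Lookup table: place_id -> avg_sales
    sales = df.set_index("place_id")["avg_sales"]
    removed = set()

    for place_id, catchment in zip(df["place_id"], df["places_within_catchment"]):
        if place_id in removed:
            continue  # this row was already deleted by an earlier row

        # Assumption: only places that still exist compete within the catchment.
        candidates = [p for p in catchment if p in sales.index and p not in removed]
        if not candidates:
            continue

        # Keep the catchment place with the highest avg_sales, mark the rest.
        best = max(candidates, key=lambda p: sales[p])
        removed.update(p for p in candidates if p != best)

    # Return a filtered copy; the input DataFrame is left unchanged.
    return df[~df["place_id"].isin(removed)].reset_index(drop=True)


new_df = pd.DataFrame({
    "place_id": list(range(1, 6)),
    "avg_sales": [500.4, 200.4, 600.25, 200.93, 60.1],
    "places_within_catchment": [[2, 3], [1, 3, 4, 5], [1, 2, 5], [1], [1, 3]],
})

result = prune_catchments(new_df)
print(result["place_id"].tolist())
```

On the sample data this reading leaves place_ids 1, 3 and 4. Whether that is the intended result depends on the assumptions listed above.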
How can I do this in a Python function?
I have tried:

    result_df = pd.DataFrame(columns=new_df_copy.columns)

    for index, row in new_df_copy.iterrows():
        max_avg_sales = 0
        max_id = 0

        # Find the id with the highest avg_sales within places_within_catchment
        for place_id in row['places_within_catchment']:
            if place_id in new_df_copy['id'].values:
                avg_sales = new_df_copy.loc[new_df_copy['id'] == place_id, 'avg_sales'].values[0]
                if avg_sales > max_avg_sales:
                    max_avg_sales = avg_sales
                    max_id = place_id

        # Append the row with the maximum avg_sales to the result DataFrame
        result_df = pd.concat([result_df, new_df_copy.loc[new_df_copy['id'] == max_id]], ignore_index=True)

    # Display the result DataFrame
    print(result_df)

This is the code to reproduce the table below:

    import pandas as pd

    # Data for the new DataFrame
    data = {
        'place_id': list(range(1, 6)),
        'avg_sales': [500.4, 200.4, 600.25, 200.93, 60.1],
        'places_within_catchment': [[2, 3], [1, 3, 4, 5], [1, 2, 5], [1], [1, 3]],
    }

    # Create the DataFrame
    new_df = pd.DataFrame(data)

    # Display the result
    print(new_df)

This is more on the expected output and its logic:

