Suppose I have a function that compares rows in a dataframe:

    import pandas

    def comp(lhs: pandas.Series, rhs: pandas.Series) -> bool:
        # Rows match if they share an id ...
        if lhs.id == rhs.id:
            return True
        # ... or if both values agree within a tolerance.
        if abs(lhs.val1 - rhs.val1) < 1e-8:
            if abs(lhs.val2 - rhs.val2) < 1e-8:
                return True
        return False

Now I have a dataframe containing id, val1 and val2 columns, and I want to generate group ids such that any two rows for which comp evaluates to True get the same group number. How do I do this with pandas? I've been trying to get groupby to do this but can't see how.
MRE:

    example_input = pandas.DataFrame({
        'id': [0, 1, 2, 2, 3],
        'val1': [1.1, 1.2, 1.3, 1.4, 1.1],
        'val2': [2.1, 2.2, 2.3, 2.4, 2.1],
    })

    example_output = example_input.copy()
    example_output.index = [0, 1, 2, 2, 0]
    example_output.index.name = 'groups'
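
To pin down the intended semantics, here is a slow brute-force reference, not a proposed solution. It rests on two assumptions that the question does not state outright: groups are the transitive closure of comp (if A matches B and B matches C, all three share a group), and each group is labelled by the smallest row position it contains, which is what the expected index above suggests. The helper name brute_force_groups is made up for illustration, and the columns are named val1 and val2 so that they match comp.

```python
import pandas


def comp(lhs: pandas.Series, rhs: pandas.Series) -> bool:
    # Same predicate as in the question.
    if lhs.id == rhs.id:
        return True
    if abs(lhs.val1 - rhs.val1) < 1e-8:
        if abs(lhs.val2 - rhs.val2) < 1e-8:
            return True
    return False


def brute_force_groups(df: pandas.DataFrame, predicate) -> list:
    """Hypothetical helper: O(n^2) pairwise comparison plus union-find."""
    n = len(df)
    parent = list(range(n))  # every row starts in its own group

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    rows = [df.iloc[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if predicate(rows[i], rows[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Assumption: keep the smaller position as the root,
                    # so each group is labelled by its first row.
                    parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


example_input = pandas.DataFrame({
    'id': [0, 1, 2, 2, 3],
    'val1': [1.1, 1.2, 1.3, 1.4, 1.1],
    'val2': [2.1, 2.2, 2.3, 2.4, 2.1],
})

groups = brute_force_groups(example_input, comp)
print(groups)  # [0, 1, 2, 2, 0]
```

This reproduces the expected index for the example, but it calls comp on every pair of rows, so it is only useful as a correctness check on small inputs.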