I'm trying to set up an indicator for when a new application causes the rejection of an old application.
If any rejected_time within a personal_id occurs within 5 minutes after the creation_timestamp of a newer application for the same personal_id, that row has been rejected because of the new application. Based on this, I want to create the column "new_application_causes_rejection" as shown in the example below.
There are hundreds of thousands of personal IDs, most with multiple application IDs, and the number of rows per application ID varies.
personal_id | application_id | creation_timestamp | approved_amount | rejected_time | new_application_causes_rejection |
---|---|---|---|---|---|
5a | 694f | 2023-01-24 13:01:07.939534 | 8000.0 | 2023-01-24 13:13:15.499000 | 0 |
5a | 694f | 2023-01-24 13:01:07.939534 | 8000.0 | 2023-01-24 14:38:02.359000 | 1 |
5a | 694f | 2023-01-24 13:01:07.939534 | 8000.0 | 2023-01-24 14:37:18.616000 | 1 |
5a | 694f | 2023-01-24 13:01:07.939534 | NaN | 2023-01-24 13:03:59.626000 | 0 |
5a | 43fa | 2023-01-24 14:36:08.287521 | NaN | 2023-01-24 14:37:22.096000 | 0 |
5a | 43fa | 2023-01-24 14:36:08.287521 | 13000.0 | 2023-01-24 14:39:31.750000 | 1 |
5a | 43fa | 2023-01-24 14:36:08.287521 | 13000.0 | 2023-02-02 08:42:26.980106 | 1 |
5a | 43fa | 2023-01-24 14:36:08.287521 | NaN | 2023-01-24 14:37:22.948214 | 0 |
5a | a4b6 | 2023-01-24 14:38:42.625969 | 5000.0 | 2023-02-02 08:42:26.980106 | 0 |
5a | a4b7 | 2023-01-24 14:38:42.625969 | NaN | 2023-01-24 14:38:46.922000 | 0 |
5a | a4b8 | 2023-01-24 14:38:42.625969 | 8000.0 | 2023-02-02 08:42:26.980106 | 0 |
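
To pin down the rule, here is a minimal sketch in plain Python. It is not a definitive implementation and rests on a few assumptions: a "new" application is one whose creation_timestamp is strictly later than the row's own, the 5-minute bound is inclusive, and each row is a (personal_id, application_id, creation_timestamp, rejected_time) tuple. The names `flag_rejections`, `WINDOW` and `ts` are made up for illustration. For each row it looks up the latest creation timestamp of the same person at or before the rejection time, which is enough to decide the flag.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # assumption: the upper bound is inclusive


def flag_rejections(rows, window=WINDOW):
    """Return one 0/1 flag per row, in input order.

    Each row is (personal_id, application_id, creation_timestamp,
    rejected_time); rejected_time may be None.
    """
    # Sorted, de-duplicated creation timestamps for each person.
    by_person = defaultdict(set)
    for pid, _app, created, _rejected in rows:
        by_person[pid].add(created)
    creations = {pid: sorted(stamps) for pid, stamps in by_person.items()}

    flags = []
    for pid, _app, created, rejected in rows:
        flag = 0
        if rejected is not None:
            stamps = creations[pid]
            # Index of the latest creation at or before the rejection time.
            i = bisect_right(stamps, rejected) - 1
            if i >= 0:
                latest = stamps[i]
                # Assumption: only a strictly later creation counts as a
                # newer application, so a row's own creation never matches.
                if latest > created and rejected - latest <= window:
                    flag = 1
        flags.append(flag)
    return flags


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS.ffffff' string."""
    return datetime.fromisoformat(text)


# A few rows taken from the example table.
rows = [
    ("5a", "694f", ts("2023-01-24 13:01:07.939534"), ts("2023-01-24 13:13:15.499000")),
    ("5a", "694f", ts("2023-01-24 13:01:07.939534"), ts("2023-01-24 14:38:02.359000")),
    ("5a", "43fa", ts("2023-01-24 14:36:08.287521"), ts("2023-01-24 14:37:22.096000")),
    ("5a", "43fa", ts("2023-01-24 14:36:08.287521"), ts("2023-01-24 14:39:31.750000")),
    ("5a", "a4b7", ts("2023-01-24 14:38:42.625969"), ts("2023-01-24 14:38:46.922000")),
]
print(flag_rejections(rows))  # [0, 1, 0, 1, 0]
```

Under this reading, the 43fa row rejected on 2023-02-02 would come out as 0 rather than the 1 shown in the table, so the intended rule may include a condition not captured here.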