I have two dataframes, each describing a different characteristic over a date range. I want to join them so that the result depends on the intersection of those date ranges.
df1:
| | Key | in_date | out_date | char1 |
|---|---|---|---|---|
| 0 | 1000 | 01/01/2020 | 01/10/2020 | A |
| 1 | 1000 | 02/10/2020 | 10/12/2020 | B |
| 2 | 2000 | 01/01/2019 | 10/01/2019 | C |
| 3 | 2000 | 11/01/2019 | 01/10/2022 | D |
| 4 | 2000 | 02/10/2022 | 01/01/2023 | B |
| 5 | 2000 | 02/01/2023 | 31/12/2030 | L |
df2:
| | Key | st_date | end_date | char2 |
|---|---|---|---|---|
| 0 | 1000 | 01/05/2019 | 01/09/2020 | G |
| 1 | 1000 | 02/09/2020 | 10/11/2020 | GG |
| 2 | 2000 | 20/01/2019 | 15/02/2019 | K |
| 3 | 2000 | 16/02/2019 | 10/01/2022 | GE |
| 4 | 2000 | 11/01/2022 | 31/12/2030 | GG |
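For reproducibility, the two tables above can be built roughly as follows. This is only a sketch: it assumes the dates are day-first (dd/mm/yyyy) and that the leading 0, 1, 2, ... column is just the default index.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Key": [1000, 1000, 2000, 2000, 2000, 2000],
    "in_date": ["01/01/2020", "02/10/2020", "01/01/2019",
                "11/01/2019", "02/10/2022", "02/01/2023"],
    "out_date": ["01/10/2020", "10/12/2020", "10/01/2019",
                 "01/10/2022", "01/01/2023", "31/12/2030"],
    "char1": ["A", "B", "C", "D", "B", "L"],
})

df2 = pd.DataFrame({
    "Key": [1000, 1000, 2000, 2000, 2000],
    "st_date": ["01/05/2019", "02/09/2020", "20/01/2019",
                "16/02/2019", "11/01/2022"],
    "end_date": ["01/09/2020", "10/11/2020", "15/02/2019",
                 "10/01/2022", "31/12/2030"],
    "char2": ["G", "GG", "K", "GE", "GG"],
})

# Assumption: dates are day-first (dd/mm/yyyy), so parse them explicitly
# instead of relying on pandas' format inference.
for frame, cols in ((df1, ["in_date", "out_date"]),
                    (df2, ["st_date", "end_date"])):
    for col in cols:
        frame[col] = pd.to_datetime(frame[col], format="%d/%m/%Y")
```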
Desired result:
| | key | start | end | char1 | char2 |
|---|---|---|---|---|---|
| 0 | 1000 | 01/05/2019 | 31/12/2019 | nan | G |
| 1 | 1000 | 01/01/2020 | 31/08/2020 | A | G |
| 2 | 1000 | 01/09/2020 | 01/10/2020 | A | GG |
| 3 | 1000 | 02/10/2020 | 10/11/2020 | A | GG |
| 4 | 1000 | 11/11/2020 | 10/12/2020 | B | nan |
| 5 | 2000 | 01/01/2019 | 10/01/2019 | C | nan |
| 6 | 2000 | 11/01/2019 | 19/01/2019 | D | nan |
| 7 | 2000 | 20/01/2019 | 15/02/2019 | D | K |
| 8 | 2000 | 16/02/2019 | 10/01/2022 | D | GE |
| 9 | 2000 | 11/01/2022 | 01/10/2022 | D | GG |
| 10 | 2000 | 02/10/2022 | 01/01/2023 | B | GG |
| 11 | 2000 | 02/01/2023 | 31/12/2030 | L | GG |
I tried using chains of if/else conditions, but it didn't work when I tried to aggregate the dataframe.
I also tried pd.merge, but the result was a sparse matrix.
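One possible way to express the intended split, as a minimal sketch rather than a tested solution: for each key, collect every start date and every "day after an end date" as breakpoints, cut the timeline into segments between consecutive breakpoints, and look up which `char1` and `char2` interval covers each segment. The helper names `lookup` and `split_intervals` are hypothetical, and the sketch assumes closed, day-granular intervals that do not overlap within a single dataframe. Only the rows for key 1000 are used here to keep it short.

```python
import pandas as pd

ONE_DAY = pd.Timedelta(days=1)

def lookup(rows, start_col, end_col, value_col, seg_start, seg_end):
    """Return the value whose closed interval covers the segment, else None."""
    hit = rows[(rows[start_col] <= seg_start) & (rows[end_col] >= seg_end)]
    return hit[value_col].iloc[0] if len(hit) else None

def split_intervals(df1, df2):
    """Hypothetical helper: split each key's timeline at every boundary."""
    out = []
    for key in sorted(set(df1["Key"]) | set(df2["Key"])):
        a = df1[df1["Key"] == key]
        b = df2[df2["Key"] == key]
        # Breakpoints: every start date, plus the day after every end date.
        points = sorted(
            set(a["in_date"]) | set(a["out_date"] + ONE_DAY)
            | set(b["st_date"]) | set(b["end_date"] + ONE_DAY)
        )
        for seg_start, next_start in zip(points, points[1:]):
            seg_end = next_start - ONE_DAY
            c1 = lookup(a, "in_date", "out_date", "char1", seg_start, seg_end)
            c2 = lookup(b, "st_date", "end_date", "char2", seg_start, seg_end)
            # Skip gaps that neither dataframe covers.
            if c1 is not None or c2 is not None:
                out.append((key, seg_start, seg_end, c1, c2))
    return pd.DataFrame(out, columns=["key", "start", "end", "char1", "char2"])

def d(text):
    # Assumption: dates are day-first (dd/mm/yyyy).
    return pd.to_datetime(text, format="%d/%m/%Y")

demo1 = pd.DataFrame({
    "Key": [1000, 1000],
    "in_date": [d("01/01/2020"), d("02/10/2020")],
    "out_date": [d("01/10/2020"), d("10/12/2020")],
    "char1": ["A", "B"],
})
demo2 = pd.DataFrame({
    "Key": [1000, 1000],
    "st_date": [d("01/05/2019"), d("02/09/2020")],
    "end_date": [d("01/09/2020"), d("10/11/2020")],
    "char2": ["G", "GG"],
})

res = split_intervals(demo1, demo2)
print(res)
```

For key 1000 this yields five segments. The sketch follows the df1/df2 boundaries literally, so a boundary or label may differ slightly from a hand-written expected table. It loops in Python per key, which should be fine for small data but may be slow for large frames.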