Quantcast
Channel: Active questions tagged python - Stack Overflow
Viewing all articles
Browse latest Browse all 13951

How to identify the root from hierarchical data structure using Pandas

$
0
0

I have a dataframe with the following columns: Parent, Parent_Rev, Child, ChildRev. In this structure, Parent serves as a parent node, while Child functions as a child node. Both Parent_Rev and ChildRev capture different revisions of their respective nodes. A Child may appear in the parent column and have it's own child values, indicated in columns Child and Child_Rev.

This hierarchical chain continues until a path is established where a Parent doesn't appear as a Child to any other values.

To generate the desired output, it's necessary to traverse through all the values, identifying all possible levels and combinations until the top node is reached. When checking for possible links, the combination of Node+Rev should be considered unique, rather than just the Node value alone.

  • Sample Input Dataframe:
ParentParent_RevChildChild_REV
A11B11
B11C11
C11D11
D11E11
A21B21
B21C21
C21C21
A31B33
A41H43
    import pandas as pd    df1 = pd.DataFrame({'Node1': ['A1','B1','C1','D1','A2','B2','C2','A3','A4'],'Node1_Rev': ['1','1','1','1','1','1','1','1','1'],'Node2': ['B1','C1','D1','E1','B2','C2','C2','B3','H4'],'Node2_Rev': ['1','1', '1','1','1','1','1','3','3']    }    )- **Sample Output Dataframe:**|Root|Root_Rev| Parent | Parent_Rev | Child | Child_REV || --- | --- | ---| --- | --- |--- || A1 | 1 || A1 | 1 | B1 | 1 | | A1 | 1 || B1 | 1 | C1 | 1 || A1 | 1 || C1 | 1 | D1 | 1 || A1 | 1 || D1 | 1 | E1 | 1 || A2 | 1 || A2 | 1 | B2 | 1 || A2 | 1 || B2 | 1 | C2 | 1 || A2 | 1 || C2 | 1 | C2 | 1 || A3 | 1 || A3 | 1 | B3 | 3 || A4 | 1 || A4 | 1 | H4 | 3 |What are the efficient ways to generate the output for a bigger dataset to update the root and rev for all the parent_child input combinations?

Viewing all articles
Browse latest Browse all 13951

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>