I have groups of related cases in a dataset that I want to restructure so that each group, which contains both numerical and non-numerical data, is represented as a single case (row) in the resulting data frame. Essentially, I have multiple measurements of the same subject over time, so there are duplicate ID numbers in my dataset. The restructured data does not need to be aggregated.
In SPSS, I would use the following code (link to SPSS function):
CASESTOVARS /ID=PropertyLocation /GROUPBY=INDEX.
The result would look similar to the image below, in which the variables associated with each case have been added as new variables with a count included in the variable name (for example, the 'date' variable for the first observation of case 1 is 'date', and 'date.1' is used for the second observation of case 1, performed at a later time).
In Python, I've tried using df.pivot_table, but it aggregates my numerical values and does not display my non-numerical data.
My code in Python is as follows:
reshaped_sales = sales_data_cleansed_2.pivot_table(index="Property Location", columns="sales_index")
The 'sales_index' column is simply a running count of the repeated measures within each case. Not all cases are measured the same number of times.
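For reference, here is a minimal sketch of the reshape I am after, written against a small made-up frame rather than my real data. The column names 'date', 'price', and 'type' and the values are hypothetical stand-ins. It assumes that set_index followed by unstack is an acceptable substitute for pivot_table: pivot_table aggregates (its default aggfunc is the mean), whereas unstack only moves values, so non-numerical columns should survive as long as each (Property Location, sales_index) pair is unique.

```python
import pandas as pd

# Hypothetical toy data standing in for sales_data_cleansed_2
sales = pd.DataFrame({
    "Property Location": ["1 MAIN ST", "1 MAIN ST", "2 OAK AVE"],
    "date": ["2001-05-01", "2009-08-15", "2005-03-10"],
    "price": [100000, 150000, 90000],
    "type": ["Residential", "Residential", "Commercial"],
})

# Running count of repeated measures within each case, starting at 0
sales["sales_index"] = sales.groupby("Property Location").cumcount()

# Move the counter into the columns; unstack reshapes without aggregating,
# and cases with fewer measurements are padded with NaN
wide = sales.set_index(["Property Location", "sales_index"]).unstack("sales_index")

# Flatten the (variable, counter) column pairs into SPSS-like names:
# 'date' for the first observation, 'date.1' for the second, and so on
wide.columns = [
    name if idx == 0 else f"{name}.{idx}" for name, idx in wide.columns
]
wide = wide.reset_index()

print(wide)
```

In this sketch the columns come out grouped by variable ('date', 'date.1', 'price', 'price.1', ...), not by observation, so they would still need reordering to match the SPSS /GROUPBY=INDEX layout.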
Here is the sample dataset: https://drive.google.com/file/d/1Q56bVaLCxfY5jF1C0vhVEmpMtyq7dQif/view?usp=sharing
Thanks in advance for your help!