Channel: Active questions tagged python - Stack Overflow

Long to Wide in Python Pandas Similar to IBM SPSS Data Restructure Cases into Variables

My dataset contains groups of related cases that I want to restructure so that each group, which includes both numerical and non-numerical data, is represented as a single case (one row) in the resulting data frame. Essentially, I have multiple measurements of the same subject over time, so ID numbers are duplicated in my dataset. The restructured data does not need to be aggregated.

In SPSS, I would use the following code (link to SPSS function):

CASESTOVARS /ID=PropertyLocation /GROUPBY=INDEX.

The result would look similar to the image below, in which the variables associated with each case have been added as new variables with a count included in the variable name. For example, the first observation of case 1 keeps the name 'date', while 'date.1' holds the second observation of case 1, taken at a later time.

[Image: restructured output from SPSS]

In Python, I've tried using df.pivot_table, but it is aggregating my numerical values and not displaying my non-numerical data.

[Image: output of df.pivot_table]

My code in Python is as follows:

reshaped_sales = sales_data_cleansed_2.pivot_table(index="Property Location", columns="sales_index")
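For comparison, here is a minimal sketch of one possible non-aggregating approach using `DataFrame.pivot` in place of `pivot_table`. The toy frame, its `date`, `price`, and `buyer` columns, and all values are hypothetical stand-ins for the real dataset, and the sketch assumes `sales_index` starts at 0 and is unique within each property. The renaming step only imitates the 'date', 'date.1' pattern described above; it is not a reproduction of what SPSS does internally.

```python
import pandas as pd

# Hypothetical toy data standing in for sales_data_cleansed_2.
sales = pd.DataFrame({
    "Property Location": ["1 Main St", "1 Main St", "2 Oak Ave"],
    "sales_index": [0, 1, 0],  # assumed to start at 0 within each property
    "date": ["2001-05-01", "2010-07-15", "2005-03-20"],
    "price": [100000, 150000, 200000],
    "buyer": ["Smith", "Jones", "Lee"],
})

# pivot() only reshapes; it does not aggregate, so text columns are kept.
# It raises an error if an (index, columns) pair occurs more than once.
wide = sales.pivot(index="Property Location", columns="sales_index")

# Put all variables of observation 0 first, then observation 1, and so on.
variables = [c for c in sales.columns
             if c not in ("Property Location", "sales_index")]
ordered = pd.MultiIndex.from_tuples(
    [(var, i) for i in sorted(sales["sales_index"].unique()) for var in variables]
)
wide = wide.reindex(columns=ordered)

# Flatten the two-level column labels into 'date', 'date.1', ...
wide.columns = [var if i == 0 else f"{var}.{i}" for var, i in wide.columns]
wide = wide.reset_index()

print(wide)
```

Properties with fewer observations than the maximum end up with missing values (NaN) in the later columns, which also turns integer columns such as the hypothetical `price.1` into floats.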

The 'sales_index' column is simply a running count of the repeated measures within each case. Not all cases are measured the same number of times.
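For readers who want to reproduce a counter like this, a sketch of one way to build it is `groupby(...).cumcount()`. The toy data below is hypothetical, and sorting by an ISO-formatted `date` string column is an assumption about how the observations should be ordered; this is not necessarily how the original `sales_index` was created.

```python
import pandas as pd

# Hypothetical long-format data with an uneven number of rows per property.
long_df = pd.DataFrame({
    "Property Location": ["1 Main St", "2 Oak Ave", "1 Main St", "1 Main St"],
    "date": ["2001-05-01", "2005-03-20", "2010-07-15", "2018-11-02"],
})

# Sort so the counter follows time order within each property
# (ISO-formatted date strings sort chronologically).
long_df = long_df.sort_values(["Property Location", "date"])

# cumcount() numbers the rows of each group 0, 1, 2, ...
long_df["sales_index"] = long_df.groupby("Property Location").cumcount()

print(long_df)
```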

Here is the sample dataset: https://drive.google.com/file/d/1Q56bVaLCxfY5jF1C0vhVEmpMtyq7dQif/view?usp=sharing

Thanks in advance for your help!

