Channel: Active questions tagged python - Stack Overflow

Transpose all rows in one column of dataframe to multiple columns based on certain conditions


I would like to convert one column of data into multiple columns of a dataframe, based on the values it contains.

The code below generates the input dataframe:

import pandas as pd
df1 = pd.DataFrame({'VARIABLE':['studyid',1,'age_interview', 65,'Gender','1.Male','2.Female','Ethnicity','1.Chinese','2.Indian','3.Malay']})

The data looks like this:

[image: input dataframe]

Please note that I may not know the column names in advance, but the data usually follows this format. What I have shown above is sample data; the real data might have around 600-700 columns arranged in the same fashion.

What I would like to do is turn every value that starts with a non-digit character into a new column of the dataframe. The result can be a new dataframe.

I attempted to write the for loop below, but it fails with an error. Can you please help me achieve this outcome?

for i in range(3,len(df1)):
    #str(df1['VARIABLE'][i].contains('^\d'))
    if (df1['VARIABLE'][i].astype(str).contains('^\d') == True):

With the above loop, I was trying to check whether the first character of each entry is a digit. If it is, the entry should be retained as a value (e.g. 1, 2, 3); if it is a letter (e.g. Gender, Ethnicity), a new column should be created. But I suspect this is an incorrect and lengthy approach.
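A minimal sketch of that digit check done on the whole column at once, assuming pandas is imported as pd. The loop fails because df1['VARIABLE'][i] is a single Python object (an int or a str), which has no .astype or .contains method; both belong to a pandas Series. The names as_text and starts_with_digit are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'VARIABLE': ['studyid', 1, 'age_interview', 65, 'Gender',
                                 '1.Male', '2.Female', 'Ethnicity',
                                 '1.Chinese', '2.Indian', '3.Malay']})

# Cast the whole column to text first, so that ints such as 1 and 65
# can be tested with string methods as well.
as_text = df1['VARIABLE'].astype(str)

# True where the entry starts with a digit (a value),
# False where it starts with anything else (a column name).
starts_with_digit = as_text.str.contains(r'^\d')

print(starts_with_digit.tolist())
```

This avoids the explicit loop entirely; the boolean Series can then be used as a mask to separate names from values.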

For the sample data above, the columns would be studyid, age_interview, Gender and Ethnicity.

The final output would look like this:

[image: expected output dataframe]

Can you please let me know if there is an elegant approach to do this?
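One possible approach, sketched under some assumptions: every entry that does not start with a digit is a column name, the entries that follow it are its values, and column names are unique. The layout chosen here (values stacked under their name, shorter columns padded with NaN) is a guess at the intended output, and the helper names text, is_name, values and result are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'VARIABLE': ['studyid', 1, 'age_interview', 65, 'Gender',
                                 '1.Male', '2.Female', 'Ethnicity',
                                 '1.Chinese', '2.Indian', '3.Malay']})

text = df1['VARIABLE'].astype(str)

# Entries that do not start with a digit are treated as column names.
is_name = ~text.str.contains(r'^\d')

# Keep only the names, then forward-fill so that each value row
# carries the most recent name that appeared above it.
column = text.where(is_name).ffill()

# One row per value, labelled with the column it belongs to.
values = pd.DataFrame({'column': column[~is_name], 'value': text[~is_name]})

# Position of each value within its own column (0, 1, 2, ...).
values['row'] = values.groupby('column').cumcount()

# Spread the values out: one output column per name.
result = values.pivot(index='row', columns='column', values='value')

# pivot sorts the columns alphabetically; restore the original order.
result = result.reindex(columns=text[is_name].tolist())

print(result)
```

Because the column was cast to text, all values come out as strings (for example '65' rather than 65); converting them back would be a separate step. A name that has no values after it still appears in the result, as an all-NaN column, thanks to the final reindex.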

