How to handle missing values (NaN) in categorical data when using scikit-learn OneHotEncoder?

I have recently started learning Python to develop a predictive model for a research project using machine learning methods. I have a large dataset containing both numerical and categorical data, with many missing values. I am currently trying to encode the categorical features using OneHotEncoder. From what I read about OneHotEncoder, my understanding was that for a missing value (NaN) it would assign 0s to all of the feature's categories, like this:

    0    Male
    1    Female
    2    NaN

After applying OneHotEncoder:

    0    1 0
    1    0 1
    2    0 0

However, when running the following code:

    # Encoding categorical data
    import numpy as np
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder

    # One-hot encode column 1 and pass the remaining columns through unchanged
    ct = ColumnTransformer([('encoder', OneHotEncoder(handle_unknown='ignore'), [1])],
                           remainder='passthrough')
    obj_df = np.array(ct.fit_transform(obj_df))
    print(obj_df)

I am getting the error ValueError: Input contains NaN

So I am guessing my previous understanding of how OneHotEncoder handles missing values is wrong. Is there a way for me to get the functionality described above? I know imputing the missing values before encoding will resolve this issue, but I am reluctant to do this as I am dealing with medical data and fear that imputation may decrease the predictive accuracy of my model.

I found this question that is similar but the answer doesn't offer a detailed enough solution on how to deal with the NaN values.

Let me know what your thoughts are, thanks.
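
One possible way to get the all-zeros behaviour described above, offered as a minimal sketch rather than a definitive answer: replace NaN with a placeholder string, list the real categories explicitly, and let `handle_unknown='ignore'` encode the placeholder as a row of zeros. The toy `gender` column and the `"__missing__"` label are assumptions made up for illustration; they are not part of the original dataset.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy column mirroring the example in the question (hypothetical data).
gender = np.array([["Male"], ["Female"], [np.nan]], dtype=object)

# NaN is the only value that is not equal to itself, so this flags missing cells.
is_missing = np.array([[value != value] for value in gender[:, 0]])

# Replace NaN with an arbitrary placeholder so the encoder no longer sees NaN.
filled = gender.copy()
filled[is_missing] = "__missing__"

# Only the real categories are listed. The placeholder is therefore "unknown",
# and handle_unknown='ignore' encodes unknown values as all zeros.
encoder = OneHotEncoder(categories=[["Male", "Female"]], handle_unknown="ignore")
encoded = encoder.fit_transform(filled).toarray()

print(encoded)
# [[1. 0.]
#  [0. 1.]
#  [0. 0.]]
```

This requires knowing the valid categories up front. Inside a ColumnTransformer, the same encoder could be used in place of the one in the question, provided the NaN values in that column are replaced with the placeholder beforehand.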
