How do I convert a pandas dataframe into a NumPy array?
DataFrame:
import numpy as np
import pandas as pd

index = [1, 2, 3, 4, 5, 6, 7]
a = [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1]
b = [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan]
c = [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan]
df = pd.DataFrame({'A': a, 'B': b, 'C': c}, index=index)
df = df.rename_axis('ID')
gives
      A    B    C
ID
1   NaN  0.2  NaN
2   NaN  NaN  0.5
3   NaN  0.2  0.5
4   0.1  0.2  NaN
5   0.1  0.2  0.5
6   0.1  NaN  0.5
7   0.1  NaN  NaN
I would like to convert this to a NumPy array, like so:
array([[ nan,  0.2,  nan],
       [ nan,  nan,  0.5],
       [ nan,  0.2,  0.5],
       [ 0.1,  0.2,  nan],
       [ 0.1,  0.2,  0.5],
       [ 0.1,  nan,  0.5],
       [ 0.1,  nan,  nan]])
Also, is it possible to preserve the dtypes, like this?
array([[ 1,  nan,  0.2,  nan],
       [ 2,  nan,  nan,  0.5],
       [ 3,  nan,  0.2,  0.5],
       [ 4,  0.1,  0.2,  nan],
       [ 5,  0.1,  0.2,  0.5],
       [ 6,  0.1,  nan,  0.5],
       [ 7,  0.1,  nan,  nan]],
      dtype=[('ID', '<i4'), ('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
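
For reference, here is a minimal sketch of one way both conversions could be done, assuming pandas 0.24 or later (where `DataFrame.to_numpy` is available). The variable names `arr` and `rec` are only illustrative. `to_numpy()` gives the plain 2-D array without the index, while `to_records()` gives a NumPy record array that keeps `ID` as a named field with each column's own dtype.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question
index = [1, 2, 3, 4, 5, 6, 7]
a = [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1]
b = [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan]
c = [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan]
df = pd.DataFrame({'A': a, 'B': b, 'C': c}, index=index).rename_axis('ID')

# Plain 2-D float64 array of the values; the index is not included
arr = df.to_numpy()

# Record array: the index becomes the 'ID' field and each column keeps its dtype
rec = df.to_records()

print(arr.shape)        # rows x columns of the value array
print(rec.dtype.names)  # field names, index first
```

Note that the `ID` field in `rec` typically comes out as a 64-bit integer rather than the `<i4` shown above, because pandas stores integer indexes as `int64` by default.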