Channel: Active questions tagged python - Stack Overflow

Deal with 'DtypeWarning: Columns (3) have mixed types. Specify dtype option on import or set low_memory=False.'

I am trying to read a batch of TSV dataset files that (normally) have three columns.

The pandas DataFrame for one file looks like this:

But some of the files have two extra values here and there, also separated by tabs. So I wrote code to work around this by reading each file as if it had five columns and then removing two of them.

for i in range(101, 154):
    print(i)
    # read a file into pandas df
    thisfile = pd.read_csv(
        f'pgc-csv/2022_06_22_TRPV1_AAV488_6x10-11_No1/2022-06-22_TRPV1_AAV488_6x10-11_No{i}.txt',
        skiprows=10, header=None,
        names=['Время, s', 'Laser, V', 'ECG lead', 'empty1', 'empty2'],
        encoding='unicode_escape', delimiter='\t',
    )
    # delete extra columns
    del thisfile['empty1']
    del thisfile['empty2']

But for those problem files I get a warning:

"DtypeWarning: Columns (3) have mixed types. Specify dtype option on import or set low_memory=False."

I tried to use a method from this article: https://www.roelpeters.be/solved-dtypewarning-columns-have-mixed-types-specify-dtype-option-on-import-or-set-low-memory-in-pandas/

for i in range(101, 154):
    print(i)
    # read a file into pandas df
    thisfile = pd.read_csv(
        f'pgc-csv/2022_06_22_TRPV1_AAV488_6x10-11_No1/2022-06-22_TRPV1_AAV488_6x10-11_No{i}.txt',
        skiprows=10, header=None,
        names=['Время, s', 'Laser, V', 'ECG lead', 'empty1', 'empty2'],
        encoding='unicode_escape', delimiter='\t',
        dtype={'Время, s': float, 'Laser, V': float, 'ECG lead': float, 'empty1': 'str', 'empty2': 'str'},
    )
    # delete extra columns
    del thisfile['empty1']
    del thisfile['empty2']

But I still get errors (see screenshot).
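
For reference, the warning text names a second option besides dtype: low_memory=False, which makes pandas parse each file in one pass instead of inferring column types chunk by chunk. Below is a minimal sketch of that option, not a confirmed fix for these files. The in-memory sample string is a hypothetical stand-in for a real file, and skiprows and encoding are left out because the sample has no header block.

```python
import io
import warnings

import pandas as pd

# Hypothetical stand-in for one TSV file: three numeric columns,
# with one row carrying two extra tab-separated values.
sample = "0.0\t0.1\t1.5\n0.1\t0.2\t1.6\textra\t42\n0.2\t0.3\t1.7\n"
cols = ['Время, s', 'Laser, V', 'ECG lead', 'empty1', 'empty2']

# Record warnings so we can check whether a DtypeWarning was raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    thisfile = pd.read_csv(
        io.StringIO(sample),
        sep='\t', header=None, names=cols,
        low_memory=False,  # infer dtypes over the whole file at once
    )

# Drop the two placeholder columns in one call.
thisfile = thisfile.drop(columns=['empty1', 'empty2'])

dtype_warnings = [w for w in caught if issubclass(w.category, pd.errors.DtypeWarning)]
print(thisfile.dtypes)
print(len(dtype_warnings))
```

With low_memory=False a column that really contains mixed values is still read as object dtype, so this only silences the warning and does not clean the data.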

The first question is: how can I get rid of this error?

The second question: as I understand it, the DataFrame contains some values whose datatype is something other than float.

I tried to get them with this:

ecgfile[lambda x: not isinstance(x['Время, s'], float)]

And this:

ecgfile[lambda x: type(x['Время, s']) is not float]

But neither worked, so I need advice on this part, too.
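
One likely reason both attempts fail: the lambda receives the whole DataFrame, so x['Время, s'] is a Series rather than a single value, and the type check never runs per cell. Below is a sketch of one possible alternative, assuming the goal is to list the cells that cannot be parsed as numbers. It reads every column as text, converts with pd.to_numeric(errors='coerce'), and compares the two. The sample data is hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample: one row has two extra values, one row has a
# non-numeric entry in the first column.
sample = (
    "0.0\t0.1\t1.5\n"
    "0.1\t0.2\t1.6\textra\t42\n"
    "abc\t0.3\t1.7\n"
)
cols = ['Время, s', 'Laser, V', 'ECG lead', 'empty1', 'empty2']

# Read everything as text so pandas does not have to guess dtypes.
raw = pd.read_csv(io.StringIO(sample), sep='\t', header=None,
                  names=cols, dtype=str)
raw = raw.drop(columns=['empty1', 'empty2'])

# Convert column by column; unparseable cells become NaN.
numeric = raw.apply(pd.to_numeric, errors='coerce')

# A cell is suspicious if it held text in the file but failed to convert.
bad_mask = raw.notna() & numeric.isna()
bad_rows = raw[bad_mask.any(axis=1)]
print(bad_rows)
```

Here bad_rows keeps the original text of the offending rows for inspection, while numeric holds the converted float columns.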

The last question: is there perhaps a better overall way to do all of this? Thank you!

