Channel: Active questions tagged python - Stack Overflow

How do I read a binary file into a Pandas DataFrame using Numpy dtypes?


I want to remove rows from a DataFrame that I generated by reading a binary file with a `numpy.dtype` template. I've tried several methods of dropping rows, and each one fails with an error, typically:

TypeError: void() takes at least 1 positional argument (0 given)

Opening the variable explorer in an IDE shows the same error when trying to inspect the column name, which suggests an incorrect method for ingesting the data is somehow corrupting the column names.

I load the data in the following manner (number of variables shortened here for brevity):

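```
import numpy as np
import pandas as pd

# Record layout: 22 opaque header bytes, then big-endian unsigned integers
data_template = np.dtype([
    ('header_a', 'V22'),
    ('variable_A', '>u2'),
    ('gpssec', '>u4'),
])

# source_file is the path to the binary file
with open(source_file, 'rb') as f:
    byte_data = f.read()

np_data = np.frombuffer(byte_data, data_template)
df = pd.DataFrame(np_data)
```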
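For reference, the template contains one raw void field (`V22`) next to the integer fields. The following is a minimal sketch, using only NumPy, that lists which fields are of void kind and checks whether calling the void scalar type with no arguments raises a `TypeError`, as the error message quoted earlier suggests. The names `void_fields` and `raised` are introduced here for illustration only.

```python
import numpy as np

# Same layout as the template in the question
data_template = np.dtype([
    ("header_a", "V22"),
    ("variable_A", ">u2"),
    ("gpssec", ">u4"),
])

# Collect the names of fields whose dtype kind is 'V' (raw void bytes)
void_fields = [
    name for name in data_template.names
    if data_template[name].kind == "V"
]
print(void_fields)

# The scalar type of a void field is np.void; calling it with no
# arguments is expected to fail with a TypeError
try:
    data_template["header_a"].type()
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```

If this reproduces the message, it would point at the void-typed column rather than at the column names themselves, though that is an inference and not something confirmed here.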

When I then try to filter rows out of the DataFrame, for example:

`df = df[df['gpssec'] > 1000]`

I get the following traceback:

```
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\frame.py:3798 in __getitem__
    return self._getitem_bool_array(key)
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\frame.py:3853 in _getitem_bool_array
    return self._take_with_is_copy(indexer, axis=0)
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\generic.py:3902 in _take_with_is_copy
    result = self._take(indices=indices, axis=axis)
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\generic.py:3886 in _take
    new_data = self._mgr.take(
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\internals\managers.py:978 in take
    return self.reindex_indexer(
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\internals\managers.py:751 in reindex_indexer
    new_blocks = [
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\internals\managers.py:752 in <listcomp>
    blk.take_nd(
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\internals\blocks.py:880 in take_nd
    new_values = algos.take_nd(
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\array_algos\take.py:117 in take_nd
    return _take_nd_ndarray(arr, indexer, axis, fill_value, allow_fill)
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\array_algos\take.py:134 in _take_nd_ndarray
    dtype, fill_value, mask_info = _take_preprocess_indexer_and_fill_value(
  File C:\ProgramData\anaconda311\Lib\site-packages\pandas\core\array_algos\take.py:582 in _take_preprocess_indexer_and_fill_value
    dtype, fill_value = arr.dtype, arr.dtype.type()
TypeError: void() takes at least 1 positional argument (0 given)
```

I've been able to work around the problem by copying each column of relevant data into a blank DataFrame that doesn't have the corrupt headers, but it's a kludgy solution.
I'm not sure this qualifies as a bug, since it is very likely user error, but I can't find anything obvious that I'm doing wrong.
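A somewhat tidier version of the column-copying workaround might look like the sketch below. It assumes the opaque `V22` header bytes are not needed in the DataFrame, and it builds the frame only from the non-void fields. The in-memory `records` array is a synthetic stand-in for the binary file, and converting each column to native byte order is a precaution rather than a confirmed requirement.

```python
import numpy as np
import pandas as pd

data_template = np.dtype([
    ("header_a", "V22"),
    ("variable_A", ">u2"),
    ("gpssec", ">u4"),
])

# Synthetic stand-in for the file contents: three records built in memory
records = np.zeros(3, dtype=data_template)
records["variable_A"] = [1, 2, 3]
records["gpssec"] = [500, 1500, 2500]
byte_data = records.tobytes()

np_data = np.frombuffer(byte_data, dtype=data_template)

# Keep only the fields that are not raw void bytes, copying each one
# into a contiguous array in native byte order
columns = {
    name: np_data[name].astype(np_data[name].dtype.newbyteorder("="))
    for name in np_data.dtype.names
    if np_data.dtype[name].kind != "V"
}
df = pd.DataFrame(columns)

# Boolean filtering on the resulting frame
filtered = df[df["gpssec"] > 1000]
print(filtered)
```

If the header bytes are needed later, they remain available in `np_data["header_a"]` and can be handled separately from the DataFrame.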
