Channel: Active questions tagged python - Stack Overflow

Fast way to preinitialize addresses in Python without iterating?

I have a binary file made of data packets that are serialized like this:

[Length][Payload][Length][Payload][Length][Payload]

Each [Length] field is always 4 bytes, but its value varies from packet to packet with no specific pattern. The length value includes the 4 bytes of the [Length] field itself. I need to extract the byte position in the file of the first byte of every [Length]. For example:

00 00 02 00 FF FF FF FF FF ...
00 01 3C E5 FF FF FF FF FF ...
00 00 A5 90 FF FF FF FF FF ...
^
Need to save all these indexes
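To make the layout concrete, here is a minimal sketch that builds a small buffer in this format and walks it. It assumes the length is an unsigned big-endian 32-bit integer, which is how the example bytes read (00 00 02 00 = 512). The names `build_sample` and `packet_offsets` are hypothetical and only there for illustration.

```python
import struct


def build_sample(payload_sizes):
    """Build bytes in [Length][Payload] form; Length counts its own 4 bytes."""
    out = bytearray()
    for size in payload_sizes:
        out += struct.pack(">I", size + 4)  # assumption: big-endian uint32
        out += b"\xff" * size               # dummy payload
    return bytes(out)


def packet_offsets(buf):
    """Return the offset of the first byte of every [Length] field."""
    offsets = []
    pos = 0
    size = len(buf)
    while pos + 4 <= size:
        (length,) = struct.unpack_from(">I", buf, pos)
        if length < 4:
            # A length below 4 cannot include its own header; stop instead of looping forever.
            raise ValueError(f"invalid packet length {length} at offset {pos}")
        offsets.append(pos)
        pos += length
    return offsets


sample = build_sample([508, 10, 0])
offsets = packet_offsets(sample)
```

With payloads of 508, 10 and 0 bytes, the packets occupy 512, 14 and 4 bytes, so the saved offsets are 0, 512 and 526.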

This works but it's slow, since I have more than 100k data packets per file:

import mmap
import os
import time

# filename is an already-open file object;
# bytes2int (not shown) converts the 4 length bytes to an int
data = mmap.mmap(filename.fileno(), 0, access=mmap.ACCESS_READ)
fileSize = os.path.getsize(filename.name)
address = 0
addresses = []
start = time.time()
while address < fileSize:
    pkt_length = bytes2int(data[address:(address + 4)])
    addresses.append(address)
    address += pkt_length
end = time.time()
print(len(addresses))
print(end-start)

What can I use to do this faster?

EDIT 1: Here is a performance example with a 4.2GB file:

45559456
21.271047115325928

The RAM consumption goes pretty high too.
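Part of the memory cost probably comes from the Python list itself: each entry is a pointer to a separate int object, which takes several times the 8 bytes an offset actually needs. One possible mitigation, sketched here and not measured on the 4.2GB file, is to store the offsets in an `array('Q')` and read each length with a precompiled `struct.Struct` directly from the mmap, which avoids creating a 4-byte slice per packet. It assumes big-endian unsigned 32-bit lengths; `collect_offsets` is a hypothetical name.

```python
import mmap
import struct
from array import array

_LENGTH = struct.Struct(">I")  # assumption: big-endian uint32 length field


def collect_offsets(path):
    """Return an array('Q') holding the offset of every [Length] field in the file."""
    offsets = array("Q")  # compact unsigned 64-bit storage instead of a list of int objects
    # Note: mmap raises ValueError for an empty file.
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as data:
        size = len(data)
        unpack_from = _LENGTH.unpack_from  # local names avoid repeated attribute lookups
        append = offsets.append
        pos = 0
        while pos + 4 <= size:
            length = unpack_from(data, pos)[0]
            if length < 4:
                raise ValueError(f"invalid packet length {length} at offset {pos}")
            append(pos)
            pos += length
    return offsets
```

The loop is still one Python iteration per packet, so this trims the per-iteration overhead and the memory footprint but does not remove the iteration itself.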

