Channel: Active questions tagged python - Stack Overflow

How do you use Python's string interning?

I was curious about the performance benefit of Python's sys.intern for dictionary lookups. As a toy benchmark to measure it, I wrote the following code snippets:

Without interning:

    import random
    from uuid import uuid4

    keys = [str(uuid4()) for _ in range(1_000_000)]
    values = [random.random() for _ in range(1_000_000)]
    my_dict = dict(zip(keys, values))
    keys_sample = random.choices(keys, k=50_000)

    def get_values(d, ks):
        return [d[k] for k in ks]

Test results using ipython:

    %timeit get_values(my_dict, keys_sample)
    8.92 ms ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

With interning:

    import sys
    import random
    from uuid import uuid4

    keys = [sys.intern(str(uuid4())) for _ in range(1_000_000)]
    values = [random.random() for _ in range(1_000_000)]
    my_dict = dict(zip(keys, values))
    keys_sample = random.choices(keys, k=50_000)

    def get_values(d, ks):
        return [d[k] for k in ks]

Test results:

    %timeit get_values(my_dict, keys_sample)
    8.83 ms ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So there is no meaningful difference between the two cases. I tried increasing the dict size and the sample size, but the results stayed on par. Am I using sys.intern incorrectly, or is the benchmark flawed?
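One thing worth checking, offered as a sketch rather than a definitive diagnosis: random.choices(keys, ...) returns references to the very same string objects that are stored in the dict, so in both variants each lookup key is already identical (in the `is` sense) to the stored key. The snippet below uses smaller sizes than the benchmark above and a few helper names that are not part of the original code (stored, fresh_sample, and the three result flags are assumptions for illustration). It also builds equal-valued but distinct strings, which is the situation where interning could plausibly matter.

```python
import random
import sys
from uuid import UUID, uuid4

# Smaller sizes than the benchmark above; enough to show object identity.
keys = [str(uuid4()) for _ in range(1_000)]
my_dict = dict.fromkeys(keys, 0.0)
keys_sample = random.choices(keys, k=100)

# Hypothetical helper mapping: each key maps to the exact object stored in the dict.
stored = {k: k for k in my_dict}

# Without any interning, the sampled keys are the same objects as the dict keys.
sample_is_identical = all(stored[k] is k for k in keys_sample)

# Rebuild each key as a new string object, as if it came from a file or a socket.
fresh_sample = [str(UUID(k)) for k in keys_sample]
fresh_is_equal_not_identical = all(
    stored[f] == f and stored[f] is not f for f in fresh_sample
)

# Intern the stored keys, then intern the fresh copies: identity is restored.
for k in keys:
    sys.intern(k)
interned_sample = [sys.intern(f) for f in fresh_sample]
interned_is_identical = all(stored[f] is f for f in interned_sample)

print(sample_is_identical, fresh_is_equal_not_identical, interned_is_identical)
```

If the first flag is true for the non-interned benchmark as well, both versions are measuring lookups with identical key objects, which may explain the matching timings. Timing get_values with something like fresh_sample against its interned counterpart would be a fairer comparison, though whether the gap is noticeable is not something this sketch establishes.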

