Many languages that include hashmaps in the standard library or as builtins allow the programmer to iterate over the hashmap. For example, in Python:

    d = {"US": "English", "Spain": "Spanish", "France": "French", "Canada": "English"}
    for key in d:
        print(key)

Or C++:

    #include <iostream>
    #include <string>
    #include <unordered_map>
    using namespace std;

    int main() {
        unordered_map<string, string> m;
        m["US"] = "English";
        m["Spain"] = "Spanish";
        m["France"] = "French";
        m["Canada"] = "English";
        // Each element is a key/value pair; print only the key.
        for (auto &it : m) {
            cout << it.first << endl;
        }
        return 0;
    }

Rust has a similar facility, and iteration there looks more like the Python version.
So my question is: do hashmap implementations keep a vector of keys that they can iterate over, either handing you just the key or looking up the value on the fly in O(1)? Or do they walk the actual hash table, including empty buckets?
I ask because in the latter case, presumably it's more efficient to keep a list of keys myself and iterate that way, rather than actually use the map's functionality.
Edit: Perhaps I was unclear in this question. The purpose of a hashmap is to associate keys with values in O(1) time. All of these implementations do that well. But when you want to iterate over all the key/value pairs in your map, there are two different ways I can think of to achieve that. One is to keep an array of your keys, go through that array, look up each key in O(1) time, and print the key/value pair. This takes O(n) because you have n keys.
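To make the first way concrete, here is a minimal sketch in Python. The `keys` list and the helper `pairs_via_key_list` are hypothetical names introduced only for illustration; the list has to be kept in sync with the dictionary by hand.

```python
# Example data, reusing the dictionary from the question.
d = {"US": "English", "Spain": "Spanish", "France": "French", "Canada": "English"}

# A separately maintained list of keys (hypothetical; the caller keeps it in sync).
keys = ["US", "Spain", "France", "Canada"]


def pairs_via_key_list(mapping, key_list):
    """Return (key, value) pairs by looking up each listed key in the mapping."""
    return [(k, mapping[k]) for k in key_list]


pairs = pairs_via_key_list(d, keys)
for key, value in pairs:
    print(key, value)
```

This does one average-case O(1) lookup per key, so the whole loop is O(n) in the number of keys, at the cost of storing every key a second time.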
The other way is to run a for loop over the hashmap itself and extract the keys, values, or both. My question is whether this form typically comes with a speed penalty because the hash table has empty buckets that need to be checked as we iterate. At first glance it seems obvious that we would incur this penalty, but I feel that if I were implementing a library like this, I would keep an array of keys and, when the user asks to iterate, do it the first way.
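The two strategies I have in mind can be sketched with a toy table. This is only an illustration under my own assumptions (separate chaining, a fixed bucket count, no deletion or resizing); `ToyHashMap` and its methods are made-up names, and I am not claiming any real library works this way.

```python
class ToyHashMap:
    """Toy separate-chaining hash map, for illustration only."""

    def __init__(self, num_buckets=16):
        # Most buckets stay empty when the map holds only a few entries.
        self.buckets = [[] for _ in range(num_buckets)]
        # Extra array of keys in insertion order (the "first way").
        self.key_list = []

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self.key_list.append(key)

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

    def items_by_key_list(self):
        """First way: walk the key array and look each key up."""
        return [(k, self.get(k)) for k in self.key_list]

    def items_by_scanning(self):
        """Second way: walk every bucket, including the empty ones."""
        found = []
        for bucket in self.buckets:  # touches all buckets, empty or not
            found.extend(bucket)
        return found


m = ToyHashMap()
for country, language in [("US", "English"), ("Spain", "Spanish"),
                          ("France", "French"), ("Canada", "English")]:
    m.put(country, language)

by_keys = m.items_by_key_list()
by_scan = m.items_by_scanning()
print(len(m.key_list), "lookups versus", len(m.buckets), "buckets visited")
```

In this toy, both methods return the same pairs (possibly in a different order), but the scan visits all 16 buckets for 4 entries, while the key list costs extra memory and would need extra bookkeeping if deletion were supported.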