In this article, we will look into some of the most important HashMap interview questions from the perspective of data structure. Each question takes us to the next one and we will follow it to get to the bottom of the data structure.
Where is the key, value pair stored in HashMap?
The key and value pair is stored in an array.
How is the index identified?
The key is used to identify the table index. If we divide the key’s hash code number by the table’s capacity, the remainder will be the table index which is why it is important to override hashCode().
What is collision?
If two different keys result in the same hash code or different hash codes map to the same table index, there will be a collision. Since two keys can’t sit in the same location, each slot of the bucket array is a pointer to a linked list that contains the key-value pairs hashed to the same location.
What is the maximum capacity of a HashMap?
The capacity is always in power of 2 irrespective of the size we pass to the constructor. The maximum capacity would be the maximum power of 2 which would be
0100 0000 0000 0000 0000 0000 0000 0000 = 1 << 30
Why is the capacity in power of 2?
In general, the number of buckets should be prime is so that the hash values are well distributed and will have less collisions. In case of HashMap, the capacity is always a power-of-two. In contrast, Hashtable by default allocates a size of 11, a prime number. If the size is specified, it will create an array of the specified size.
Whereas in HashMap, if the size is specified and is not a power-of-2, it will be incremented to power-of-2 to create the internal array.
In order to convert the hash code to index, we divide it by the capacity and the remainder would be the index. There seems to be a Bug in versions from JDK1.4 on-wards where integer and modulus operation are much slower than its earlier versions. If the capacity of array is in power-of-two, the hash code can be easily converted to the index based on a simple AND operation and this seems to be more efficient as compared to modulus operation:
index = hashCode & (array length-1)
This might be the reason why the array capacity always has to be a power-of-two.
How can a simple AND operation translate the hash code into table index?
Suppose the hash code=311 and length=16, that is,
24, applying the modulus operation we get:
hash code=length*quotient + index, or
311 = 16 * 19 + 7, where table index = 7.
Let us convert the hash code to binary.
311 = 1 0011 0111in binary
= 1 0011 0111
= 1 0011 0000 + 0111
= 24*1 + 25*1 + 26*0 + 27*0 + 28*1 + 0111
= 24(20*1 + 21*1 + 22*0 + 23*0 + 24*1) + 0111
= 24(10011) + 0111
This is in the same form as the
length*quotient + index, so
table index = 0111.
Thus,we can obtain the table index by masking the hash code using & operation.
table Index = hash code & (2length - 1) = 1 0111 0111 & 1111 = 0111
What effect does table capacity have on the procedure of converting hash code into table index?
table index = hash code & (2x - 1)
So if hash code=311, table index would be,
1 0111 0111 & 1111 = 0111which is nothing but the lower x bits of hash code.
Thus, if the table capacity is in power-of-two, only the lower bits of hash code determine the table index.
What is re-hashing in context of HashMap?
Since HashMap has table capacity in power-of-two, only the lower bits influence the table index. This is why a supplemental hash function is applied to the given hashCode before masking off the low order bits. Its job is to distribute the information over all the bits, and in particular, into the low order bits. This is called re-hashing.
How does put() method of HashMap work in java?
When the put (key, value) method is called to store the key and value pair on HashMap, it first determines the bucket location where the key and value entry should go.
The implementation calls hashcode() method on the key object to map the hash code on to a table index. Since the table length is in power-of-two and only the lower bits determine the table index, it applies a supplementary hash function on the hash code to further strengthen it. Its on this final hash code that the below AND operation is applied.
Table index = supplementaryHashFunction(key's hash code) & (tableLength -1)
There may be an existing entry for the determined bucket location. In the simplest case, where there is no entry, an entry is created and inserted into the identified bucket.
So what’s the basic algorithm of put method?
From our last answer, we can summarize it as:
- Get key’s hash code
- Apply supplementary hash function on it
- Determine table index, based on the new hash code
- Insert a new entry of key and value into the identified location
There are many subtle variations to the above algorithm, like, what if the key is null or what if an entry already exists in the bucket location? In the questions that follow, we will discuss these variations.
Does HashMap allow NULL keys?
Yes. Since key is null, it won’t be able to derive the table index, so it stores the entry into index 0.
What will be the hash code for NULL key?
What happens if the key already exists?
If key already exists, its value will be replaced by the new one and the old value will be returned to the caller.
What if an entry already exists in the bucket location?
The old entry will replaced by the new entry and the new entry’s next will point to the old entry.
How does get(key) method of HashMap work in java?
First the key is converted to a table index. There may be 0 or more entries in the form of a linked structure in the bucket found. If an entry already exists then we just need to traverse through the linked structure to find the key’s entry. We know we have found the key’s entry if both the below conditions hold true.
- The key’s hash code and entry’s hash code match.
- The passed in key and the entry’s key are equal
If the hash code doesn’t match then we don’t have to check key’s equality.
If an entry is found then its value will be returned to the caller else a NULL will be returned.
What is the type of the object stored in the array?
It is of type Entry. Its attributes are key, value, next entry and hash.
Does HashMap retain the order of its elements?
It doesn’t retain the order. Iterator starts traversing from 0th element and returns the first not null element in the array (E0). Next element would be the one that the array element points to (E1), till it reaches the end of the linked list (E2). Once it reaches the end of linked list, it retrieves the next not null element (E3) from the array and the process goes on it reaches the end of array (E4->E5…E9).