Implementing a Custom HashMap in Java

Introduction

HashMap is a widely used data structure for performing insertion, retrieval, and deletion operations in almost O(1) time complexity. Due to its simplicity, it is extensively Used. Today, we'll take a closer look at how HashMap works and then make our own HashMap.

Understanding HashMap in Java

A HashMap is a key-value pair data structure that performs operations such as insertion, deletion, and retrieval in almost O(1) time complexity. In a HashMap, each key must be unique. If the same element is encountered, it is deleted and then reinserted. For handling duplicate data, we can use frequency pairs. HashMap can store various types of data of different types

We can also perform operations on our own datatypes in Hashmap using generics

Some Frequently use Built-in Functions of Hashmap
1. put: Used to insert a value into the hashmap.
2. remove: Used to remove an element from the hashmap.
3. get: Used to retrieve an element from the hashmap.
4. clear: Clears the entire hashmap.
5. clone: Creates a copy of the elements in the hashmap.
6. size: Returns the size of the hashmap.

import java.util.HashMap;

public class HashMap {
    public static void main(String[] args) {
        HashMap<String,Integer> Data=new HashMap<>();
        //Adding Values
        Data.put("Apple",30);
        Data.put("Mango",10);
        System.out.println(Data); // {Apple=30, Mango=10}
        Data.put("Banana",20);
        System.out.println(Data); // {Apple=30, Mango=10, Banana=20}
        //Updating Values
        Data.put("Apple",40);
        System.out.println(Data); // {Apple=40, Mango=10, Banana=20}
        //Removing Values
        Data.remove("Banana");
        System.out.println(Data); // {Mango=10, Apple=40}
        // Clones HashMap
        System.out.println(Data.clone()); // {Mango=10, Apple=40}
        // Returns Size of the HashMap
        System.out.println(Data.size()); // 2
        //Checking the Values and Keys
        if(Data.containsKey("Mango")){
            System.out.println("Present");
        }
        else{
            System.out.println("Not Present");
        }
        // Clears the hashmap
        Data.clear();
        System.out.println(Data); // {}
    }
}

Understanding How HashMap Works

Assumptions

We are hard-coding the array length to comprehend the load factor
The load factor is currently set to 2. However, it might be adjusted in the future

Working of HashMap

Initially, we hard-coded the array size as 4, leading to the creation of a computer memory array of size 4.
Null is initialized at each position in the hashmap.
During the insertion of values into the hashmap, the value passes through the hash function, which returns an appropriate index.
A new linked list is created, and the value is inserted as the head of the linked list.
As values are inserted, if the same index is encountered, a new node is created in the linked list at that position, and the value is inserted.
When the limit exceeds the load factor, rehashing occurs, and all values are rearranged.

Understanding each step of implementation

Initializing HashMap

Initializing HashMap involves creating an array of size 4 and initializing each index with a linked list of null values.

 public HashMap() {
             this.N = 4; 
             this.buckets = new LinkedList[4];
             for(int i=0; i<4; i++) {
                 this.buckets[i] = new LinkedList<>(); 
             }
         }

Hashing Function

The hashing function will transform the input value into a hash value and then further break it down into an index number compatible with the size of the array.

Here, we consider the absolute value of the hashing to handle potential negative hash values. We then divide it by N, which represents the length of the bucket, and utilize the internal Java hashcode() function to generate a hash value
```
 private int hashFunction(K key) {
         int bi = key.hashCode();
         return Math.abs(bi) % N; 
 }
```

Searching in Linked List

Before inserting, we check whether the value corresponding to the bucket index is present in the linked list, which we further utilize in our functions. We will return -1 if is not present in the linked List

 private int searchInLL(K key, int bi) {
             LinkedList<Node> ll = buckets[bi];
             for(int i=0; i<ll.size(); i++) {
                 if(ll.get(i).key == key) {
                     return i;
                 }
             }
      return -1;
 }

put Method

In the put function, first, we generate the hash value of the element we are inserting. Later, we check whether it is present in the hashmap. If it is not present, we add the value to the hashmap. If it is already present, we update the data of the existing value.

If the Load factor exceeds 2.0 then rehash the whole Array

 public void put(K key, V value) {
             int bi = hashFunction(key);
             int di = searchInLL(key, bi);
             if (di == -1) {
                 // Add a new node with the key-value pair to the linked list at the bucket index
                 buckets[bi].add(new Node(key, value));
                 // Increment the total number of key-value pairs in the hash table
                 n++;
             } else { 
                 // Update the value associated with the existing key
                 Node node = buckets[bi].get(di);
                 node.value = value;
             }
             // Calculate the load factor (lambda) of the hash table
             double lambda = (double) n / N;
             if (lambda > 2.0) {
                 rehash();
             }
         }

get Method

For the get function, we calculate the hash value of the value to be retrieved and check if it is present in the hashmap. If it is present, we return the value; otherwise, we return -1

 public V get(K key) {
             int bi = hashFunction(key);
             int di = searchInLL(key, bi);
             if(di == -1) { // key doesn't exist
                 return null;
             } else { // key exists
                 Node node = buckets[bi].get(di); // The get method of the LinkedList class, not the HashMap
                 return node.value;
             }
         }

remove Method

In the remove method, first, we calculate the hash value of the element. If it is present, we remove it, decrement the length of the bucket value, and if it is not present, we return -1.

 public V remove(K key) { // It returns the V (value type in Node)
             int bi = hashFunction(key);
             int di = searchInLL(key, bi);
             if(di == -1) { // key doesn't exist
                 return null;
             } else { // key exists
                 Node node = buckets[bi].remove(di); // we are using the remove method of a linked list
                 n--;
                 return node.value;
             }
         }

containsKey Method

Here, we search in the linked list and return a boolean value indicating whether the element is present or not.

 public boolean containsKey(K key) {
             int bi = hashFunction(key);
             int di = searchInLL(key, bi);
             return di!=-1;
 }

keySet Method

To retrieve all the keys of a hashmap, we create an ArrayList and iterate through every index of the array. We then iterate over each node of the linked list and add the key values to the ArrayList.

 public ArrayList<K> keySet() {
             ArrayList<K> keys = new ArrayList<>();
             for (LinkedList<Node> ll : buckets) { // Iterate over Linked lists in all the Nodes
                 for (Node node : ll) { // Iterate Over each linked List in particular node and store the node value in the key
                     keys.add(node.key); 
                 }
             }
             return keys;
         }

isEmpty Method

Here, we return a boolean value indicating whether the hashmap is empty or not

 public boolean isEmpty() {
       return n == 0; // Returns true if the Hashmap is empty or false if Hashmap is full
 }

Rehash Function

In the rehash function, we take all the values in the array and insert them into a new array of size twice that of the original one. Subsequently, we hash them again and rearrange them

private void rehash() {
            LinkedList<Node>[] oldBucket = buckets;
            buckets = new LinkedList[N*2];
            for(int i=0; i<N*2; i++) {
                buckets[i] = new LinkedList<>();
            }
            for (LinkedList<Node> newBucket : oldBucket) { // Inserting the old Bucket items into new Bucket Linked List Items
                for (Node node : newBucket) { // Inserting Node values (Series of LL) into newBucket Nodes
                    put(node.key, node.value);
                }
            }
}

Visualizing how a HashMap works

Here, we insert the eight character values into the HashMap, mapping them to their respective indexes. Subsequently, we perform rehashing, and the result is shown below, where the length is doubled, and the elements are rearranged

I believe you now understand why HashMap operations have O(1) time complexity. It directly accesses data based on its hash value, avoiding the need for traversal as seen in other data structures.

Final thoughts

HashMap stands as a crucial data structure for efficient data retrieval, deletion, and updating. Its significance extends to various other data structures, enabling a wide range of operations.

If you've found this blog insightful, continue exploring similar content on my profile. I've also provided valuable resources and code with test cases below for a deeper understanding of the internal workings of HashMap.

Resources

The code, along with test cases, can be found here
Oracle's HashMap Documentation for official and detailed information on Java's HashMap implementation.
A Youtube playlist by Arpit Bhayani on hash Table Internals

Implementing a Custom HashMap in Java

Introduction

Understanding HashMap in Java

Understanding How HashMap Works

Understanding each step of implementation

Visualizing how a HashMap works

Final thoughts

Resources

Comments

More from this blog

Understanding webRTC from it's first Principles

Cryptography and Digital Signatures

Understanding Bloom Filters and Building a Custom One

Tackling the Thundering Herd Problem and Designing Idempotent API Endpoints for Payments at Stripe with Rate Limiting

Command Palette

Introduction

Understanding HashMap in Java

Understanding How HashMap Works

Understanding each step of implementation

Visualizing how a HashMap works

Final thoughts

Resources

Comments

More from this blog