Essence of equals() and hashCode() in Java
This post is easier to follow after reading my previous post on Equality Check between Strings in Java. By the end of this post, we should have an idea of when to override equals() and hashCode() and of the effects of not overriding both. First, I will walk you through the Object class’s default equals() method, followed by code examples explaining the essence of overriding equals() and hashCode().
java.lang.Object equals()
Here is the excerpt from the java.lang.Object equals() javadoc:
public boolean equals(Object obj)
Indicates whether some other object is “equal to” this one.
The equals method implements an equivalence relation on non-null object references:
It is reflexive: for any non-null reference value x, x.equals(x) should return true.
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
For any non-null reference value x, x.equals(null) should return false.
The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
Parameters:
obj - the reference object with which to compare.
Returns:
true if this object is the same as the obj argument; false otherwise.
Here is the java.lang.Object class equals() method implementation:
public boolean equals(Object obj) {
    return (this == obj);
}
From the above snippet of java.lang.Object's equals() implementation, it’s clear that it checks referential equality, i.e., whether the passed obj and the current object refer to the same object in memory.
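A minimal sketch of what this means in practice, using plain Object instances. The class name ReferenceEqualityDemo is only for illustration.

```java
class ReferenceEqualityDemo {
    public static void main(String[] args) {
        Object first = new Object();
        Object second = new Object();
        Object alias = first; // a second reference to the same object

        // Two distinct objects: the default equals() compares references.
        System.out.println(first.equals(second)); // false
        // Same object reached through two references.
        System.out.println(first.equals(alias));  // true
    }
}
```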
When to override java.lang.Object’s equals()
Consider the following Employee class having 2 fields, id and name.
// No access modifiers specified for brevity
class Employee {
    int id;
    String name;

    // Constructor parameters need explicit types to compile
    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}
Here is the Driver code
class Main {
    public static void main(String[] args) {
        Employee employee1 = new Employee(1, "srk");
        Employee employee2 = new Employee(1, "srk");
        System.out.println(employee1.equals(employee2));
    }
}
employee1.equals(employee2) returns false, as it calls java.lang.Object’s equals() implementation, which does a referential check, i.e., it checks whether the two references point to the same object in memory. As employee1 and employee2 are two different objects in memory, they are not referentially equal, and therefore it returns false.
But, it’s functionally incorrect to say that two employees with same id and name are unequal.
To correct the behavior of the Employee equality check, we have to override java.lang.Object equals() method in Employee class as shown below.
class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        // Same reference: trivially equal
        if (this == obj) return true;
        // null or a different type: never equal
        if (!(obj instanceof Employee)) return false;
        Employee employee = (Employee) obj;
        // Field-by-field comparison (assumes name is non-null)
        return (this.id == employee.id) && (this.name.equals(employee.name));
    }
}
Upon rerunning the Driver code, it prints true, because the overridden equals() method explicitly checks the equality of the fields, i.e., id and name.
Therefore, in order to do equality check between objects in Java, we must override equals() method properly, by determining the required fields/attributes that should be the base for Object’s equality.
However, for the overridden equals() method to be valid and complete, it must comply with the following rules.
- Reflexive: For any non-null reference value a, a.equals(a) must return true.
- Symmetric: For any non-null reference values a and b, if a.equals(b) returns true, then b.equals(a) must also return true.
- Transitive: For any non-null reference values a, b, and c, if a.equals(b) returns true and b.equals(c) returns true, then a.equals(c) must also return true.
- Consistent: For any non-null reference values a and b, multiple invocations of a.equals(b) must consistently return true or consistently return false, provided no information used in the comparison is modified.
- Non-null: For any non-null reference value a, a.equals(null) must return false.
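The rules above can be spot-checked for concrete instances. The following is only a sketch: the helper class EqualsContractCheck and its holds method are hypothetical names introduced here, and passing for a few objects does not prove that the contract holds in general.

```java
class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Employee)) return false;
        Employee employee = (Employee) obj;
        return (this.id == employee.id) && (this.name.equals(employee.name));
    }
}

class EqualsContractCheck {
    // Checks each rule for the three given (non-null) employees.
    static boolean holds(Employee a, Employee b, Employee c) {
        boolean reflexive = a.equals(a);
        boolean symmetric = a.equals(b) == b.equals(a);
        // "if a equals b and b equals c, then a equals c"
        boolean transitive = !(a.equals(b) && b.equals(c)) || a.equals(c);
        // Two invocations in a row must agree.
        boolean consistent = a.equals(b) == a.equals(b);
        boolean nonNull = !a.equals(null);
        return reflexive && symmetric && transitive && consistent && nonNull;
    }

    public static void main(String[] args) {
        Employee a = new Employee(1, "srk");
        Employee b = new Employee(1, "srk");
        Employee c = new Employee(1, "srk");
        System.out.println(holds(a, b, c)); // true
    }
}
```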
When to override java.lang.Object’s hashCode()
Let’s assume that we’d like to store the employees in a collection like java.util.List.
import java.util.ArrayList;
import java.util.List;

Employee employee1 = new Employee(1, "srk");
Employee employee2 = new Employee(1, "srk");
List<Employee> employees = new ArrayList<>(); // diamond instead of a raw type
employees.add(employee1);
employees.add(employee2);
In the above example, the employees list holds functionally duplicate employees, i.e., employee1.equals(employee2) returns true. They are still both added, because a List does not prevent duplicate entries.
To prevent duplicates, we’d like to move the java.util.List data to a java.util.Set.
Set<Employee> uniqueEmployees = new HashSet<>();
for (Employee employee : employees) {
    uniqueEmployees.add(employee);
}
System.out.println(uniqueEmployees.size());
It prints 2, not 1, even though we put the employees in a Set to prevent duplicate entries. The reason is that a Set implementation like HashSet internally uses a java.util.HashMap, in which each key is an element of the set and every value is the same placeholder Object.
private static final Object PRESENT = new Object();
And here is how the add method is implemented in HashSet:
public boolean add(E e) {
    return map.put(e, PRESENT) == null;
}
So, java.util.HashSet internally uses java.util.HashMap, which in turn is based on the hash table data structure.
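To make the delegation concrete, here is a stripped-down sketch of a set backed by a HashMap. MapBackedSet is a hypothetical class written for this post; the real HashSet has many more methods, but its add follows the same idea as the excerpt above.

```java
import java.util.HashMap;
import java.util.Map;

class MapBackedSet<E> {
    // One shared placeholder stored as the value for every key.
    private static final Object PRESENT = new Object();
    private final Map<E, Object> map = new HashMap<>();

    // put() returns the previous value, so null means the key was new.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }
}

class MapBackedSetDemo {
    public static void main(String[] args) {
        MapBackedSet<String> names = new MapBackedSet<>();
        System.out.println(names.add("srk")); // true, newly added
        System.out.println(names.add("srk")); // false, already present
        System.out.println(names.size());     // 1
    }
}
```

Whether an element counts as "already present" is decided entirely by the map's key lookup, which is why hashCode() and equals() of the element matter.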
At this point, if you’re not clear on how a HashMap works internally, read my post Overview of Hashtable implementation in Java.
If you are clear on how a HashMap works internally, you already have the reasoning for why the previous code snippet prints 2.
Let me briefly explain why it prints 2.
Whenever we add an element to the set, the backing HashMap first needs the hashCode() of the key (here, the employee object being inserted) in order to compute the array index (bucket/bin). Since we did not override hashCode(), java.lang.Object's hashCode() is called. That default implementation is identity-based: it typically returns different values for distinct objects, although uniqueness is not guaranteed. Because employee1 and employee2 report different hash codes, HashMap treats them as different keys. They usually land in different buckets, and even within one bucket equals() is only consulted for entries whose hash codes match, so both employees are stored.
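The index computation can be sketched as follows. This is an assumption-laden illustration: spread and bucketIndex are hypothetical helpers that mirror what OpenJDK 8+ HashMap does (XOR the high 16 bits into the low bits, then mask with the table length minus one), and other JDKs are free to do it differently. The Employee here overrides equals() only, as in the post so far.

```java
class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // equals() is overridden, hashCode() deliberately is not.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Employee)) return false;
        Employee employee = (Employee) obj;
        return (this.id == employee.id) && (this.name.equals(employee.name));
    }
}

class BucketIndexDemo {
    // Assumption: same bit-mixing step as OpenJDK 8+ HashMap.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // tableLength must be a power of two, as in HashMap.
    static int bucketIndex(int hashCode, int tableLength) {
        return spread(hashCode) & (tableLength - 1);
    }

    public static void main(String[] args) {
        Employee employee1 = new Employee(1, "srk");
        Employee employee2 = new Employee(1, "srk");
        int tableLength = 16; // HashMap's default initial capacity

        System.out.println(employee1.equals(employee2)); // true
        // Identity-based hash codes: the values vary from run to run.
        System.out.println(employee1.hashCode() + " vs " + employee2.hashCode());
        System.out.println(bucketIndex(employee1.hashCode(), tableLength)
                + " vs " + bucketIndex(employee2.hashCode(), tableLength));
    }
}
```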
How to fix this?
Override hashCode(). Here is the updated version of the Employee class with both equals() and hashCode() overridden.
class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Employee)) return false;
        Employee employee = (Employee) obj;
        return (this.id == employee.id) && (this.name.equals(employee.name));
    }

    @Override
    public int hashCode() {
        // Combine the same fields that equals() compares
        final int prime = 31;
        int result = 1;
        result = prime * result + id;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }
}
Now, if we rerun the example that ends with System.out.println(uniqueEmployees.size()); it will print 1. Here is how it works.
When we call add(employee1), employee1’s overridden hashCode() is called, and HashMap uses the returned value to compute the index where the {key, value} entry should be placed.
employee1 and employee2 return the same hashCode, because their id and name are the same.
Since both have the same hashCode, they resolve to the same index. First, employee1 is added: the index is computed and the {key, value} entry is placed in that index/bucket/bin. When employee2 is inserted, it resolves to the same index, where an entry already exists, so HashMap checks whether the two keys are equal, i.e., employee1 and employee2 go through the equals() check. They are equal, so no new entry is created; the existing key is kept and only its value is overwritten. Because a HashSet always stores the same constant value, the set is effectively unchanged and its size stays 1.
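The same sequence can be observed through the return value of add(), which reports whether the set actually changed. A small sketch, with DuplicateAddDemo as an illustrative class name:

```java
import java.util.HashSet;
import java.util.Set;

class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Employee)) return false;
        Employee employee = (Employee) obj;
        return (this.id == employee.id) && (this.name.equals(employee.name));
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + id;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }
}

class DuplicateAddDemo {
    public static void main(String[] args) {
        Set<Employee> uniqueEmployees = new HashSet<>();
        boolean firstAdded = uniqueEmployees.add(new Employee(1, "srk"));
        boolean secondAdded = uniqueEmployees.add(new Employee(1, "srk"));

        System.out.println(firstAdded);             // true
        System.out.println(secondAdded);            // false, equal key already present
        System.out.println(uniqueEmployees.size()); // 1
    }
}
```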
Hence, when two objects are equal according to equals(), their hashCode must be the same.
But the reverse is not always true, i.e., when two objects have the same hashCode, they may or may not be equal.
Use case: two objects that have the same hashCode but are not equal.
class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Employee)) return false;
        Employee employee = (Employee) obj;
        return (this.id == employee.id) && (this.name.equals(employee.name));
    }

    @Override
    public int hashCode() {
        // Constant on purpose: every Employee collides
        return 99;
    }
}
With the above Employee code, if we add two employees to a list and then copy them into a set, it will print 2.
Employee employee1 = new Employee(1, "srk1");
Employee employee2 = new Employee(2, "srk2");
List<Employee> employees = new ArrayList<>(); // diamond instead of a raw type
employees.add(employee1);
employees.add(employee2);
Set<Employee> uniqueEmployees = new HashSet<>();
for (Employee employee : employees) {
    uniqueEmployees.add(employee);
}
System.out.println(uniqueEmployees.size());
The fact that it prints 2 is obvious, because we are adding two functionally different employees. The catch is that they have the same hashCode, i.e., 99. So, inside HashMap both map entries are stored at the same index, which is known as a collision in a hash table. HashMap still keeps them apart, because equals() returns false for the pair.
Here is another version of the same use case with Strings, taken from a Stack Overflow answer.
String str1 = "0-42L";
String str2 = "0-43-";
System.out.println(str1.equals(str2));
System.out.println(str1.hashCode() == str2.hashCode());
What could be the output? The first statement, str1.equals(str2), prints false because it does a string value comparison. What does the statement str1.hashCode() == str2.hashCode() print?
HINT: HashCode is different from memory address.
One thing is certain: these two strings are distinct objects at different memory locations. Even so, the statement str1.hashCode() == str2.hashCode() prints true. The answer can’t be told at a glance, because String’s hashCode() is computed from the characters as s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1].
Javadoc excerpt from java.lang.String’s hashCode()
Returns a hash code for this string. The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)
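The formula can be evaluated by hand to see why the two strings collide. The sketch below uses a hypothetical helper, polynomialHash, which applies the formula with Horner's rule (multiply the running total by 31, then add the next character); it is written for illustration and is not the JDK source.

```java
class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] using int arithmetic.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String str1 = "0-42L";
        String str2 = "0-43-";

        // The hand-rolled formula agrees with String.hashCode().
        System.out.println(polynomialHash(str1) == str1.hashCode()); // true
        // Different contents, same hash code.
        System.out.println(str1.equals(str2));                       // false
        System.out.println(polynomialHash(str1) == polynomialHash(str2)); // true
    }
}
```

The strings differ only in their last two characters, and the difference cancels out: '2' * 31 + 'L' = 50 * 31 + 76 = 1626, and '3' * 31 + '-' = 51 * 31 + 45 = 1626.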
In summary
When two objects are equal they should have the same hashCode, but the reverse is not always true, i.e., two objects having same hashCode() may or may not be equal.
In practice, skipping hashCode() on a custom object like Employee causes no visible problem as long as we are certain that employees are never inserted into hash-based collections like HashSet, HashMap, or Hashtable, although the equals() javadoc still asks for both methods to be overridden together.
Strictly speaking, hashCode() on a custom object must be overridden whenever we are dealing with hash-based collections like HashSet or HashMap.
Nevertheless, it’s a good practice to override both equals() and hashCode() on a custom object. Why?
Because today employees may only be stored in a java.util.List, where overriding hashCode() is not necessary. Tomorrow, we may want to insert employees into a hash-based collection like HashSet or HashMap, and then we must override hashCode() on Employee to get the expected results. But modifying the existing Employee class at that point violates the open-closed principle from SOLID. Hence, we should override equals() and hashCode() together on any custom object in one go.
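One compact way to write both methods in one go is with the helpers in java.util.Objects. This is a sketch of an alternative to the hand-written versions earlier in the post, not the only way to do it; EmployeeLookupDemo and the department value are made up for the demonstration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class Employee {
    private final int id;
    private final String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Employee)) return false;
        Employee other = (Employee) obj;
        // Objects.equals is null-safe, so a null name does not throw.
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Hash exactly the fields that equals() compares.
        return Objects.hash(id, name);
    }
}

class EmployeeLookupDemo {
    public static void main(String[] args) {
        Map<Employee, String> departments = new HashMap<>();
        departments.put(new Employee(1, "srk"), "engineering");

        // A different but equal Employee instance finds the same entry.
        System.out.println(departments.get(new Employee(1, "srk"))); // engineering
    }
}
```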