Sunday, June 17, 2007

GetHashCode() in .NET

So in .NET, there is a virtual function defined for all object types (everything) named GetHashCode(). This function can be used to generated a hash value for an object, but the main caveat is this hash will not be guaranteed unique. The implementation can be used for very small amounts of data. The hash calculation returns a 32-bit integer. This means that at most there 4,294,967,295 different integer values possible.
To help the situation, an overridden GetHashCode() can be used. When first confronted with this dilemma, I was going to just XOR the values I was using to generated the hash.


public override int GetHashCode()
{
return value1.GetHashCode() ^
value2.GetHashCode() ^
value3.GetHashCode();
}


Turns out this is not the best way to do this. If you look deeper at the case where value1 and value2 are switched, you will see the calculation would come out the same (a hence 2 different sets of values would result in the same hash). This is really caused by the fact that addition and multiplication operations are commutative. You should remove this to remove this failure in the algorithm you are constructing.


public override int GetHashCode()
{
int result = 21;
result = result*47 + value1.GetHashCode();
result = result*47 + value2.GetHashCode();
result = result*47 + value3.GetHashCode();

return result;
}


This will work!

No comments: