A popular answer for generating a hashing function in JS is given in Simple (non-secure) hash function for JavaScript? and Generate a Hash from string in Javascript
One example of the code is:
String.prototype.hashCode = function() {
var hash = 0;
if (this.length == 0) {
return hash;
}
for (var i = 0; i < this.length; i++) {
var char = this.charCodeAt(i);
hash = ((hash<<5)-hash)+char;
hash = hash & hash; // Convert to 32bit integer
}
return hash;
}
One line that doesn't make sense to me is hash = ((hash<<5)-hash)+char;
Could someone please explain WHY this is done? I gather that we are doing a 5 bit left shift
on the hash. Is there any reason why it is 5 bits and not 4 or 6? Also why do we then minus the hash and add the char?
A popular answer for generating a hashing function in JS is given in Simple (non-secure) hash function for JavaScript? and Generate a Hash from string in Javascript
One example of the code is:
String.prototype.hashCode = function() {
var hash = 0;
if (this.length == 0) {
return hash;
}
for (var i = 0; i < this.length; i++) {
var char = this.charCodeAt(i);
hash = ((hash<<5)-hash)+char;
hash = hash & hash; // Convert to 32bit integer
}
return hash;
}
One line that doesn't make sense to me is hash = ((hash<<5)-hash)+char;
Could someone please explain WHY this is done? I gather that we are doing a 5 bit left shift
on the hash. Is there any reason why it is 5 bits and not 4 or 6? Also why do we then minus the hash and add the char?
- Pretty sure it doesn't matter - it's just an example of one way to transform a string of characters into a hash. You could use any algorithm you like, using whatever shifts, additions, subtractions, etc you feel like. – CertainPerformance Commented Aug 22, 2018 at 5:27
1 Answer
Reset to default 8(hash << 5)
is (hash * 32)
, so ((hash << 5) - hash)
is (hash * 31)
. And the reason for multiplication by 31 is described in the answers to question
Why does Java's hashCode() in String use 31 as a multiplier?
So, if this is changed to (hash * 31)
, the result is the same. Possibly (hash << 5) - hash
is slightly faster, as shift / subtraction might be faster than multiplication. However, if that's really the case depends on many factors (whether JIT pilation is used, and optimizations in the JIT, and even the processor). So I assume the author of the code tested it, and found it to be faster for his case.