Identical hashes for a guid in both C# and javascript

I have a problem where I need to be able to generate identical, uniformly distributed numeric hashes for a GUID in both JavaScript and C#. I guess that would prevent me from using Guid.GetHashCode() in C#, since I can't reproduce its behavior in JS without reverse engineering the C#.

Is there a fast way to produce hashes from guids/strings in JS? Are all digits of the string uniformly distributed in a .NET generated GUID? Should I just cast/convert the trailing chars into an int?

asked Aug 18, 2010 at 5:44 by Andrew Matthews


3 Answers


The bytes are apparently not evenly distributed.

I put together some code to sample the .NET Guids and plot the distribution:

First, the test code. It creates one million Guids and counts how often each value occurs at each position in the byte array. It then writes the counts out as a matrix that I plot in Scilab.

int[,] counter = new int[16, 256];
for (int i = 0; i < 1000000; i++)
{
    var g = Guid.NewGuid();
    var bytes = g.ToByteArray();
    for (int idx = 0; idx < 16; idx++)
    {
        counter[idx, bytes[idx]]++;
    }
}
StringBuilder sb = new StringBuilder();
sb.AppendLine("x = [");
for (int idx = 0; idx < 16; idx++)
{
    for (int b = 0; b < 256; b++)
    {
        sb.Append(counter[idx, b]);
        if (b != 255) // separate values within a row, no trailing space
        {
            sb.Append(" ");
        }
    }
    if (idx != 15)
    {
        sb.AppendLine(";");
    }
}
sb.AppendLine("]");

File.WriteAllText("plot.sce", sb.ToString());

Here are the distributions. Each graph plots how often each distinct value occurs at the given positions in the byte array:

The value distribution for positions 0-6 in the byte array:

The value distribution for position 7 in the byte array:

The value distribution for position 8 in the byte array:

The value distribution for positions 9-15 in the byte array:

For byte positions 0-6 and 9-15 the distribution of values seems to be even, but for byte positions 7 and 8 the range of values is fairly limited.

That is, for the following Guid (the byte positions are marked below the hex digits they start at; note the strange ordering):

{1369ea05-b9f9-408b-ac7c-7ebd0f35d562}
                         1 1 1 1 1 1
 3 2 1 0  5 4  7 6  8 9  0 1 2 3 4 5

The position 7 can take the values from 64 (0x40) to 79 (0x4F).
The position 8 can take the values from 128 (0x80) to 191 (0xBF).
The rest of the bytes are evenly distributed.
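
The two restricted ranges are consistent with Guid.NewGuid() producing version 4 (random) UUIDs as laid out in RFC 4122: the high nibble of position 7 carries the version number 4, and the top two bits of position 8 carry the variant bits 10. The JavaScript sketch below maps a GUID string onto the byte order shown above and checks those two positions. The names DOTNET_BYTE_ORDER and guidToByteArray are made up for this illustration.

```javascript
// Index of the hex pair (in string order) that ends up at each position of
// the .NET byte array: the first three groups are stored little-endian.
const DOTNET_BYTE_ORDER = [3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15];

// Hypothetical helper: GUID string -> array of 16 byte values, in the same
// order as the byte positions in the diagram above.
function guidToByteArray(guid) {
  const hex = guid.replace(/[{}-]/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error("Not a GUID: " + guid);
  }
  return DOTNET_BYTE_ORDER.map(
    (pair) => parseInt(hex.slice(pair * 2, pair * 2 + 2), 16)
  );
}

const bytes = guidToByteArray("{1369ea05-b9f9-408b-ac7c-7ebd0f35d562}");

// Position 7: the high nibble is the version (4), so the byte is 0x40..0x4F.
const version = bytes[7] >> 4;
// Position 8: the top two bits are the variant (binary 10), so 0x80..0xBF.
const variantBits = bytes[8] >> 6;

console.log(bytes[7].toString(16), bytes[8].toString(16)); // prints "40 ac"
console.log(version, variantBits.toString(2));             // prints "4 10"
```

That leaves 4 random bits in position 7 and 6 random bits in position 8, which matches the 16 and 64 distinct values observed.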

Note: The tests were run on .NET 4 on a 32-bit Windows 7 machine.

Lesson: don't assume stuff, test.

Answer: To use the .NET Guids for calculating your load balancing, you can use any part except the positions marked 7 and 8 in the Guid above.
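
As a minimal sketch of that advice on the JavaScript side, the last 8 hex digits of the string (positions 12-15 in the byte array) can be read as an unsigned 32-bit number and reduced to a bucket. This assumes the Guids come from Guid.NewGuid() as tested above; guidTailHash and guidBucket are hypothetical names, not part of any library.

```javascript
// Hypothetical helper: hash a GUID string by its last 8 hex digits,
// which correspond to byte positions 12-15 (evenly distributed in the test).
function guidTailHash(guid) {
  const hex = guid.replace(/[{}-]/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error("Not a GUID: " + guid);
  }
  // 8 hex digits fit in an unsigned 32-bit integer, which a JS number
  // represents exactly.
  return parseInt(hex.slice(24), 16);
}

// Hypothetical helper: map the hash onto one of bucketCount buckets.
function guidBucket(guid, bucketCount) {
  return guidTailHash(guid) % bucketCount;
}

console.log(guidBucket("{1369ea05-b9f9-408b-ac7c-7ebd0f35d562}", 4));
```

On the C# side the same number should come out of combining positions 12 to 15 of ToByteArray(), most significant byte first, into a uint, since those trailing bytes appear in the string in array order.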

Question: Does anybody know WHY the distribution is not evenly spread?

You can create a web service that generates the hash value on the server side, in whatever language you want. On the client side, a simple web service call will do the trick.

Reflector says the .NET Guid.GetHashCode() is implemented like this:

public override int GetHashCode()
{
    return ((this._a ^ ((this._b << 0x10) | ((ushort) this._c))) ^ ((this._f << 0x18) | this._k));
}

The fields _a, _b, _c, _f and _k are set in the constructor that takes a byte[16] array:

public Guid(byte[] b)
{
    if (b == null)
    {
        throw new ArgumentNullException("b");
    }
    if (b.Length != 0x10)
    {
        throw new ArgumentException(Environment.GetResourceString("Arg_GuidArrayCtor", new object[] { "16" }));
    }
    this._a = (((b[3] << 0x18) | (b[2] << 0x10)) | (b[1] << 8)) | b[0];
    this._b = (short) ((b[5] << 8) | b[4]);
    this._c = (short) ((b[7] << 8) | b[6]);
    this._d = b[8];
    this._e = b[9];
    this._f = b[10];
    this._g = b[11];
    this._h = b[12];
    this._i = b[13];
    this._j = b[14];
    this._k = b[15];
}
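
Going by the two listings above, the hash could be ported to JavaScript fairly directly, because JavaScript's bitwise operators also work on 32-bit signed integers. The sketch below is only a port of the Reflector output shown here, not a documented contract: GetHashCode() is an implementation detail and may differ between .NET versions. The helper names parseGuidBytes and dotNetGuidHash are invented for this example.

```javascript
// Hypothetical helper: GUID string -> 16 byte values in the order the
// byte[16] constructor above expects (first three groups little-endian).
function parseGuidBytes(guid) {
  const hex = guid.replace(/[{}-]/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error("Not a GUID: " + guid);
  }
  const order = [3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15];
  return order.map((pair) => parseInt(hex.slice(pair * 2, pair * 2 + 2), 16));
}

// Port of the decompiled GetHashCode() shown above. Returns a signed
// 32-bit integer, like the C# int.
function dotNetGuidHash(guid) {
  const b = parseGuidBytes(guid);
  const fieldA = (b[3] << 24) | (b[2] << 16) | (b[1] << 8) | b[0]; // _a
  // _b is a short in C#; its sign extension is shifted out by << 16,
  // so the unsigned 16-bit value gives the same result here.
  const fieldB = (b[5] << 8) | b[4];
  const fieldC = (b[7] << 8) | b[6]; // (ushort) _c
  const fieldF = b[10];              // _f
  const fieldK = b[15];              // _k
  return (fieldA ^ ((fieldB << 16) | fieldC)) ^ ((fieldF << 24) | fieldK);
}

const hash = dotNetGuidHash("{1369ea05-b9f9-408b-ac7c-7ebd0f35d562}");
console.log(hash, (hash >>> 0).toString(16));
```

Because the result is signed, apply `>>> 0` before taking a modulo if a non-negative bucket index is needed, and compare against real C# output on the target runtime before relying on it.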


