javascript - Distribution of Number of Digits of Random Numbers

I encounter this curious phenomenon trying to implement a UUID generator in JavaScript.

Basically, in JavaScript, if I generate a large list of random numbers with the built-in Math.random() on Node 4.2.2:

var records = {};
var l;
for (var i=0; i < 1e6; i += 1) {
  l = String(Math.random()).length;
  if (records[l]) {
    records[l] += 1;
  } else {
    records[l] = 1;
  }
}
console.log(records);

The numbers of digits have a strange pattern:

{ '12': 1,
  '13': 11,
  '14': 65,
  '15': 663,
  '16': 6619,
  '17': 66378,
  '18': 611441,
  '19': 281175,
  '20': 30379,
  '21': 2939,
  '22': 282,
  '23': 44,
  '24': 3 }

I thought this is a quirk of the random number generator of V8, but similar pattern appears in Python 3.4.3:

12 : 2
13 : 5
14 : 64
15 : 672
16 : 6736
17 : 66861
18 : 610907
19 : 280945
20 : 30455
21 : 3129
22 : 224

And the Python code is as follows:

import random
random.seed()
records = {}
for i in range(0, 1000000):
    n = random.random()
    l = len(str(n))
    try:
        records[l] += 1
    except KeyError:
        records[l] = 1;

for i in sorted(records):
    print(i, ':', records[i])

The pattern from 18 to below is expected: say if random number should have 20 digits, then if the last digit of a number is 0, it effectively has only 19 digits. If the random number generator is good, the probability of that happening is roughly 1/10.

But why the pattern is reversed for 19 and beyond?

I guess this is related to float numbers' binary representation, but I can't figure out exactly why.

I encounter this curious phenomenon trying to implement a UUID generator in JavaScript.

Basically, in JavaScript, if I generate a large list of random numbers with the built-in Math.random() on Node 4.2.2:

var records = {};
var l;
for (var i=0; i < 1e6; i += 1) {
  l = String(Math.random()).length;
  if (records[l]) {
    records[l] += 1;
  } else {
    records[l] = 1;
  }
}
console.log(records);

The numbers of digits have a strange pattern:

{ '12': 1,
  '13': 11,
  '14': 65,
  '15': 663,
  '16': 6619,
  '17': 66378,
  '18': 611441,
  '19': 281175,
  '20': 30379,
  '21': 2939,
  '22': 282,
  '23': 44,
  '24': 3 }

I thought this is a quirk of the random number generator of V8, but similar pattern appears in Python 3.4.3:

12 : 2
13 : 5
14 : 64
15 : 672
16 : 6736
17 : 66861
18 : 610907
19 : 280945
20 : 30455
21 : 3129
22 : 224

And the Python code is as follows:

import random
random.seed()
records = {}
for i in range(0, 1000000):
    n = random.random()
    l = len(str(n))
    try:
        records[l] += 1
    except KeyError:
        records[l] = 1;

for i in sorted(records):
    print(i, ':', records[i])

But why the pattern is reversed for 19 and beyond?

I guess this is related to float numbers' binary representation, but I can't figure out exactly why.

Share Improve this question edited Nov 7, 2015 at 12:21 asked Nov 7, 2015 at 12:06 Andreas Blaesus 2632 silver badges8 bronze badges

1 Some off-topic things: Another way to get that dict: __import__('collections').Counter(len(str(__import__('random').random())) for i in range(0, 1000000)). – Remi Guan Commented Nov 7, 2015 at 12:27
1 if you look at Math.random().toString(2).length you'll see a different pattern, perhaps more to what you'd expect – Jaromanda X Commented Nov 7, 2015 at 12:53
1 @KevinGuan And the JavaScript ES6 counterpart (sortof): Array.apply(null, Array(1e5)).map(() => Math.random()).reduce((records, n) => records[String(n).length] ? (records[String(n).length] += 1, records) : (records[String(n).length] = 1, records), {}) – Andreas Blaesus Commented Nov 7, 2015 at 12:57
A JS version shorter than the previous: Array.apply(null, Array(1e5)).map(() => Math.random()).reduce((records, n) => (records[String(n).length] = records[String(n).length] + 1 || 1, records), {}) – Andreas Blaesus Commented Nov 7, 2015 at 13:13
@Jaromanda X, binary representation will show a similar effect. It has to do with the difference between number of significant digits and the number of digits needed to represent tiny numbers without using exponents. – trincot Commented Nov 7, 2015 at 13:26

| Show 3 more ments

2 Answers 2

Sorted by: Reset to default 8

The reason is indeed related to floating point representation. A floating point number representation has a maximum number of (binary) digits it can represent, and a limited exponent value range. Now when you print this out without using scientific notation, you might in some cases need to have some zeroes after the decimal point before the significant digits start to follow.

You can visualize this effect by printing those random numbers which have the longest length when converted to string:

var records = {};
var l, r;
for (var i=0; i < 1e6; i += 1) {
    r = Math.random();
    l = String(r).length;
    if (l === 23) {
        console.log(r);
    }
    if (records[l]) {
        records[l] += 1;
    } else {
        records[l] = 1;
    }
}

This prints only the 23-long strings, and you will get numbers like these:

0.000007411070483631654
0.000053944830052166104
0.000018188989763578967
0.000029525788901141325
0.000009613635131744402
0.000005937417234758158
0.000021099748521158368

Notice the zeroes before the first non-zero digit. These are actually not stored in the number part of a floating point representation, but implied by its exponent part.

If you were to take out the leading zeroes, and then make a count:

var records = {};
var l, r, s;
for (var i=0; i < 1e6; i += 1) {
    r = Math.random();
    s = String(r).replace(/^[0\.]+/, '');
    l = s.length;

    if (records[l]) {
        records[l] += 1;
    } else {
        records[l] = 1;
    }
}

... you'll get results which are less strange.

However, you will see some irregularity that is due to how javascript converts tiny numbers to string: when they get too small, the scientific notation is used in the string representation. You can see this with the following script (not sure if every browser has the same breaking point, so maybe you need to play a bit with the number):

var i = 0.00000123456789012345678;
console.log(String(i), String(i/10));

This gives me the following output:

0.0000012345678901234567 1.2345678901234568e-7

So very small numbers will get a more fixed string length as a result, quite often 22 characters, while in the non-scientific notation a length of 23 is mon. This influences also the second script I provided and length 22 will get more hits than 23.

It should be noted that javascript does not switch to scientific notation when converting to string in binary representation:

var i = 0.1234567890123456789e-120;
console.log(i.toString(2));

The above will print a string of over 450 binary digits!

It's because some of the values are like this:

0.00012345...

And thus they're longer.

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

javascript - Distribution of Number of Digits of Random Numbers - Stack Overflow

2 Answers 2

与本文相关的文章

评论列表(0)