最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

javascript - Why is using find() on these arrays so slow when within a for loop? - Stack Overflow

programmeradmin1浏览0评论

Background: I came across this issue whilst dealing with arrays of large and small numbers using the findIndex function. I have given a minimum working example below. I can avoid the issue, but I just don't understand why the issue exists in the first place.

In node.js (v12.16.3), why does getting rid of the for loop around the find function in this example cause the performance to increase dramatically? (5600 ms reduced to 250 ms)

The issue does not appear if I change the value in the second array from 1e10 to 1e9 or less, or if I change the value in the first array from 1 to 1e10 or more.

const nSims = 1e8

const arrays = [];
arrays[0] = [1];
arrays[1] = [1e10];

console.time('a')
for (var i = 0; i < nSims; i++) {
    for (var j = 0; j < 2; j++) {
        arrays[j].find((value) => value > 0);
    }
}
console.timeEnd('a') // 5600 ms

console.time('b')
for (var i = 0; i < nSims; i++) {
    arrays[0].find((value) => value > 0);
    arrays[1].find((value) => value > 0);
}
console.timeEnd('b') // 250 ms

Background: I came across this issue whilst dealing with arrays of large and small numbers using the findIndex function. I have given a minimum working example below. I can avoid the issue, but I just don't understand why the issue exists in the first place.

In node.js (v12.16.3), why does getting rid of the for loop around the find function in this example cause the performance to increase dramatically? (5600 ms reduced to 250 ms)

The issue does not appear if I change the value in the second array from 1e10 to 1e9 or less, or if I change the value in the first array from 1 to 1e10 or more.

const nSims = 1e8

const arrays = [];
arrays[0] = [1];
arrays[1] = [1e10];

console.time('a')
for (var i = 0; i < nSims; i++) {
    for (var j = 0; j < 2; j++) {
        arrays[j].find((value) => value > 0);
    }
}
console.timeEnd('a') // 5600 ms

console.time('b')
for (var i = 0; i < nSims; i++) {
    arrays[0].find((value) => value > 0);
    arrays[1].find((value) => value > 0);
}
console.timeEnd('b') // 250 ms

Share Improve this question edited Aug 23, 2020 at 3:54 Anton 2,7131 gold badge8 silver badges15 bronze badges asked Aug 22, 2020 at 20:11 Alex RossAlex Ross 512 silver badges4 bronze badges 13
  • 3 Let's back up first: what on earth are you trying to actually do here? (and on a practical note, which version of Node? Because there are many) – Mike 'Pomax' Kamermans Commented Aug 22, 2020 at 20:14
  • Cause maybe arrays[0] can be inlined while arrays[j] cannot ... – Jonas Wilms Commented Aug 22, 2020 at 20:15
  • 1 @Mike'Pomax'Kamermans This is just a minimum working example of an issue I noticed whilst dealing with arrays of large numbers and the find function. – Alex Ross Commented Aug 22, 2020 at 20:17
  • 1 @Mike'Pomax'Kamermans It looks like you have misread the example code. This is not iterating over 1e8 array positions... it is iterating over 1 array position (for simplicity). The test is performed 1e8 times in order to show the difference in performance. In practice I am accessing an array of 1000 elements, but adding that in would have made the example more plicated than this minimum working example. – Alex Ross Commented Aug 22, 2020 at 21:07
  • 1 @Mike'Pomax'Kamermans Thanks, I didn't know typed arrays were a thing. I will definitely look into this and keep it in mind in future. I didn't think a nested array would care about the contents of another nested array, but I guess I was wrong about that. – Alex Ross Commented Aug 22, 2020 at 21:25
 |  Show 8 more ments

2 Answers 2

Reset to default 6

V8 developer here.

The "slow case" is the true cost of calling Array.find with a callback: for every element, the built-in implementation of Array.find performs a call to the provided callback. Aside from doing this basic work that you asked it to do, the implementation is actually pretty optimized, both the Array.find built-in and the supplied callback.

The fast case benefits from certain additional optimizations in V8: if a call to Array.find has only ever seen arrays of the same type (including internal representation, see below), then there's some special handling in the type feedback collection system and the optimizing piler to emit a special inlined version of it, which in particular has the follow-on benefit that it can also inline the provided callback, specialized for this type of array. As you can see here, this optimization provides a massive speed boost when it is applicable.

The reason that [1e9] and [1e10] are different types of arrays under the hood is because 1e9 is a 30-bit integer, so V8 internally chooses "small integer" (aka "smi", 31-bit signed int) representation for the array's elements. 1e10, however, would require 34 bits, so V8 chooses double (64-bit floating point) representation for the array elements. So if the same occurrence of Array.find encounters both [1e9] (or [1] for that matter) and [1e10], it decides "I've seen more than one type of array here, inlining more than one special case probably costs more than it's worth, let's use the generic version". You could say that this decision is a bit overly pessimistic in this case, but such is the nature of heuristics: engines need rules to decide what to do, and since they can't predict what your code will do in the future, they just have to make some guess -- which could turn out to be a good guess, or a not so good guess :-)

It's not related to having a loop per se; looping over the list of arrays is just one way of making the same Array.find encounter several array types. You could trigger the fallback to the generic path without a loop by using a function that's called with different inputs; or you could have a loop (that loops over something else) while still staying on the fast path.


@Anton wrote:

It seems, that find method has some problems.

I wouldn't put it that way. It's not easy for an engine to optimize Array.find to the same degree as a hand-written for-loop -- for instance, because an engine generally can't inline user-provided callbacks into built-in functions. As explained above, V8 knows enough tricks to be able to pull off such inlining in some situations, but not always.

This is far from the only case where a hand-written replacement for a built-in function can achieve faster performance; in many cases this is because the built-in functions are more general (i.e.: support more weird corner cases) than the hand-written replacement. It's also the case that outside of targeted microbenchmarks it is fairly rare (though certainly not impossible) to find a case where these differences actually matter.

Note: Maybe this is not the correct answer, but it is just a very big ment (I need code snippets to illustrate).

This is example from the question (it takes more that 5 seconds for the a, and less than second for b):

const nSims = 1e8

const arrays = [];
arrays[0] = [1];
arrays[1] = [1e10];

console.time('a')
for (var i = 0; i < nSims; i++) {
    for (var j = 0; j < 2; j++) {
        arrays[j].find((value) => value > 0);
    }
}
console.timeEnd('a') // 5600 ms

console.time('b')
for (var i = 0; i < nSims; i++) {
    arrays[0].find((value) => value > 0);
    arrays[1].find((value) => value > 0);
}
console.timeEnd('b') // 250 ms

This happens if we change 1e10 to 1e9 ("magic" here):

const nSims = 1e8

const arrays = [];
arrays[0] = [1];
arrays[1] = [1e9];

console.time('a')
for (var i = 0; i < nSims; i++) {
    for (var j = 0; j < 2; j++) {
        arrays[j].find((value) => value > 0);
    }
}
console.timeEnd('a') // 5600 ms

console.time('b')
for (var i = 0; i < nSims; i++) {
    arrays[0].find((value) => value > 0);
    arrays[1].find((value) => value > 0);
}
console.timeEnd('b') // 250 ms

It seems, that find method has some problems. Here what happens if we replace it with for iteration (a and b bees close, and less than 1 second):

const nSims = 1e8

const arrays = [];
arrays[0] = [1];
arrays[1] = [1e10];

function find(arr) {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] > 0) return arr[i];
  }
}

console.time('a')
for (var i = 0; i < nSims; i++) {
    for (var j = 0; j < 2; j++) {
        find(arrays[j]);
    }
}
console.timeEnd('a')

console.time('b')
for (var i = 0; i < nSims; i++) {
    find(arrays[0]);
    find(arrays[1]);
}
console.timeEnd('b')

发布评论

评论列表(0)

  1. 暂无评论