Help! I'm learning to love JavaScript after programming in C# for quite a while, but I'm stuck learning to love the iterable protocol!
Why did JavaScript adopt a protocol that requires creating a new object for each iteration? Why have next() return a new object with properties done and value instead of adopting a protocol like C#'s IEnumerable and IEnumerator, which allocates no object at the expense of requiring two calls (one to moveNext to see if the iteration is done, and a second to current to get the value)?
Are there under-the-hood optimizations that skip the allocation of the object returned by next()? Hard to imagine, given the iterable doesn't know how the object could be used once returned...
Generators don't seem to reuse the result object, as illustrated below:
function* generator() {
  yield 0;
  yield 1;
}
var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();
console.log(result0.value); // 0
console.log(result1.value); // 1
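To check this directly rather than by eye, comparing two results with strict equality should be enough. A minimal sketch (the names countToTwo, it, first, second and sameObject are only illustrative):

```javascript
// Generator objects hand out a fresh result object on every next() call.
function* countToTwo() {
  yield 0;
  yield 1;
}

const it = countToTwo();
const first = it.next();
const second = it.next();

// Two distinct objects, so holding on to `first` is safe.
const sameObject = first === second;
console.log(sameObject);  // false
console.log(first.value); // 0, unchanged by the second call
```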
Hm, here's a clue (thanks to Bergi!):
We will answer one important question later (in Sect. 3.2): Why can iterators (optionally) return a value after the last element? That capability is the reason for elements being wrapped. Otherwise, iterators could simply return a publicly defined sentinel (stop value) after the last element.
And in Sect. 3.2 they discuss using generators as lightweight threads. That seems to say the reason for returning an object from next is so that a value can be returned even when done is true! Whoa. Furthermore, generators can return values in addition to yield-ing and yield*-ing values, and a value generated by return ends up in value when done is true!
And all this allows for pseudo-threading. And that feature, pseudo-threading, is worth allocating a new object each time around the loop... JavaScript. Always so unexpected!
Although, now that I think about it, allowing yield* to "return" a value to enable pseudo-threading still doesn't justify returning an object. The IEnumerator protocol could be extended to return an object after moveNext() returns false: just add a property hasCurrent to test after the iteration is complete that, when true, indicates current has a valid value...
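To make that idea concrete, here is a rough sketch of what such a cursor could look like when layered over a standard JS iterator. Everything here is hypothetical: toEnumerator is a made-up helper, and moveNext, current and hasCurrent are just the names used above, not an existing JavaScript API. It also assumes a return value of undefined means "no return value".

```javascript
// Hypothetical adapter exposing a C#-style cursor over a standard JS iterator.
function toEnumerator(iterator) {
  const enumerator = {
    current: undefined,
    hasCurrent: false,
    moveNext() {
      const result = iterator.next();
      enumerator.current = result.value;
      // After completion, hasCurrent is true only if a return value was supplied
      // (assumption: undefined is treated as "nothing returned").
      enumerator.hasCurrent = !result.done || result.value !== undefined;
      return !result.done;
    },
  };
  return enumerator;
}

function* numbers() {
  yield 0;
  return 1;
}

const e = toEnumerator(numbers());
const seen = [];
while (e.moveNext()) seen.push(e.current);
console.log(seen);                    // [ 0 ]
console.log(e.hasCurrent, e.current); // true 1
```

Note that this adapter still consumes result objects internally, so it only illustrates the shape of the interface, not the allocation savings.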
And the compiler optimizations are non-trivial. This will result in quite wild variance in the performance of an iterator... doesn't that cause problems for library implementors?
All these points are raised in this thread discovered by the friendly SO community. Yet those arguments didn't seem to carry the day.
However, regardless of returning an object or not, no one is going to be checking for a value after iteration is "complete", right? E.g. most everyone would think the following would log all values returned by an iterator:
function logIteratorValues(iterator) {
  var next;
  while (next = iterator.next(), !next.done)
    console.log(next.value);
}
Except it doesn't, because even though done is true the iterator might still have returned another value. Consider:
function* generator() {
  yield 0;
  return 1;
}
var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();
console.log(`${result0.value}, ${result0.done}`); // 0, false
console.log(`${result1.value}, ${result1.done}`); // 1, true
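For what it's worth, the built-in consumers treat that final value differently. A small sketch (inner, outer, looped and spread are illustrative names): for...of and spread ignore whatever accompanies done: true, while yield* evaluates to it.

```javascript
function* inner() {
  yield 0;
  return 1; // delivered together with done: true
}

// for...of stops at done: true and never looks at the accompanying value.
const looped = [];
for (const v of inner()) looped.push(v);

// yield* evaluates to the delegate's return value.
function* outer() {
  const returned = yield* inner();
  yield `inner returned ${returned}`;
}

const spread = [...outer()];
console.log(looped); // [ 0 ]
console.log(spread); // [ 0, 'inner returned 1' ]
```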
Is an iterator that returns a value after it's "done" really an iterator? What is the sound of one hand clapping? It just seems quite odd...
And here is an in-depth post on generators I enjoyed. Much time is spent controlling the flow of an application as opposed to iterating members of a collection.
Another possible explanation is that IEnumerable/IEnumerator requires two interfaces and three methods and the JS community preferred the simplicity of a single method. That way they wouldn't have to introduce the notion of groups of symbolic methods aka interfaces...
edited Jan 4, 2019 at 21:46; asked Dec 22, 2018 at 9:52 by Christopher King
- Can you link to the spec where it says that a new object needs to be returned? – Felix Kling Commented Dec 22, 2018 at 10:10
- You'll likely not get an answer about a specific language design decision here, since the people who work on the spec are not here. You should reach out to them directly. – Felix Kling Commented Dec 22, 2018 at 10:11
- @Bergi: Actually, that only describes the behavior of built-in iterators. The protocol itself doesn't seem to require a new object in each iteration. – Felix Kling Commented Dec 22, 2018 at 10:24
- FWIW, here is an example that reuses the result object: jsfiddle/wp82n07o . The specification of the protocol doesn't seem to require that a different object is returned in each iteration (as far as I can see). So it seems you can get away with only allocating one. However, as I have mentioned before, I would reach out to the people from the TC39 committee if you want clarification on that. – Felix Kling Commented Dec 22, 2018 at 10:33
- @FelixKling Here's some discussion: esdiscuss/topic/…, esdiscuss/topic/iterator-next-method-returning-new-object. Also I found that reusing the object makes escape analysis harder for the compiler... – Bergi Commented Dec 22, 2018 at 10:45
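To illustrate the point made in the comments about reusing the result object, here is a minimal sketch of a hand-written iterator that allocates a single result and mutates it on every call. This is not the content of the linked fiddle; range, collected, shared, a and b are illustrative names.

```javascript
// A hand-written iterable whose iterator mutates and returns one shared result object.
function range(start, end) {
  return {
    [Symbol.iterator]() {
      let current = start;
      const result = { value: undefined, done: false }; // allocated once per iterator
      return {
        next() {
          if (current < end) {
            result.value = current++;
          } else {
            result.value = undefined;
            result.done = true;
          }
          return result;
        },
      };
    },
  };
}

// Built-in consumers read value/done right away, so they are unaffected.
const collected = [...range(0, 3)];
console.log(collected); // [ 0, 1, 2 ]

// Caveat: callers that keep an earlier result see it change underneath them.
const shared = range(0, 3)[Symbol.iterator]();
const a = shared.next();
const b = shared.next();
console.log(a === b, a.value); // true 1
```

The caveat at the end is the trade-off: the sketch saves allocations, but any caller that stores results between calls will observe the mutation.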
3 Answers
Are there under-the-hood optimizations that skip the allocation of the object returned by next()?
Yes. Those iterator result objects are small and usually short-lived. Particularly in for … of loops, the compiler can do a trivial escape analysis to see that the object doesn't face the user code at all (but only the internal loop evaluation code). They can be dealt with very efficiently by the garbage collector, or even be allocated directly on the stack.
Here are some sources:
- JS inherits its functionally-minded iteration protocol from Python, but with result objects instead of the previously favoured StopIteration exceptions
- Performance concerns in the spec discussion (cont'd) were shrugged off. If you implement a custom iterator and it is too slow, try using a generator function
- (At least for builtin iterators) these optimisations are already implemented:
The key to great performance for iteration is to make sure that the repeated calls to iterator.next() in the loop are optimized well, and ideally completely avoid the allocation of the iterResult using advanced compiler techniques like store-load propagation, escape analysis and scalar replacement of aggregates. To really shine performance-wise, the optimizing compiler should also completely eliminate the allocation of the iterator itself - the iterable[Symbol.iterator]() call - and operate on the backing-store of the iterable directly.
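To see why the escape analysis mentioned above can be easy for for … of, it may help to look at roughly what the loop expands to. This is a simplified sketch, not the spec algorithm (forOf and out are made-up names, and the iterator.return() call made on early exit is left out):

```javascript
// Simplified expansion of `for (const value of iterable) body(value)`.
// It omits the iterator.return() call the real loop makes on early exit.
function forOf(iterable, body) {
  const iterator = iterable[Symbol.iterator]();
  while (true) {
    const result = iterator.next(); // only this function ever sees `result`
    if (result.done) break;
    body(result.value);             // user code receives the value, not the wrapper
  }
}

const out = [];
forOf(new Set(['a', 'b']), (v) => out.push(v));
console.log(out); // [ 'a', 'b' ]
```

Since the result object is read twice and then dropped without ever being handed to user code, an optimizing compiler that inlines next() has a good chance of never materializing it.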
Bergi answered already, and I've upvoted; I just want to add this:
Why should you even be concerned about a new object being returned? It looks like:
{done: boolean, value: any}
You know, you are going to use the value anyway, so it's really not an extra memory overhead. What's left? done: boolean and the object itself take up to 8 bytes each, which is the smallest addressable memory possible and must be processed by the cpu and allocated in memory in a few pico- or nanoseconds (I think it's pico- given the likely-existing v8 optimizations). Now if you still care about wasting that amount of time and memory, then you really should consider switching to something like Rust+WebAssembly from JS.
The iterator protocol using result objects may be one cause for native Iterator helpers (currently implemented only in Google Chrome >= v122) being slower than my own implementation which internally uses a "direct" interface:
// Sentinel returned by _next() once the iterator is exhausted
const DONE = Symbol.for('done');
interface DirectIterator<T, TReturn = any> {
  _next(): T | typeof DONE;
  _return?(value?: TReturn): TReturn;
}
... and optionally reuses the result object. The following benchmark suite demonstrates it. "Native" benchmarks are only included if globalThis.Iterator exists: https://jsfiddle/AndiTR/5oz9aw8y/latest
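For readers who want a feel for the "direct" style, here is a minimal JavaScript sketch of a sentinel-based pipeline. It is not the implementation benchmarked above: directRange and directMap are hypothetical helpers, and only the _next method and the Symbol.for('done') sentinel are taken from the interface.

```javascript
// Sentinel signalling exhaustion; no wrapper object is needed per step.
const DONE = Symbol.for('done');

// Hypothetical source: yields start, start + 1, ..., end - 1, then DONE.
function directRange(start, end) {
  let current = start;
  return {
    _next() {
      return current < end ? current++ : DONE;
    },
  };
}

// Hypothetical helper: applies fn to each value and passes DONE through.
function directMap(source, fn) {
  return {
    _next() {
      const v = source._next();
      return v === DONE ? DONE : fn(v);
    },
  };
}

const doubled = [];
const it = directMap(directRange(0, 3), (n) => n * 2);
for (let v = it._next(); v !== DONE; v = it._next()) doubled.push(v);
console.log(doubled); // [ 0, 2, 4 ]
```

The trade-off is the one raised in the question: a sentinel leaves no room for a separate return value, and the sentinel itself can never be produced as an ordinary element.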