This question relates to the Excel function INDEX()
, which is documented by Microsoft here.
If I have an Excel range in, say, A1:B2:
A | B |
---|---|
1 | 2 |
3 | 4 |
This question relates to the Excel function INDEX()
, which is documented by Microsoft here.
If I have an Excel range in, say, A1:B2:
A | B |
---|---|
1 | 2 |
3 | 4 |
and I use the formula
=INDEX(A1:B2, 0, 0)
then the full range is returned to the worksheet (as a dynamic array). This agrees with the documentation.
If the final argument, col_num
, is missing:
=INDEX(A1:B2, 0, )
then the result is the same, again matching the documentation.
But if col_num
is omitted:
=INDEX(A1:B2, 0)
then a #REF!
error is returned.
However, if the first argument to INDEX()
is changed from a range reference to the equivalent literal array:
=INDEX({1, 2; 3, 4}, 0)
then the full range is returned again.
(Edit for clarity: The value of the second argument, row_num
, is not important, as long as it is not negative or out of range; the behaviour described above is exhibited for any valid value of row_num
, including zero or 'missing'.)
The #REF!
error only occurs if col_num
is omitted and the input reference is two-dimensional (i.e. has multiple rows and multiple columns). If, instead, the reference has only one row and/or only one column, the behaviour reverts to returning the full range, matching the behaviour for a literal array.
I cannot find this apparent anomaly documented anywhere. Moreover, this is the only discrepancy I can find between the two cases of the first argument to INDEX()
being a reference or a literal array; in all other scenarios, the behaviour seems identical between the two.
Is this discrepancy deliberate? Is there any documentation on this behaviour that I have missed?
Share Improve this question edited Jan 30 at 14:17 Neil T asked Jan 30 at 12:45 Neil TNeil T 3,2852 gold badges17 silver badges29 bronze badges 11 | Show 6 more comments2 Answers
Reset to default 5With some help from @Rory (many thanks), I think I can now explain this behaviour.
The key things to understand are:
In Excel, some function parameters can accept either reference or array arguments. The behaviour of the function may differ depending on whether a reference or an array is passed.
INDEX()
is such a function. The documentation refers to the "array form" and the "reference form". The syntax is almost identical (except for the rare "reference form" case where multiple ranges are passed).The documentation states that the
array
parameter of the "array form" can accept "a range of cells or an array constant". Crucially, "range of cells" here does not mean a reference to a cell range on a worksheet such asA1:B2
(this would be a reference). Rather, it means an in-memory cell range, as might be returned by a nested function call to an outer function (examples below).Therefore,
INDEX(A1:B2, ...)
always calls the "reference form" of theINDEX()
fucntion, never the "array form".As pointed out by @Rory in the comments, there is a minor difference in the documentation between the two forms: in the "array form", the
column_num
parameter is optional for a 2D range ("2D" meaning > 1 row, > 1 column). This is not the case for the "reference form".
This explains why, in this unique case of a 2D reference with an omitted column_num
, a #REF!
error is returned, whereas the equivalent call on a literal array (or an in-memory cell range) succeeds because the "array form" supports this.
Examples:
=INDEX(A1:B2, 0) : fails (reference form; col_num is required for a 2D reference)
=INDEX(A1:B2, 0, ) : succeeds (reference form; the comma means that col_num is present, even though 'missing')
=INDEX({1, 2; 3, 4}, 0) : succeeds (array form; col_num is optional for a 2D array)
=INDEX(INDEX(A1:B2, 0, 0), 0) : fails (a _reference_ is passed to the outer INDEX() call)
=INDEX(SQRT(INDEX(A1:B2, 0, 0)), 0) : succeeds (an _array_ is passed to the outer INDEX() call)
This last example is the "range of cells" case: the inner INDEX()
function returns a reference, which is processed by the SQRT()
function, returning an array ("range of cells"), which then calls the "array form" of the outer INDEX()
function.
I still do not understand why this difference is necessary; I can see no reason why the "reference form" should not behave identically to the array form in this regard. Presumably there is a reason, which I would love to hear if anyone knows it!
It's special behavior depending on the number of arguments. If the argument is a vector (1xN or Nx1), the second argument is treated as the index regardless of orientation:
So, there are two cases:
- Two arguments - array and index of a vector;
- Three arguments - array, row index, and column index.
UPD
The related feature is the difference of processing references and dynamic arrays by BYROW and BYCOLUMN functions which I researched earlier:
Summary
It looks like a feature we should be aware.
=INDEX(A1:B2, 0)
return something different from=INDEX({1, 2; 3, 4}, 0)
. Your formula is different and doesn't exhibit the anomaly. – Neil T Commented Jan 30 at 13:12row_num
is not important:#REF!
is always returned whenINDEX()
is applied to a 2D reference with an omittedcol_num
, whereas the equivalent call on the literal array{1, 2; 3, 4}
always succeeds (unlessrow_num
is negative or out of range). This also applies ifrow_num
has the "Missing" value, as in your example, in which case therow_num
parameter takes its default value0
. (I've edited the question to clarify that point.) – Neil T Commented Jan 30 at 14:14