Why doesn't np.vectorize work for Awkward Arrays?

I have a simple vectorized function (my real case is more complicated):

import awkward as ak
import numpy as np

@np.vectorize
def f(x):
    if x > 0:
        return 1
    else:
        return -1

Why doesn't it work with Awkward?

x = ak.Array([[], [10], [40, 50, 60]])
f(x)

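ValueError: cannot convert to RegularArray because subarray lengths are not regular (in compiled code: https://github.com/scikit-hep/awkward/blob/awkward-cpp-44/awkward-cpp/src/cpu-kernels/awkward_ListOffsetArray_toRegularArray.cpp#L22)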

I understand the message, but I don't see why irregular lengths should be a problem.

I guess a workaround is to flatten:

y = f(ak.flatten(x))
y = ak.unflatten(y, ak.num(x))

Asked Mar 31 at 14:30 by Ruggero Turra; edited Mar 31 at 21:10.

1 Answer

It looks like the function made with @np.vectorize is not a true ufunc, in the sense that it does not call __array_ufunc__ on arguments that are not NumPy arrays. Instead, it tries to cast its arguments to NumPy arrays, which can't be done if they're ragged.
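
To illustrate the protocol with NumPy alone, here is a minimal sketch. The `Ragged` class is a hypothetical toy container written for this example, not how Awkward Array is implemented, and `sign_like` just mirrors the `f` from the question.

```python
import numpy as np


class Ragged:
    """Hypothetical toy container of variable-length rows (not Awkward's implementation)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Handle only the simplest case: a plain call with this object as the sole input.
        if method != "__call__" or len(inputs) != 1 or kwargs:
            return NotImplemented
        # Apply the ufunc row by row, so the rows never need equal lengths.
        return Ragged(
            [ufunc(np.asarray(row, dtype=np.int64)).tolist() for row in self.rows]
        )


def sign_like(x):
    return 1 if x > 0 else -1


ragged = Ragged([[], [10], [-40, -50, 60]])

# A true ufunc hands control to the argument's __array_ufunc__.
is_sign_ufunc = isinstance(np.sign, np.ufunc)
result = np.sign(ragged)

# The object returned by np.vectorize is not a ufunc, so no such dispatch happens.
is_vectorized_ufunc = isinstance(np.vectorize(sign_like), np.ufunc)

print(is_sign_ufunc, result.rows, is_vectorized_ufunc)
```

Here np.sign never tries to build a rectangular array from the ragged rows, because the container takes over the call.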

However, Numba's @nb.vectorize does create a ufunc that obeys this protocol, so I would suggest a one-character change:

>>> import awkward as ak
>>> import numba as nb
>>> @nb.vectorize
... def f(x):
...     if x > 0:
...         return 1
...     else:
...         return -1
... 
>>> x = ak.Array([[], [10], [-40, -50, 60]])
>>> f(x)
<Array [[], [1], [-1, -1, 1]] type='3 * var * int64'>

The downside is that you need another library, Numba. The upside is that this vectorized function is actually compiled, whereas @np.vectorize is not. Incidentally, this very issue was the original motivation for Numba (video): @np.vectorize looks like it's going to "vectorize" your function (in the sense of using a compiled or purely numerical implementation), but it doesn't.

By the way, I was just assuming that this f is an example and you have another function in mind. If you really want the above, you could do

>>> np.sign(x)
<Array [[], [1], [-1, -1, 1]] type='3 * var * int64'>

and no new libraries are involved. (One difference: np.sign returns 0 for x == 0, whereas f returns -1.)

Why np.vectorize doesn't work for awkward arrays? - Stack Overflow