python - Most efficient way to allocate n copies of an array to n-new dimensions?

The task is to create 361x721x11 copies of ll as fast as possible, preferably using NumPy functions. The following code works:

import numpy as np

ll = np.ones((13, 25))  # ones() is for illustration only; my real array isn't just a bunch of ones
ll_ijk = np.repeat(np.repeat(np.repeat(ll[..., None, None, None], 361, axis=2), 721, axis=3), 11, axis=4)

I used nested calls of the repeat function, one for each new dimension, but it took about 4-5 seconds to complete. Is there a more efficient way of doing this?

asked Mar 18 at 23:50 by Researcher R

3 It might be helpful if you explained why you want so many copies of the array. You might be trying to do something that can be done without allocating so much memory. In other words, this might be an XY problem. – jared Commented Mar 19 at 0:03
3 x=np.empty((361,721,11,13,25)) followed by broadcast assignment, x[...]=ll – hpaulj Commented Mar 19 at 0:11
@jared Yes and no. I am taking hydrodynamic code that I had created in Matlab and the method I'm using to solve my problem, while slow, works. So, I have a clear working goal in mind of what to create in one side, and I need to translate it over to the other. For the most part, it moves over pretty easily. There might be packages or techniques that efficiently bypass the need to create such large arrays, but for right now, that is not the goal. Why I posted this was because using repmat in Matlab takes 2 seconds to complete, and in Python, the alternative I found took 4-5 seconds. – Researcher R Commented Mar 19 at 0:30
@hpaulj works, is much cleaner than what I did, and took about 2-3 seconds. Perfect! Thank you! – Researcher R Commented Mar 19 at 0:38
1 Do you really need copies? Can you work with a broadcast view? – Reinderien Commented Mar 19 at 0:59

1 Answer

The result of your code is an array of shape (13, 25, 361, 721, 11) in which each slice ll_ijk[:, :, i, j, k] (for valid indices i, j, and k) has the same shape and the same elements as your original ll.
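To make that description concrete, here is a minimal sketch that checks it on a scaled-down version of the problem. The repeat counts (3, 4, 2) and the arange-based ll are stand-ins I chose so the check runs instantly; they are not the sizes or data from the question.

```python
import numpy as np

# Stand-in data: distinct values so a wrong slice would be detected.
ll = np.arange(13 * 25, dtype=float).reshape(13, 25)

# Hypothetical small repeat counts in place of 361, 721, 11.
n_i, n_j, n_k = 3, 4, 2

# Same nested-repeat construction as in the question.
ll_ijk = np.repeat(
    np.repeat(np.repeat(ll[..., None, None, None], n_i, axis=2), n_j, axis=3),
    n_k,
    axis=4,
)
print(ll_ijk.shape)  # (13, 25, 3, 4, 2)

# Every slice along the three new axes should equal the original array.
all_match = all(
    np.array_equal(ll_ijk[:, :, i, j, k], ll)
    for i in range(n_i)
    for j in range(n_j)
    for k in range(n_k)
)
print(all_match)  # True
```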

If you do not need to modify these values (and just need the array to have a shape compatible with your other arrays), broadcasting should be fine for you. In that case:

shape = (13, 25, 361, 721, 11)
ll_ijk = np.broadcast_to(ll[:, :, None, None, None], shape)

will be your fastest and most memory efficient option by far. Instead of copying the elements, each slice ll_ijk[:, :, i, j, k] refers to your original array ll. As a demonstration, you could change your original array ll, and note that ll_ijk changes, too.

ll[0, 0] = 10
ll_ijk[0, 0, 0, 0, 0]  # np.float64(10.0)
ll_ijk[0, 0, 0, 0, 1]  # np.float64(10.0)
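If you want to confirm that no data is being copied, the following sketch inspects the broadcast result. It assumes float64 data (the default for np.ones); the strides would differ for other dtypes. Note that the array returned by np.broadcast_to is read-only, so this route only works if you never assign into ll_ijk.

```python
import numpy as np

ll = np.ones((13, 25))
shape = (13, 25, 361, 721, 11)
ll_ijk = np.broadcast_to(ll[:, :, None, None, None], shape)

# The three new axes have stride 0: moving along them never leaves
# the memory of the original 13x25 array.
print(ll_ijk.strides)  # (200, 8, 0, 0, 0)
print(np.shares_memory(ll, ll_ijk))  # True

# nbytes reports what a real copy of this shape would occupy (roughly 7.4 GB),
# while the view itself only references the 2600 bytes of ll.
print(ll_ijk.nbytes / 1e9)
print(ll.nbytes)  # 2600

# The view is read-only, so in-place assignment is rejected.
print(ll_ijk.flags.writeable)  # False
try:
    ll_ijk[0, 0, 0, 0, 0] = 5.0
    write_failed = False
except ValueError:
    write_failed = True
print(write_failed)  # True
```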

You seemed to be happy with a solution that resulted in a different output shape in which each slice ll_ijk[i, j, k] is equal to your original ll. In that case, you do not need to add the extra dimensions to ll because they will be prepended automatically according to NumPy's standard broadcasting rules. You would write:

shape = (361, 721, 11, 13, 25)
ll_ijk = np.broadcast_to(ll, shape)

But in that case, you probably wouldn't need the explicit broadcasting with np.broadcast_to. Broadcasting would typically happen automatically whenever you perform an operation with arrays of compatible shapes.
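As a small illustration of that automatic broadcasting, here is a sketch with a made-up second operand. The array `other` and its reduced shape are hypothetical; in your code it would be whichever large array ll has to be combined with.

```python
import numpy as np

rng = np.random.default_rng(0)
ll = rng.random((13, 25))

# Hypothetical operand with ll's axes last (scaled down from 361x721x11).
other = rng.random((3, 4, 2, 13, 25))

# ll's shape (13, 25) lines up with the trailing axes, so no explicit
# broadcast_to call is needed.
implicit = other * ll
explicit = other * np.broadcast_to(ll, other.shape)
print(np.array_equal(implicit, explicit))  # True

# With ll's axes first, the new trailing axes must be added by hand.
other_t = rng.random((13, 25, 3, 4, 2))
implicit_t = other_t * ll[:, :, None, None, None]
print(implicit_t.shape)  # (13, 25, 3, 4, 2)
```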

I found it interesting that on my machine, if a copy were required, the solutions with output shape (13, 25, 361, 721, 11) (your original shape) were much faster than solutions with shape (361, 721, 11, 13, 25). This can probably be explained easily in terms of array contiguity, but it's a bit too late for that sort of thought.

np.tile, which is the function that should be ideal for this purpose, is the slowest of the lot.

ll_ijk = np.tile(ll[:, :, None, None, None], (1, 1, 361, 721, 11))
# 2.5153203749941895
ll_ijk = np.tile(ll, (361, 721, 11, 1, 1))
# 6.901196416001767

One of the solutions in the comments was:

ll_ijk = np.empty((13, 25, 361, 721, 11))
ll_ijk[...] = ll[:, :, None, None, None]  # 1.6792952079995302
ll_ijk = np.empty((361, 721, 11, 13, 25))
ll_ijk[...] = ll  # 4.837824542002636

Broadcasting with np.broadcast_to followed by copying is a little faster for me, but YMMV:

ll_ijk = np.broadcast_to(ll[:, :, None, None, None], (13, 25, 361, 721, 11)).copy()
# 1.5269727080012672
ll_ijk = np.broadcast_to(ll, (361, 721, 11, 13, 25)).copy()
# 3.9539862910023658
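If you want to compare these options on your own machine, a minimal timing sketch could look like the following. I am assuming timeit as the measuring tool, and I use a reduced `extra` shape so the script finishes quickly and fits in modest memory; the function names and the reduced sizes are my own choices for illustration. Only the layout with the new axes last is shown; the other layout can be timed the same way.

```python
import timeit

import numpy as np

ll = np.random.default_rng(0).random((13, 25))

# Reduced stand-in for (361, 721, 11); increase it if you have the memory.
extra = (18, 36, 11)
full_shape = ll.shape + extra


def tile_trailing():
    return np.tile(ll[:, :, None, None, None], (1, 1) + extra)


def empty_assign_trailing():
    out = np.empty(full_shape)
    out[...] = ll[:, :, None, None, None]
    return out


def broadcast_copy_trailing():
    return np.broadcast_to(ll[:, :, None, None, None], full_shape).copy()


candidates = {
    "np.tile": tile_trailing,
    "np.empty + assignment": empty_assign_trailing,
    "np.broadcast_to + copy": broadcast_copy_trailing,
}

# Best of three single runs per candidate, in seconds.
timings = {
    name: min(timeit.repeat(fn, number=1, repeat=3))
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

The relative ordering at this reduced size may not match the ordering at full size, so treat the numbers as a starting point rather than a verdict.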
