Note: I have no idea what one might call this operation, so I've gone with "cut-and-overlap". Please let me know if there is an accepted terminology for something like this!
Intro
I need to implement an operation that can be performed on N-dimensional arrays, which I will describe below. I am working in Python
with numpy
and am looking for a numerical library in that ecosystem which might be able to do this, but am interested in any implementation in any language. While implementing this with for loops for 1D or 2D is possible, although fidly, I need an implementation that would work with N-dimensional arrays (not up to anything crazy, but at least 4D).
Is there some neat way to do this, with some kind of fancy sliding windows and ravelling and reshaping? ;)
The operation
The principle of the operation is to make a number of "cuts" in the array, and then slide the array over itself at the cut locations so it overlaps, and finally aggregate the overlapping values. In my case, that aggregation is a simple sum. I think the easiest way to illustrate it is to give some examples, which are below for 1D and 2D cases. Hopefully the generalisation to N-dimensional arrays follows fairly obviously.
Examples
1D case
arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [4,7,12]
slide_amount = [3,2,1]
The array is cut like so:
[0,2,5,3|2,1,0|3,4,5,2,3|6,4,1,2,3]
Then, it is slid together like so:
0253
210
34523
64123
Finally, the result is summed, giving the resulting array:
[0,4,9,7,5,2,9,4,1,2,3]
2D case
The 2D case is a bit harder to visualise:
INPUT
OUTPUT
Edge case
An extreme example is if the shift exceeds the array's length. This is possible but rare in my intended application.
arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [5]
slide_amount = [20]
Result:
[1,0,3,4,5,2,3,6,4,1,2,3,0,0,0,0,2,5,3,2]
1D implementation for small shifts
def slide_and_overlap(arr, cut_locations, slide_amounts):
right_pads = np.append(len(arr) - cut_locations, [0])
left_pads = np.append([0], cut_locations)
chunks = np.split(arr, cut_locations)
result = np.zeros(len(arr))
for i, chunk in enumerate(chunks):
slide_amount = np.sum(np.append([0], slide_amounts)[:i+1])
padded_chunk = np.pad(chunk, (left_pads[i], right_pads[i]), mode='constant', constant_values=0)
slid_chunk = np.roll(padded_chunk, -1*slide_amount)
result += slid_chunk
# Trim trailing zeros from right side
result = np.trim_zeros(result, trim='b')
return result
Note: I have no idea what one might call this operation, so I've gone with "cut-and-overlap". Please let me know if there is an accepted terminology for something like this!
Intro
I need to implement an operation that can be performed on N-dimensional arrays, which I will describe below. I am working in Python
with numpy
and am looking for a numerical library in that ecosystem which might be able to do this, but am interested in any implementation in any language. While implementing this with for loops for 1D or 2D is possible, although fidly, I need an implementation that would work with N-dimensional arrays (not up to anything crazy, but at least 4D).
Is there some neat way to do this, with some kind of fancy sliding windows and ravelling and reshaping? ;)
The operation
The principle of the operation is to make a number of "cuts" in the array, and then slide the array over itself at the cut locations so it overlaps, and finally aggregate the overlapping values. In my case, that aggregation is a simple sum. I think the easiest way to illustrate it is to give some examples, which are below for 1D and 2D cases. Hopefully the generalisation to N-dimensional arrays follows fairly obviously.
Examples
1D case
arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [4,7,12]
slide_amount = [3,2,1]
The array is cut like so:
[0,2,5,3|2,1,0|3,4,5,2,3|6,4,1,2,3]
Then, it is slid together like so:
0253
210
34523
64123
Finally, the result is summed, giving the resulting array:
[0,4,9,7,5,2,9,4,1,2,3]
2D case
The 2D case is a bit harder to visualise:
INPUT
OUTPUT
Edge case
An extreme example is if the shift exceeds the array's length. This is possible but rare in my intended application.
arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [5]
slide_amount = [20]
Result:
[1,0,3,4,5,2,3,6,4,1,2,3,0,0,0,0,2,5,3,2]
1D implementation for small shifts
def slide_and_overlap(arr, cut_locations, slide_amounts):
right_pads = np.append(len(arr) - cut_locations, [0])
left_pads = np.append([0], cut_locations)
chunks = np.split(arr, cut_locations)
result = np.zeros(len(arr))
for i, chunk in enumerate(chunks):
slide_amount = np.sum(np.append([0], slide_amounts)[:i+1])
padded_chunk = np.pad(chunk, (left_pads[i], right_pads[i]), mode='constant', constant_values=0)
slid_chunk = np.roll(padded_chunk, -1*slide_amount)
result += slid_chunk
# Trim trailing zeros from right side
result = np.trim_zeros(result, trim='b')
return result
Share
Improve this question
edited Mar 21 at 3:24
Heisenbugs
asked Mar 20 at 9:01
HeisenbugsHeisenbugs
275 bronze badges
5
|
1 Answer
Reset to default 1I am not sure an N-D solution in NumPy without Python loops is practical/efficient (even if it is possible). But here is a NumPy solution without loops that should work for 1-D array, including the edge case.
The idea is to pad arr
with zeros at the beginning, at each cut location, and at the end such that it can be reshaped into a 2-D array before summing along the zeroth axis. We figure out the number of zeros we need at each location based on the cut location indices and the cumulative slide amounts.
import numpy as np
arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [4,7,12]
slide_amount = [3,2,1]
# Cumulative amount to slide each row, including the zeroth row
slide = [0] + slide_amount
slide = np.cumsum(slide)
# Segment `i` of `arr` is given by `arr[segment[i]: segment[i+1]]`
segment = [0] + cut_locations + [len(arr)]
m = len(slide) # number of rows
left = segment[:-1] - slide # left index of segments after slide
right = segment[1:] - slide # right index of segments after slide
nl = np.min(left) # minimum index (could be less than 0)
nr = np.max(right) # maximum index
left_pad = left - nl # number of zeros to prepend to each row
right_pad = nr - right # number of zeros to append to each row
# number of zeros to be added at each index in `segment` - that
# is, before the first segment, in between segments, and after last segment
pad = np.zeros(m + 1, dtype=np.int64)
pad[:-1] = left_pad
pad[1:] += right_pad
# Insert the padding, reshape, and sum
i_pad = np.repeat(segment, pad)
padded = np.insert(arr, i_pad, 0).reshape((-1, nr - nl))
print(padded.sum(axis=0))
# [0 4 9 7 5 2 9 4 1 2 3]
This would be much easier with loops, so I'd suggest prototyping in Python, then using Numba or similar to accelearate for the N-D case. For that, rather than forming a larger array and summing, you probably want to use the ideas above to calculate the shape of the final array, and allocate an all-zero final array. The rest would pretty different - you'd loop over the cut locations and shifts, "stamping" N-D sections from the original array into the appropriate places of the final array +=
operations.
split
, but internally it iterates (at python level) to create the list of slices. Applying the differing shifts will requires the same level of iteration. That's just thinking about the 1d case. Note I usedlist
- that's basic python object. – hpaulj Commented Mar 20 at 14:50