最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - Cut-and-overlap reduction of N-dimensional array - Stack Overflow

programmeradmin6浏览0评论

Note: I have no idea what one might call this operation, so I've gone with "cut-and-overlap". Please let me know if there is an accepted terminology for something like this!

Intro

I need to implement an operation that can be performed on N-dimensional arrays, which I will describe below. I am working in Python with numpy and am looking for a numerical library in that ecosystem which might be able to do this, but am interested in any implementation in any language. While implementing this with for loops for 1D or 2D is possible, although fidly, I need an implementation that would work with N-dimensional arrays (not up to anything crazy, but at least 4D).

Is there some neat way to do this, with some kind of fancy sliding windows and ravelling and reshaping? ;)

The operation

The principle of the operation is to make a number of "cuts" in the array, and then slide the array over itself at the cut locations so it overlaps, and finally aggregate the overlapping values. In my case, that aggregation is a simple sum. I think the easiest way to illustrate it is to give some examples, which are below for 1D and 2D cases. Hopefully the generalisation to N-dimensional arrays follows fairly obviously.

Examples

1D case

arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [4,7,12]
slide_amount = [3,2,1] 

The array is cut like so: [0,2,5,3|2,1,0|3,4,5,2,3|6,4,1,2,3]
Then, it is slid together like so:

0253
 210
  34523
      64123

Finally, the result is summed, giving the resulting array:
[0,4,9,7,5,2,9,4,1,2,3]

2D case

The 2D case is a bit harder to visualise:

INPUT


OUTPUT


Edge case

An extreme example is if the shift exceeds the array's length. This is possible but rare in my intended application.

arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [5]
slide_amount = [20]

Result: [1,0,3,4,5,2,3,6,4,1,2,3,0,0,0,0,2,5,3,2]

1D implementation for small shifts

def slide_and_overlap(arr, cut_locations, slide_amounts):
    right_pads = np.append(len(arr) - cut_locations, [0])
    left_pads = np.append([0], cut_locations)
    chunks = np.split(arr, cut_locations)
    result = np.zeros(len(arr))
    for i, chunk in enumerate(chunks):
        slide_amount = np.sum(np.append([0], slide_amounts)[:i+1])
        padded_chunk = np.pad(chunk, (left_pads[i], right_pads[i]), mode='constant', constant_values=0)
        slid_chunk = np.roll(padded_chunk, -1*slide_amount)
        result += slid_chunk
    
    # Trim trailing zeros from right side
    result = np.trim_zeros(result, trim='b')
    return result

Note: I have no idea what one might call this operation, so I've gone with "cut-and-overlap". Please let me know if there is an accepted terminology for something like this!

Intro

I need to implement an operation that can be performed on N-dimensional arrays, which I will describe below. I am working in Python with numpy and am looking for a numerical library in that ecosystem which might be able to do this, but am interested in any implementation in any language. While implementing this with for loops for 1D or 2D is possible, although fidly, I need an implementation that would work with N-dimensional arrays (not up to anything crazy, but at least 4D).

Is there some neat way to do this, with some kind of fancy sliding windows and ravelling and reshaping? ;)

The operation

The principle of the operation is to make a number of "cuts" in the array, and then slide the array over itself at the cut locations so it overlaps, and finally aggregate the overlapping values. In my case, that aggregation is a simple sum. I think the easiest way to illustrate it is to give some examples, which are below for 1D and 2D cases. Hopefully the generalisation to N-dimensional arrays follows fairly obviously.

Examples

1D case

arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [4,7,12]
slide_amount = [3,2,1] 

The array is cut like so: [0,2,5,3|2,1,0|3,4,5,2,3|6,4,1,2,3]
Then, it is slid together like so:

0253
 210
  34523
      64123

Finally, the result is summed, giving the resulting array:
[0,4,9,7,5,2,9,4,1,2,3]

2D case

The 2D case is a bit harder to visualise:

INPUT


OUTPUT


Edge case

An extreme example is if the shift exceeds the array's length. This is possible but rare in my intended application.

arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [5]
slide_amount = [20]

Result: [1,0,3,4,5,2,3,6,4,1,2,3,0,0,0,0,2,5,3,2]

1D implementation for small shifts

def slide_and_overlap(arr, cut_locations, slide_amounts):
    right_pads = np.append(len(arr) - cut_locations, [0])
    left_pads = np.append([0], cut_locations)
    chunks = np.split(arr, cut_locations)
    result = np.zeros(len(arr))
    for i, chunk in enumerate(chunks):
        slide_amount = np.sum(np.append([0], slide_amounts)[:i+1])
        padded_chunk = np.pad(chunk, (left_pads[i], right_pads[i]), mode='constant', constant_values=0)
        slid_chunk = np.roll(padded_chunk, -1*slide_amount)
        result += slid_chunk
    
    # Trim trailing zeros from right side
    result = np.trim_zeros(result, trim='b')
    return result
Share Improve this question edited Mar 21 at 3:24 Heisenbugs asked Mar 20 at 9:01 HeisenbugsHeisenbugs 275 bronze badges 5
  • Please clarify how the sliding is meant to work (including in the 2D case). The 1D case seems inconsistent at the moment. The last chunk is shifted 1 to the left relative to the previous chunk, but the second to last chunk (34523) is shifted 7 to the left of its own starting point (not 7 left of the 210 chunk). – Alex Duchnowski Commented Mar 20 at 13:13
  • Numpy has a split, but internally it iterates (at python level) to create the list of slices. Applying the differing shifts will requires the same level of iteration. That's just thinking about the 1d case. Note I used list - that's basic python object. – hpaulj Commented Mar 20 at 14:50
  • You gave an extreme example - how common will that sort of thing be? This could affect whether a sparse or dense representation of the data is more appropriate. – Matt Haberland Commented Mar 20 at 16:34
  • @AlexDuchnowski My apologies, the example was incorrect. The shift should be relative to the chunk, or in other words everything right of the cut gets shifted. – Heisenbugs Commented Mar 21 at 3:25
  • @MattHaberland In my application, this kind of extreme shift is likely to be rare, but not impossible. I have added an implementation for a 1D case where there shifts are small, which would not handle such an extreme case. – Heisenbugs Commented Mar 21 at 3:26
Add a comment  | 

1 Answer 1

Reset to default 1

I am not sure an N-D solution in NumPy without Python loops is practical/efficient (even if it is possible). But here is a NumPy solution without loops that should work for 1-D array, including the edge case.

The idea is to pad arr with zeros at the beginning, at each cut location, and at the end such that it can be reshaped into a 2-D array before summing along the zeroth axis. We figure out the number of zeros we need at each location based on the cut location indices and the cumulative slide amounts.

import numpy as np
arr = [0,2,5,3,2,1,0,3,4,5,2,3,6,4,1,2,3]
cut_locations = [4,7,12]
slide_amount = [3,2,1] 

# Cumulative amount to slide each row, including the zeroth row
slide = [0] + slide_amount
slide = np.cumsum(slide)
# Segment `i` of `arr` is given by `arr[segment[i]: segment[i+1]]`
segment = [0] + cut_locations + [len(arr)]

m = len(slide)  # number of rows
left = segment[:-1] - slide  # left index of segments after slide
right = segment[1:] - slide  # right index of segments after slide
nl = np.min(left)  # minimum index (could be less than 0) 
nr = np.max(right) # maximum index

left_pad = left - nl    # number of zeros to prepend to each row 
right_pad = nr - right  # number of zeros to append to each row
# number of zeros to be added at each index in `segment` - that
# is, before the first segment, in between segments, and after last segment
pad = np.zeros(m + 1, dtype=np.int64)
pad[:-1] = left_pad
pad[1:] += right_pad

# Insert the padding, reshape, and sum
i_pad = np.repeat(segment, pad)
padded = np.insert(arr, i_pad, 0).reshape((-1, nr - nl))
print(padded.sum(axis=0))
# [0 4 9 7 5 2 9 4 1 2 3]

This would be much easier with loops, so I'd suggest prototyping in Python, then using Numba or similar to accelearate for the N-D case. For that, rather than forming a larger array and summing, you probably want to use the ideas above to calculate the shape of the final array, and allocate an all-zero final array. The rest would pretty different - you'd loop over the cut locations and shifts, "stamping" N-D sections from the original array into the appropriate places of the final array += operations.

发布评论

评论列表(0)

  1. 暂无评论