python - Finding original indices for rearranged arrays

Given two numpy arrays of equal shape, I want to track how elements from the first array have moved in the second array. Specifically, for each element in the second array, I want to find its original position in the first array. The arrays are not sorted.

Example 1: All elements present

a1 = np.array([1, 2, 3, 4])  # original array
a2 = np.array([2, 1, 3, 4])  # new array

# Result: [1, 0, 2, 3]
# Explanation:
# - 2 was originally at index 1
# - 1 was originally at index 0
# - 3 was originally at index 2
# - 4 was originally at index 3

Example 2: With new elements

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

# Result: [1, 0, -1, 3]
# The value 33 wasn't in the original array, so it gets -1

My solution is:

[a1.tolist().index(v) if v in a1 else -1 for v in a2]

An alternative is np.where(a2[:, None] == a1)[1], but it does not work for Example 2: values with no match are simply dropped from the output instead of getting -1.

Is there a better way to do this? In practice my arrays have millions of rows, but fewer than 10 columns.

4 Answers

You could combine np.argmax with np.any to check whether there was no match at all. Here is a minimal example:

import numpy as np

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

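# has_match[i, j] is True where a2[i] == a1[j]
has_match = a2[:, None] == a1
# Reduce along axis=1 so that each element of a2 gets its first matching index in a1
idx = np.argmax(has_match, axis=1)
idx[~np.any(has_match, axis=1)] = -1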

This gives:

array([ 1,  0, -1,  3])

This is equivalent to your pure Python solution. The advantage here is that both argmax and any let you specify the axis they operate along.
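The swap in the example is its own inverse, so it cannot tell "position in a1 of each a2 element" apart from the reverse mapping. Here is a small check on a permutation that is not symmetric, as a sketch only: the function name indices_via_argmax is hypothetical, and the reduction runs along axis=1 because each row of the comparison matrix corresponds to one element of a2.

```python
import numpy as np

def indices_via_argmax(a1, a2):
    # matches[i, j] is True where a2[i] == a1[j]
    matches = a2[:, None] == a1
    # First matching column (index into a1) for every row (element of a2)
    idx = np.argmax(matches, axis=1)
    # Rows without any match get -1
    idx[~np.any(matches, axis=1)] = -1
    return idx

a1 = np.array([10, 20, 30])
a2 = np.array([30, 10, 99])
result = indices_via_argmax(a1, a2)  # 30 -> 2, 10 -> 0, 99 -> -1
```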

You can try pandas.merge to keep track of the index mapping:

import pandas as pd

# Look up each a2 value in a table that maps a1 values to their positions
df = pd.DataFrame({'val': a2})
lut = pd.DataFrame({'val': a1, 'idx': range(len(a1))})
idx = pd.merge(df, lut, how='left', on='val')['idx'].fillna(-1).to_numpy().astype(int)

Given a1 = np.array([1, 2, 3, 4]) and a2 = np.array([2, 1, 3, 4]), you will obtain

array([1, 0, 2, 3])

Given a1 = np.array([1, 2, 3, 4]) and a2 = np.array([2, 1, 33, 4]), you will obtain

array([ 1,  0, -1,  3])
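If pandas is available anyway, a shorter route may be pandas.Index.get_indexer, which returns -1 for targets that are not found. This is a sketch under the assumption that the values in a1 are unique, since get_indexer raises an error on an index with duplicate values.

```python
import numpy as np
import pandas as pd

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

# Position of every a2 value inside a1; -1 where the value is absent.
# Assumes a1 has no duplicate values.
idx = pd.Index(a1).get_indexer(a2)
```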

You can use a dict for this.

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

# Map each value in a1 to its index, then look up every value of a2
old_indices = {element: index for index, element in enumerate(a1.tolist())}
result = [old_indices.get(element, -1) for element in a2.tolist()]
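One detail to be aware of: if a1 contains repeated values, a dict comprehension keeps the last index of each value, whereas list.index returns the first. Here is a minimal sketch that keeps the first occurrence and returns a NumPy array; the function name indices_via_dict is hypothetical.

```python
import numpy as np

def indices_via_dict(a1, a2):
    first_index = {}
    for index, value in enumerate(a1.tolist()):
        # setdefault only stores a value the first time it is seen
        first_index.setdefault(value, index)
    # Look up each a2 value; values missing from a1 map to -1
    return np.array([first_index.get(v, -1) for v in a2.tolist()], dtype=int)

result = indices_via_dict(np.array([1, 2, 3, 4]), np.array([2, 1, 33, 4]))
```

The lookup is a single pass over each array, so it avoids building a len(a2) by len(a1) comparison matrix.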

Another possible solution uses np.where to find the matching index pairs and np.full to initialize the output to -1, so that values without a match keep -1:

i, j = np.where(a1 == a2[:, None])
out = np.full(len(a2), -1, dtype=int)
out[i] = j

Output:

array([ 1,  0, -1,  3])
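For arrays with millions of entries, note that the broadcast comparison builds a boolean matrix of shape (len(a2), len(a1)), which at a million elements each would be on the order of 10^12 entries. A sort-based lookup with np.argsort and np.searchsorted avoids that matrix. This is a sketch of a different technique, not the approach above, and it assumes a 1-D, non-empty a1; the function name indices_via_searchsorted is hypothetical.

```python
import numpy as np

def indices_via_searchsorted(a1, a2):
    # Assumes a1 is 1-D and non-empty.
    order = np.argsort(a1, kind="stable")   # original positions in sorted order
    sorted_a1 = a1[order]
    # Leftmost insertion point of each a2 value in the sorted copy of a1
    pos = np.searchsorted(sorted_a1, a2)
    # Values larger than everything in a1 land at len(a1); clip to stay in range
    pos = np.clip(pos, 0, len(a1) - 1)
    # A value is present only if the element at its insertion point equals it
    found = sorted_a1[pos] == a2
    return np.where(found, order[pos], -1)

out = indices_via_searchsorted(np.array([1, 2, 3, 4]), np.array([2, 1, 33, 4]))
```

Memory use stays linear in the array sizes, and with the stable sort a repeated value in a1 resolves to its first original position.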
