
python - Finding original indices for rearranged arrays - Stack Overflow


Given two numpy arrays of equal shape, I want to track how elements from the first array have moved in the second array. Specifically, for each element in the second array, I want to find its original position in the first array. The arrays are not sorted.

Example 1: All elements present

a1 = np.array([1, 2, 3, 4])  # original array
a2 = np.array([2, 1, 3, 4])  # new array

# Result: [1, 0, 2, 3]
# Explanation:
# - 2 was originally at index 1
# - 1 was originally at index 0
# - 3 was originally at index 2
# - 4 was originally at index 3

Example 2: With new elements

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

# Result: [1, 0, -1, 3]
# The value 33 wasn't in the original array, so it gets -1

My solution is:

[a1.tolist().index(v) if v in a1 else -1 for v in a2]

or np.where(a2[:, None] == a1)[1], but this does not work for Example 2, because elements without a match are simply dropped from the result.

Is there a better way to do this? In real life my arrays have millions of rows. There are not that many columns, fewer than 10.


Asked Feb 28 at 18:42 by Aenaon; edited Mar 3 at 11:46 by mkrieger1.

4 Answers

You could combine np.argmax with np.any to check whether there was no match at all. Here is a minimal example:

import numpy as np

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

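# has_match[i, j] is True where a2[i] == a1[j]
has_match = a2[:, None] == a1
# For each element of a2 (one row each), take the first matching position in a1
idx = np.argmax(has_match, axis=1)
# Rows without any match get -1
idx[~np.any(has_match, axis=1)] = -1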

This gives:

array([ 1,  0, -1,  3])

This is equivalent to your pure Python solution. The advantage here is that both argmax and any let you specify the axis they operate along.
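The examples in the question both use a swap, which is its own inverse, so they cannot tell "index in a1 for each element of a2" apart from the reverse mapping. Here is a small sketch that checks the broadcast approach on a rotation instead. The helper name indices_by_broadcast is made up for illustration, and it assumes 1-D arrays laid out so that each row of the comparison matrix belongs to one element of a2. Keep in mind that the matrix has len(a2) * len(a1) entries, so this is only practical for small inputs.

```python
import numpy as np

def indices_by_broadcast(a1, a2):
    # Hypothetical helper: match[i, j] is True where a2[i] == a1[j].
    match = a2[:, None] == a1
    idx = np.argmax(match, axis=1)      # first matching column in each row
    idx[~match.any(axis=1)] = -1        # rows with no match at all
    return idx

a1 = np.array([10, 20, 30])
a2 = np.array([20, 30, 10])             # a rotation, not a simple swap
result = indices_by_broadcast(a1, a2)   # 20 -> 1, 30 -> 2, 10 -> 0
```

Reducing along the other axis would return the position in a2 of each element of a1, which only coincides with the wanted result when the rearrangement is its own inverse.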

You can try pandas.merge to keep track of the index mapping

import pandas as pd

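df = pd.DataFrame({'a1': a1, 'a2': a2})
# Lookup table: each value of a1 paired with its position in a1
lut = pd.DataFrame({'a2': a1, 'idx': df.index})
# Left-merge on 'a2' so every element of a2 picks up its index in a1 (NaN if absent)
idx = pd.merge(df, lut, how='left', on='a2')['idx'].fillna(-1).values.astype(int)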

  • Given a1 = np.array([1, 2, 3, 4]) and a2 = np.array([2, 1, 3, 4]), you will obtain
array([1, 0, 2, 3])
  • Given a1 = np.array([1, 2, 3, 4]) and a2 = np.array([2, 1, 33, 4]), you will obtain
array([ 1,  0, -1,  3])
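If pandas is already available, a shorter route may be Index.get_indexer, which returns -1 for labels it cannot find. This is a sketch and not part of the answer above. It assumes the values in a1 are unique, since get_indexer raises an error on an index with duplicates.

```python
import numpy as np
import pandas as pd

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

# Treat a1 as an index of labels, then ask where each value of a2 sits in it.
# Assumption: a1 contains no duplicate values.
idx = pd.Index(a1).get_indexer(a2)
```

Under the hood this relies on a hash-based lookup, so it avoids building a full comparison matrix.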

You can use a dict for this.

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])

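# Map each value in a1 to its index (for duplicate values, the last index wins)
old_indices = {element: index for index, element in enumerate(a1.tolist())}
# Look up each value of a2, falling back to -1 when it is not in a1
result = [old_indices.get(element, -1) for element in a2.tolist()]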

Another possible solution, which uses np.where to identify the indices and np.full to set to -1 the missing indices:

i, j = np.where(a1 == a2[:, None])
out = np.full(len(a2), -1, dtype=int)
out[i] = j

Output:

array([ 1,  0, -1,  3])
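The broadcast comparison a1 == a2[:, None] creates a boolean matrix with len(a2) * len(a1) entries, which is unlikely to fit in memory for arrays with millions of elements. One possible alternative, not taken from any of the answers here, is to sort a1 once and binary-search it with np.searchsorted. The sketch below assumes 1-D, non-empty arrays with unique values in a1; the function name original_indices is hypothetical.

```python
import numpy as np

def original_indices(a1, a2):
    """For each value in a2, return its index in a1, or -1 if it is absent.

    Hypothetical helper; assumes 1-D, non-empty a1 with unique values.
    """
    order = np.argsort(a1, kind="stable")   # order[k] = index in a1 of the k-th smallest value
    sorted_a1 = a1[order]
    pos = np.searchsorted(sorted_a1, a2)    # insertion points, in the range [0, len(a1)]
    pos = np.clip(pos, 0, len(a1) - 1)      # keep every position a valid index
    found = sorted_a1[pos] == a2            # True only where the value really exists in a1
    return np.where(found, order[pos], -1)

a1 = np.array([1, 2, 3, 4])
a2 = np.array([2, 1, 33, 4])
out = original_indices(a1, a2)
```

This runs in roughly O((n + m) log n) time and uses memory proportional to the inputs. For arrays with several columns it would have to be applied per column, or the rows would first need to be reduced to a single sortable key.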