
python 3.x - Identify identical vectors as part of a multidimensional dot product - Stack Overflow


I want to identify identical vectors after a dot product calculation. The code below works for a single dimension but not multi-dimensionally.

Single Dimension

a = np.array([0.8,0.5])
b = np.array([0.8,0.5])
x = a @ b / np.linalg.norm(a) / np.linalg.norm(b)
print(x) #Output: 1

Multi-dimensional

a = np.array([[0.8,0.5],[0.4,1]])
b = np.array([[0.8,0.5],[0.4,1]])
x = a @ b / np.linalg.norm(a) / np.linalg.norm(b)
print(x) #Output: [[0.4097561  0.43902439] [0.35121951 0.58536585]]
#Desired Output: [[1 0.43] [0.35 1]] (0.43 and 0.35 are rough placeholder values; I just wouldn't expect the off-diagonal entries to be 1)

I would expect this to output at least two 1's. I recognise this is likely because the normalization happens after the @. Is there a way to do this calculation as part of it and have the final output be a multidimensional result?


asked Mar 14 at 21:24 by Zac, edited Mar 14 at 22:49
  • 1 With floats you may need to use isclose or allclose to allow for float rounding errors. – hpaulj Commented Mar 14 at 21:44
  • Are you only doing this to compare the vectors? If so there are simpler ways. – Reinderien Commented Mar 15 at 5:37
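As hpaulj's comment notes, exact equality checks on floats are fragile, so a cosine similarity of "1" may actually come out as something like 1.0000000000000002. A minimal sketch of np.isclose / np.allclose for tolerating rounding error:

```python
import numpy as np

# 0.1 + 0.2 is not exactly 0.3 in binary floating point
x = 0.1 + 0.2
print(x == 0.3)            # False
print(np.isclose(x, 0.3))  # True

# allclose checks every element of an array at once
print(np.allclose([1.0, x], [1.0, 0.3]))  # True
```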

2 Answers


Your 2 arrays, the 1d and 2d. No need to repeat them for this demo.

In [16]: a = np.array([0.8,0.5])

In [17]: b = np.array([[0.8,0.5],[0.4,1]])

The 1d dot, and its norm:

In [18]: a@a
Out[18]: np.float64(0.8900000000000001)

In [19]: np.linalg.norm(a)
Out[19]: np.float64(0.9433981132056605)

With those we can get your desired 1:

In [21]: (a@a)/(np.linalg.norm(a)**2)
Out[21]: np.float64(1.0)

The 2d 'dot':

In [22]: b@b
Out[22]: 
array([[0.84, 0.9 ],
       [0.72, 1.2 ]])

But wait, don't we want to use the transpose:

In [23]: b@b.T
Out[23]: 
array([[0.89, 0.82],
       [0.82, 1.16]])

In [24]: b[1]@b[1]
Out[24]: np.float64(1.1600000000000001)

That gives us the 0.89 from a@a, and a corresponding 1d dot for the 2nd row.

The norm of b is a single number! Per the docs, that's the norm of the flattened b. We need to specify the axis, here axis=1, to get row-wise norms:

In [27]: np.linalg.norm(b[1])
Out[27]: np.float64(1.077032961426901)

In [28]: np.linalg.norm(b[0])     # norm(a)
Out[28]: np.float64(0.9433981132056605)

In [29]: np.linalg.norm(b,axis=1)
Out[29]: array([0.94339811, 1.07703296])

Now we can get the desired 1's for the 2d array:

In [30]: (b@b.T)/(np.linalg.norm(b,axis=1)**2)
Out[30]: 
array([[1.        , 0.70689655],
       [0.92134831, 1.        ]])
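Building on the above, the full pairwise cosine similarity normalizes entry [i, j] by the product of both row norms (np.outer(norms, norms)) rather than by a single squared norm, and np.isclose then flags the identical (parallel) vector pairs. A minimal sketch, assuming the same two arrays as the question:

```python
import numpy as np

a = np.array([[0.8, 0.5], [0.4, 1.0]])
b = np.array([[0.8, 0.5], [0.4, 1.0]])

# Row-wise norms of each array
na = np.linalg.norm(a, axis=1)
nb = np.linalg.norm(b, axis=1)

# Pairwise cosine similarity: entry [i, j] compares a[i] with b[j]
cos = (a @ b.T) / np.outer(na, nb)

# Identical (parallel) rows have cosine similarity 1;
# use isclose to tolerate float rounding
identical = np.isclose(cos, 1.0)
print(cos)
print(identical)
```

With identical inputs the diagonal of `identical` is True and the off-diagonal entries are False.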

edit

Norm without the axis:

In [36]: np.linalg.norm(b)
Out[36]: np.float64(1.4317821063276355)
In [37]: np.linalg.norm(b.ravel())
Out[37]: np.float64(1.4317821063276355)
In [39]: np.sqrt(b.ravel()@b.ravel())
Out[39]: np.float64(1.4317821063276355)

Your 1d case, x = a @ b / np.linalg.norm(a) / np.linalg.norm(b), with a and b identical, is effectively

x = (a @ a) / (np.sqrt(a@a)**2)

Unfortunately, @ performs matrix multiplication, while you need a row-wise dot product.

import numpy as np

a = np.array([[0.8, 0.5], [0.4, 1]])
b = np.array([[0.8, 0.5], [0.4, 1]])

# Compute dot product row-wise
dot_product = np.einsum('ij,ij->i', a, b)

# Compute norms row-wise
norm_a = np.linalg.norm(a, axis=1)
norm_b = np.linalg.norm(b, axis=1)

# Compute cosine similarity row-wise
cosine_similarity = dot_product / (norm_a * norm_b)

print(cosine_similarity)  # Expected Output: [1. 1.]
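The einsum approach above yields one similarity per row pair. If the full multidimensional output from the question is wanted, an alternative sketch is to normalize each row to unit length first, so that a single @ then produces the whole pairwise cosine similarity matrix with (near-)1 entries on the diagonal:

```python
import numpy as np

a = np.array([[0.8, 0.5], [0.4, 1.0]])
b = np.array([[0.8, 0.5], [0.4, 1.0]])

# Normalize each row to unit length; keepdims=True makes the
# norms broadcast against the rows they came from
an = a / np.linalg.norm(a, axis=1, keepdims=True)
bn = b / np.linalg.norm(b, axis=1, keepdims=True)

# Now @ alone gives pairwise cosine similarities:
# entry [i, j] is the cosine of a[i] and b[j]
similarity = an @ bn.T
print(similarity)
```

Because the normalization happens before the @ rather than after, the diagonal is 1 (up to float rounding) whenever the corresponding rows are identical.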