I'm getting two widely different execution times in the code below:
import numpy as np
import time

array = np.arange(0, 750000)
param = 20000

t1 = time.time()
for _ in range(param):
    array <= 120
print(round(time.time() - t1), _)
# 9 19999

t2 = time.time()
for _ in range(param):
    array - 120 <= 0
print(round(time.time() - t2), _)
# 19 19999
I expected the execution times of the two approaches to be similar.
What's the rationale behind this difference? Is NumPy internally casting 120 to an array in the second approach?
What other similar bottlenecks should I be aware of when optimising code? Happy to read docs on that. Thanks!
Asked by unfolx, edited by jared.
Comment: More computation, more time. In the second case you have two binary operations (subtract and compare), while in the first one you only have one (compare). – Aditya Kurrodu
1 Answer
NumPy can't perform array - 120 <= 0 as a single fused operation, or rewrite the expression as array <= 120. It needs to perform the operation as the two steps written:
array - 120
and
result <= 0
and each of these operations builds a new 750000-element array: one 750000-element array of subtraction results, and one 750000-element array of comparison results.
That's much slower than comparing each element to 120 and building an array of comparison results directly, as array <= 120 does.
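To make the two steps concrete, here is a minimal sketch that spells out the intermediate array and times both forms with timeit. The names temp, two_step, direct and repeats are illustrative and not from the question, and the repeat count is deliberately smaller than the original 20000. Absolute timings will vary by machine, so no expected numbers are shown.

```python
import timeit

import numpy as np

array = np.arange(0, 750000)

# One pass: compare each element to 120, producing one boolean array.
direct = array <= 120

# Two passes, written out explicitly. This is what `array - 120 <= 0` does.
temp = np.subtract(array, 120)      # full-size intermediate integer array
two_step = np.less_equal(temp, 0)   # full-size boolean array

# Both forms give the same result for these values (no integer overflow here).
assert np.array_equal(direct, two_step)
assert np.array_equal(direct, array - 120 <= 0)

# Rough timing comparison; `repeats` is an arbitrary illustrative value.
repeats = 200
t_direct = timeit.timeit(lambda: array <= 120, number=repeats)
t_two_step = timeit.timeit(lambda: array - 120 <= 0, number=repeats)
print(f"array <= 120:     {t_direct:.3f} s")
print(f"array - 120 <= 0: {t_two_step:.3f} s")
```

Note that the two expressions are only interchangeable when the subtraction cannot wrap around, as is the case for this array. With unsigned dtypes or values near the integer limits, the results could differ.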