Rewriting a for loop in pure NumPy to decrease execution time

Problem description

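I recently asked how to optimise a Python loop for a scientific application, and received an excellent, smart way of recoding it within NumPy which reduced the execution time by a factor of around 100 for me!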

However, the calculation of the B value is actually nested within several other loops, because it is evaluated on a regular grid of positions. Is there a similarly smart NumPy rewrite to shave time off this procedure?

I suspect the performance gain for this part would be less marked. The disadvantages would presumably be that progress could not be reported back to the user during the calculation, that the results could not be written to the output file until the calculation had finished, and possibly that doing everything in one enormous step would have memory implications. Is it possible to circumvent any of these?

import time

import numpy as np

def reshape_vector(v):
    # Turn a length-3 vector into a (3, 1) column so it broadcasts against (3, N)
    b = np.empty((3,1))
    for i in range(3):
        b[i][0] = v[i]
    return b

def unit_vectors(r):
    # Normalise each column of a (3, N) array
    return r / np.sqrt((r*r).sum(0))

def calculate_dipole(mu, r_i, mom_i):
    # Sum the dipole fields of all N dipoles at the single point mu
    relative = mu - r_i
    r_unit = unit_vectors(relative)
    A = 1e-7
    num = A*(3*np.sum(mom_i*r_unit, 0)*r_unit - mom_i)
    den = np.sqrt(np.sum(relative*relative, 0))**3
    B = np.sum(num/den, 1)
    return B

N = 20000 # number of dipoles
r_i = np.random.random((3,N)) # positions of dipoles
mom_i = np.random.random((3,N)) # moments of dipoles
a = np.random.random((3,3)) # three basis vectors for this crystal
n = [10,10,10] # points at which to evaluate sum
gamma_mu = 135.5 # a constant

t_start = time.perf_counter()
for i in range(n[0]):
    r_frac_x = float(i)/float(n[0])
    r_test_x = r_frac_x * a[0]
    for j in range(n[1]):
        r_frac_y = float(j)/float(n[1])
        r_test_y = r_frac_y * a[1]
        for k in range(n[2]):
            r_frac_z = float(k)/float(n[2])
            r_test = r_test_x + r_test_y + r_frac_z * a[2]
            r_test_fast = reshape_vector(r_test)
            B = calculate_dipole(r_test_fast, r_i, mom_i)
            omega = gamma_mu*np.sqrt(np.dot(B,B))
            # write r_test, B and omega to a file
    frac_done = float(i+1)/(n[0]+1)
    t_elapsed = (time.perf_counter()-t_start)
    t_remain = (1-frac_done)*t_elapsed/frac_done
    print(frac_done*100,'% done in',t_elapsed/60.,'minutes...approximately',t_remain/60.,'minutes remaining')
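The concerns about progress reporting, incremental output and memory do not have to be all-or-nothing. The following is a minimal sketch, not a tested replacement for the loop above: it evaluates the grid in batches, so each batch is one broadcasted NumPy call while progress can still be printed and results written between batches. The names `grid_points`, `calculate_dipole_batch`, `evaluate_in_chunks` and the `chunk` parameter are hypothetical and introduced here only for illustration. No speed-up is claimed.

```python
import numpy as np

def grid_points(a, n):
    """Cartesian positions of the regular grid, shape (3, P), in i, j, k loop order."""
    fx, fy, fz = np.meshgrid(np.arange(n[0]) / n[0],
                             np.arange(n[1]) / n[1],
                             np.arange(n[2]) / n[2],
                             indexing="ij")
    frac = np.stack([fx.ravel(), fy.ravel(), fz.ravel()])  # fractional coords, (3, P)
    # Column p is frac[0, p]*a[0] + frac[1, p]*a[1] + frac[2, p]*a[2]
    return a.T @ frac

def calculate_dipole_batch(points, r_i, mom_i):
    """Same formula as calculate_dipole, for P points at once. Returns (P, 3)."""
    relative = points[:, :, None] - r_i[:, None, :]       # (3, P, N)
    dist = np.sqrt((relative * relative).sum(0))           # (P, N)
    r_unit = relative / dist                               # (3, P, N)
    m = mom_i[:, None, :]                                  # (3, 1, N)
    A = 1e-7
    num = A * (3 * (m * r_unit).sum(0) * r_unit - m)       # (3, P, N)
    return (num / dist**3).sum(2).T                        # (P, 3)

def evaluate_in_chunks(points, r_i, mom_i, gamma_mu, chunk=50):
    """Yield (points_done, positions, B, omega) one batch at a time.

    chunk is an assumed tuning knob: temporaries have shape (3, chunk, N),
    so memory use grows linearly with it.
    """
    total = points.shape[1]
    for start in range(0, total, chunk):
        block = points[:, start:start + chunk]
        B = calculate_dipole_batch(block, r_i, mom_i)
        omega = gamma_mu * np.sqrt((B * B).sum(1))
        yield start + block.shape[1], block.T, B, omega

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 2000                                   # smaller than the question's 20000
    r_i, mom_i = rng.random((3, N)), rng.random((3, N))
    a = rng.random((3, 3))
    points = grid_points(a, [5, 5, 5])
    total = points.shape[1]
    for done, pos, B, omega in evaluate_in_chunks(points, r_i, mom_i, 135.5):
        # pos, B and omega for this batch could be written to a file here
        print(f"{100 * done / total:.0f}% done")
```

With `chunk=50` and N = 20000, each (3, chunk, N) temporary holds 3 million doubles, roughly 24 MB, so the batch size trades memory against the number of Python-level iterations.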

Recommended answer

If you profile your code, you'll see that 99% of the running time is spent in calculate_dipole, so restructuring the outer loops won't give a noticeable reduction in execution time. To make this faster, you still need to focus on calculate_dipole. I tried my Cython code for calculate_dipole on this and got a reduction of about a factor of 2 in the overall time. There may be other ways to improve the Cython code too.
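One way to check where the time goes is the standard-library `cProfile` module. The sketch below is only an illustration: `run_grid` and `profile_report` are hypothetical helper names, the problem size is reduced so it finishes quickly, and the exact percentages will depend on the machine and on N.

```python
import cProfile
import io
import pstats

import numpy as np

def unit_vectors(r):
    return r / np.sqrt((r * r).sum(0))

def calculate_dipole(mu, r_i, mom_i):
    # Same formula as in the question
    relative = mu - r_i
    r_unit = unit_vectors(relative)
    A = 1e-7
    num = A * (3 * np.sum(mom_i * r_unit, 0) * r_unit - mom_i)
    den = np.sqrt(np.sum(relative * relative, 0)) ** 3
    return np.sum(num / den, 1)

def run_grid(n, a, r_i, mom_i):
    # The triple loop from the question, without timing or file output
    for i in range(n[0]):
        for j in range(n[1]):
            for k in range(n[2]):
                r_test = i / n[0] * a[0] + j / n[1] * a[1] + k / n[2] * a[2]
                calculate_dipole(r_test.reshape(3, 1), r_i, mom_i)

def profile_report(func, *args, top=10):
    """Run func(*args) under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 2000                                   # reduced from 20000 for a quick run
    r_i, mom_i = rng.random((3, N)), rng.random((3, N))
    a = rng.random((3, 3))
    print(profile_report(run_grid, [3, 3, 3], a, r_i, mom_i))
```

In the printed table, comparing the cumulative time of `calculate_dipole` with that of `run_grid` shows how much of the run is spent inside the field calculation rather than in the loop bookkeeping.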
