Numba reimplements many NumPy functions in pure Python and compiles them with LLVM, which generally gives efficient code. However, some of these reimplementations, such as the one for numpy.sort(), are slower than their optimized NumPy counterparts.
A quick approach to wrap NumPy's optimized functions within Numba-jitted code is to use an "objmode" block:
import numpy as np
from numba import njit, objmode


# Original numpy.sort()
def sort_numpy(arr: np.ndarray) -> np.ndarray:
    return np.sort(arr)


# Wrapped numpy.sort() in objmode
@njit(cache=True)
def sort_numpy_obj_njit(arr: np.ndarray) -> np.ndarray:
    out = np.empty_like(arr)
    with objmode:
        out[:] = sort_numpy(arr)
    return out
Unfortunately, each time the kernel restarts, the memory address (function pointer) of np.sort can change. Because Numba's caching mechanism appears to rely on consistent function pointers, the cached function is then recompiled, so the on-disk cache is effective only within a single Python session.
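The recompilation can be observed directly rather than inferred. The following is a minimal sketch, assuming Numba and NumPy are installed: it writes the wrapper above to a temporary file, runs it in several fresh interpreters (standing in for kernel restarts), and reads the dispatcher's `stats` counters. The file name `sort_wrapper.py` and the helper `run_fresh_sessions` are hypothetical names used only for illustration, and the embedded copy uses the called form `objmode()`.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical stand-alone module reproducing the wrapper from the question.
# It prints the total number of cache hits and cache misses for one session.
MODULE_SRC = textwrap.dedent('''
    import numpy as np
    from numba import njit, objmode

    def sort_numpy(arr):
        return np.sort(arr)

    @njit(cache=True)
    def sort_numpy_obj_njit(arr):
        out = np.empty_like(arr)
        with objmode():
            out[:] = sort_numpy(arr)
        return out

    sort_numpy_obj_njit(np.array([3.0, 1.0, 2.0]))
    stats = sort_numpy_obj_njit.stats
    print(sum(stats.cache_hits.values()), sum(stats.cache_misses.values()))
''')


def run_fresh_sessions(n_runs=2):
    """Run the module in n_runs fresh interpreters; return (hits, misses) per run."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "sort_wrapper.py")
        with open(script, "w") as fh:
            fh.write(MODULE_SRC)
        # Keep the on-disk cache inside the temporary directory so that
        # every call to this helper starts from an empty cache.
        env = dict(os.environ, NUMBA_CACHE_DIR=os.path.join(tmp, "cache"))
        for _ in range(n_runs):
            proc = subprocess.run(
                [sys.executable, script],
                env=env,
                capture_output=True,
                text=True,
                check=True,
            )
            # The last two tokens on stdout are the hit and miss totals.
            hits, misses = map(int, proc.stdout.split()[-2:])
            results.append((hits, misses))
    return results
```

Calling `run_fresh_sessions()` returns one `(hits, misses)` pair per session. If the on-disk cache is reused across sessions, the second pair should report a hit. A miss in the second session would be consistent with the recompilation described above.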
Is there a method to wrap NumPy functions using Numba, potentially through external C functions, that enables persistent disk caching across kernel restarts?