How to fix a CUDA 12.8 and Numba error on a high-performance cluster?

Problem:

I am working on a high-performance cluster (HPC3) with an NVIDIA A40 GPU installed, and I want to run the following code:

from numba import cuda
import numpy as np

@cuda.jit
def increment_by_one(an_array):
    pos = cuda.grid(1)
    if pos < an_array.size:
        an_array[pos] += 1

an_array = np.zeros(10)
increment_by_one[16,16](an_array)

Essentially, I am having the same problem as described here, except that I am working on HPC3 rather than on Google Colab. Running the script gives

numba.cuda.cudadrv.driver.LinkerError: [222] Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION
ptxas application ptx input, line 9; fatal   : Unsupported .version 8.7; current version is '8.4'
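
To make the mismatch in this message explicit, here is a minimal sketch that extracts the two PTX versions from the ptxas line above. The regular expression and the helper name parse_ptx_mismatch are illustrative assumptions based on this single message, not part of Numba or ptxas.

```python
import re

# Pattern for the ptxas message quoted above. The format is assumed from
# that one line only; other ptxas versions may word it differently.
PTX_MISMATCH = re.compile(
    r"Unsupported \.version (\d+)\.(\d+); current version is '(\d+)\.(\d+)'"
)


def parse_ptx_mismatch(message):
    """Return ((requested), (supported)) PTX versions as int tuples, or None."""
    match = PTX_MISMATCH.search(message)
    if match is None:
        return None
    req_major, req_minor, sup_major, sup_minor = (int(g) for g in match.groups())
    return (req_major, req_minor), (sup_major, sup_minor)


error_text = (
    "ptxas application ptx input, line 9; fatal   : "
    "Unsupported .version 8.7; current version is '8.4'"
)

requested, supported = parse_ptx_mismatch(error_text)
if requested > supported:
    # The generated PTX is newer than what the JIT compiler in use accepts.
    print(f"PTX {requested} was generated, but only PTX {supported} is accepted")
```

In this case the generated PTX is version 8.7, while the component that compiles it only accepts up to 8.4.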

As suggested in the linked question, I tried adding

from numba import config
config.CUDA_ENABLE_PYNVJITLINK = 1

and reinstalling numba-cuda, but even then I get

pynvjitlink.api.NvJitLinkError: NVJITLINK_ERROR_PTX_COMPILE error when calling nvJitLinkAddData
ptxas application ptx input, line 9; fatal   : Unsupported .version 8.7; current version is '8.4'
ERROR NVJITLINK_ERROR_PTX_COMPILE: JIT the PTX (<cudapy-ptx>)

My CUDA toolkit version (nvcc output) and NVIDIA driver version (nvidia-smi output) are

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
Mon Mar 17 11:46:11 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03             Driver Version: 550.144.03     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     Off |   00000000:83:00.0 Off |                    0 |
|  0%   37C    P0             77W /  300W |       1MiB /  46068MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
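
The two outputs above report different CUDA versions: the toolkit is release 12.8, while the "CUDA Version" field in the nvidia-smi header (the highest CUDA version the installed driver supports) is 12.4. As a rough sketch, the comparison can be automated by parsing both texts. The function names and regular expressions below are assumptions for illustration; the sample strings are copied from the outputs above.

```python
import re


def toolkit_version(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_cuda_version(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' nvidia-smi header."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Lines copied from the outputs shown above.
nvcc_line = "Cuda compilation tools, release 12.8, V12.8.93"
smi_line = (
    "| NVIDIA-SMI 550.144.03             Driver Version: 550.144.03"
    "     CUDA Version: 12.4     |"
)

toolkit = toolkit_version(nvcc_line)
driver = driver_cuda_version(smi_line)

if toolkit > driver:
    # Toolkit is newer than what the driver supports.
    print(f"Toolkit {toolkit} is newer than driver-supported CUDA {driver}")
```

On a live system the two strings could instead be captured from the commands themselves, for example with subprocess.run.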

Does anyone have an idea on how to fix this? (Note that I also posted this on GitHub)

Edit: I have now added config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY = True to my code, but then the error ValueError: Use CUDA_ENABLE_PYNVJITLINK for CUDA >= 12.0 MVC pops up. Setting CUDA_ENABLE_PYNVJITLINK to True as well gives the code

from numba import config
config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY = True
config.CUDA_ENABLE_PYNVJITLINK = True

from numba import cuda
import numpy as np

@cuda.jit
def increment_by_one(an_array):
    pos = cuda.grid(1)
    if pos < an_array.size:
        an_array[pos] += 1

an_array = np.zeros(10)
increment_by_one[16,16](an_array)

However, the error ValueError: Use CUDA_ENABLE_PYNVJITLINK for CUDA >= 12.0 MVC persists even in this case. Should I set CUDA_ENABLE_PYNVJITLINK to True somewhere else? At least the version mismatch no longer seems to cause a problem.
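
One place the flags could be set instead is the process environment, before Numba is imported. The sketch below assumes that Numba picks up NUMBA_-prefixed environment variables at import time and that NUMBA_CUDA_ENABLE_PYNVJITLINK and NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY are the counterparts of the two config attributes used above. Whether this removes the ValueError on this setup is untested. The helper numba_cuda_env is hypothetical and only there as a sanity check.

```python
import os

# Assumption: these must be set before the first `import numba`, because the
# configuration is read from the environment when the package is imported.
os.environ["NUMBA_CUDA_ENABLE_PYNVJITLINK"] = "1"
os.environ["NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY"] = "1"


def numba_cuda_env():
    """Return all NUMBA_CUDA_* variables currently set in this process."""
    return {
        key: value
        for key, value in os.environ.items()
        if key.startswith("NUMBA_CUDA_")
    }


# Print the settings so the job log shows what the process was started with.
print(numba_cuda_env())

# Only after this point would the script continue with:
#   from numba import cuda
```

The same variables could also be exported in the cluster job script, which avoids any dependence on import order inside the Python code.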
