Problem:
I am working on a high-performance computing cluster (HPC3) with an NVIDIA A40 GPU installed, and I want to run the following code:
from numba import cuda
import numpy as np

@cuda.jit
def increment_by_one(an_array):
    pos = cuda.grid(1)
    if pos < an_array.size:
        an_array[pos] += 1

an_array = np.zeros(10)
increment_by_one[16, 16](an_array)
Essentially, I am having the same problem as described here, except that I am working on HPC3 rather than on Google Colab. Running the script gives
numba.cuda.cudadrv.driver.LinkerError: [222] Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION
ptxas application ptx input, line 9; fatal : Unsupported .version 8.7; current version is '8.4'
As suggested in the link above, I tried adding
from numba import config
config.CUDA_ENABLE_PYNVJITLINK = 1
and reinstalling numba-cuda, but even then I get
pynvjitlink.api.NvJitLinkError: NVJITLINK_ERROR_PTX_COMPILE error when calling nvJitLinkAddData
ptxas application ptx input, line 9; fatal : Unsupported .version 8.7; current version is '8.4'
ERROR NVJITLINK_ERROR_PTX_COMPILE: JIT the PTX (<cudapy-ptx>)
My NVIDIA driver and CUDA toolkit versions, as reported by nvcc and nvidia-smi, are
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
Mon Mar 17 11:46:11 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03             Driver Version: 550.144.03     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     Off |   00000000:83:00.0 Off |                    0 |
|  0%   37C    P0             77W /  300W |       1MiB /  46068MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
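The two PTX versions in the error appear to line up with the two CUDA versions reported above: the toolkit is 12.8 while the driver only advertises CUDA 12.4. The sketch below just parses the lines quoted in this post and compares them. It assumes the PTX-to-CUDA mapping from NVIDIA's PTX ISA release notes (PTX 8.4 for CUDA 12.4, PTX 8.7 for CUDA 12.8); the helper names are made up for illustration and are not part of Numba.

```python
import re

# Assumed mapping (from the PTX ISA release notes); only the two entries
# relevant to this post are listed.
PTX_TO_CUDA = {"8.4": "12.4", "8.7": "12.8"}

# Lines copied from the error message and the tool output in this post.
ERROR_LINE = ("ptxas application ptx input, line 9; fatal : "
              "Unsupported .version 8.7; current version is '8.4'")
NVCC_LINE = "Cuda compilation tools, release 12.8, V12.8.93"
SMI_LINE = "| NVIDIA-SMI 550.144.03 Driver Version: 550.144.03 CUDA Version: 12.4 |"


def parse_ptx_versions(line):
    """Return (requested, supported) PTX versions from a ptxas error line."""
    m = re.search(
        r"Unsupported \.version (\d+\.\d+); current version is '(\d+\.\d+)'", line
    )
    if m is None:
        raise ValueError("not a PTX version error line")
    return m.group(1), m.group(2)


def parse_toolkit_version(line):
    """Return the CUDA toolkit release from the nvcc version line."""
    m = re.search(r"release (\d+\.\d+)", line)
    if m is None:
        raise ValueError("no toolkit release found")
    return m.group(1)


def parse_driver_cuda_version(line):
    """Return the highest CUDA version the driver reports in nvidia-smi."""
    m = re.search(r"CUDA Version:\s*(\d+\.\d+)", line)
    if m is None:
        raise ValueError("no driver CUDA version found")
    return m.group(1)


requested, supported = parse_ptx_versions(ERROR_LINE)
toolkit = parse_toolkit_version(NVCC_LINE)
driver = parse_driver_cuda_version(SMI_LINE)

# The PTX version being generated matches the toolkit; the PTX version
# that is accepted matches the driver.
print(f"PTX {requested} requested, toolkit is CUDA {toolkit}")
print(f"PTX {supported} supported, driver reports CUDA {driver}")
print("toolkit newer than driver:", PTX_TO_CUDA[requested] != PTX_TO_CUDA[supported])
```

If this reading is right, the generated PTX comes from the 12.8 toolkit and is rejected because the 550.144.03 driver only accepts PTX up to 8.4.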
Does anyone have an idea how to fix this? (Note that I also posted this on GitHub.)
Edit:
I have now added config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY = True to my code, but then the error ValueError: Use CUDA_ENABLE_PYNVJITLINK for CUDA >= 12.0 MVC appears. Setting CUDA_ENABLE_PYNVJITLINK to True as well gives the following code:
from numba import config
config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY = True
config.CUDA_ENABLE_PYNVJITLINK = True

from numba import cuda
import numpy as np

@cuda.jit
def increment_by_one(an_array):
    pos = cuda.grid(1)
    if pos < an_array.size:
        an_array[pos] += 1

an_array = np.zeros(10)
increment_by_one[16, 16](an_array)
But the error ValueError: Use CUDA_ENABLE_PYNVJITLINK for CUDA >= 12.0 MVC persists even in this case. Should I set CUDA_ENABLE_PYNVJITLINK to True somewhere else? At least the PTX version mismatch no longer seems to cause a problem.
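One place that may be worth checking is the environment rather than the config object. The sketch below assumes that Numba reads NUMBA_-prefixed environment variables when it is first imported, and that NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY and NUMBA_CUDA_ENABLE_PYNVJITLINK are the counterparts of the two config attributes used above. It has not been verified on HPC3, and it only reports what Numba sees; it does not launch a kernel.

```python
import os

# Assumption: these must be set before the first `import numba`, because
# the settings are read from the environment at import time.
os.environ["NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY"] = "1"
os.environ["NUMBA_CUDA_ENABLE_PYNVJITLINK"] = "1"

try:
    from numba import config
except ImportError:
    config = None  # numba is not installed in this environment

if config is not None:
    # getattr with a default, since the pynvjitlink attribute may only
    # exist once the CUDA target has been loaded.
    mvc = getattr(config, "CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY", None)
    pynvjitlink = getattr(config, "CUDA_ENABLE_PYNVJITLINK", None)
    print("minor version compatibility:", mvc)
    print("pynvjitlink enabled:", pynvjitlink)
```

The same two variables could instead be exported in the job script before Python starts, which avoids any dependence on import order inside the script.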