Set Up GPUDirect RDMA
This guide covers the system configuration required to enable GPUDirect RDMA between an NVIDIA GPU and an FPGA card using the datadev driver.
Warning
You must install NVIDIA Open Kernel Modules
(nvidia-kernel-source-590-open), not the standard CUDA toolkit kernel
driver. The open module provides the nv-p2p.h peer-to-peer interface.
If comp_and_load_drivers.sh outputs
Could not find NVIDIA open drivers in /usr/src, the wrong driver type
is installed.
Note
For a tutorial walkthrough of these steps with more explanation, see Your First GPUDirect Transfer.
Prerequisites
Ubuntu 22.04 x86_64
NVIDIA GPU and FPGA card on the same PCIe bus
sudo access
Step 1 — Disable Display Manager Services
Disable display managers and the NVIDIA persistence daemon to prevent
Module XXX is in use errors during driver replacement:
sudo systemctl disable gdm
sudo systemctl disable lightdm
sudo systemctl disable sddm
sudo systemctl disable nvidia-persistenced
Step 2 — Remove Existing NVIDIA Drivers
sudo apt-get purge nvidia-* -y
sudo apt autoremove -y
Step 3 — Install NVIDIA Open Kernel Modules
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt update
sudo apt install cuda-toolkit-13-1 nvidia-kernel-source-590-open \
libnvidia-compute-590 cuda-compat-13-1 nvidia-firmware
sudo reboot
Warning
Review the package install list. NVIDIA packages may pull in a low-latency kernel image.
Step 4 — Configure GRUB
sudo nano /etc/default/grub
# Set GRUB_CMDLINE_LINUX to:
# GRUB_CMDLINE_LINUX="iommu=off nouveau.modeset=0 rd.driver.blacklist=nouveau"
sudo update-grub
sudo reboot
Step 5 — Build and Load Both Drivers
cd aes-stream-drivers
sudo ./data_dev/driver/comp_and_load_drivers.sh
What the script does:
Locates NVIDIA open driver sources by searching
/usr/srcfornv-p2p.hUnloads existing nvidia and datadev modules
Rebuilds and loads NVIDIA drivers with
NVreg_OpenRmEnableUnsupportedGpus=1andNVreg_EnableStreamMemOPs=1Builds and loads
datadev.ko
Verify
lsmod | grep datadev
lsmod | grep nvidia
Then in C, verify GPUDirect firmware support:
int fd = open("/dev/datadev_0", O_RDWR);
if (gpuIsGpuAsyncSupported(fd)) {
printf("GPUDirect supported\n");
}
For background on DMA buffer modes used by GPUDirect, see DMA Buffer Modes.