NVIDIA CUDA 7.5 on Fedora 23 with NVIDIA Optimus Technology

I have a Dell XPS 15 that has both Intel and NVIDIA graphics using NVIDIA’s Optimus Technology:

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2)

Getting CUDA 7.5 to work was straightforward using Bumblebee, as described on the Fedora Bumblebee page. I used the managed proprietary option to get the latest NVIDIA driver for my system. Both optirun and primusrun work, although they report very different frame rates in glxgears:

$ optirun glxgears
10162 frames in 5.0 seconds = 2032.270 FPS

$ primusrun glxgears
292 frames in 5.0 seconds = 58.370 FPS
primus: warning: dropping a frame to avoid deadlock
primus: warning: timeout waiting for display worker

I installed the CUDA Toolkit using the dnf repository provided by NVIDIA. It is important to install only the Toolkit from this repository, not the full CUDA package that the repository also provides (which would bring its own driver), since Bumblebee has already configured an appropriate Optimus-ready NVIDIA driver. So instead you can just run:

sudo dnf install cuda-toolkit-7-5

Additional configuration works as specified in the User’s Manual and Getting Started Guides. I had trouble getting the examples to compile because Fedora 23 defaults to gcc 5.3.1 and, unfortunately, CUDA 7.5 requires gcc 4.9 or earlier. I compiled GCC 4.9 from source, passing --program-suffix=49 to its configure script so that the installed binaries carry a 49 suffix, and then manually modified the makefile of each sample I wanted to compile to point to “g++49” instead of “g++”. Aside from that, compilation and execution were simple enough, with two catches:

  • It was necessary to override the library path to point to Bumblebee’s version of the CUDA shared library instead of a 32-bit version that somehow ended up in /usr/lib.
  • Contrary to what the Bumblebee documentation suggests, you must use optirun or primusrun to execute the samples. I’m not sure yet whether this is also required for more basic programs.

So, that looks something like this for the nbody example:

LD_LIBRARY_PATH=/usr/lib64/nvidia-bumblebee primusrun ./nbody
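
Typing that prefix for every sample gets tedious, so here is a minimal sketch of a wrapper function that handles both catches. The function name run_on_gpu and the variables CUDA_LIB_DIR and GPU_LAUNCHER are my own illustrative names, not anything Bumblebee defines, and the default directory is simply the one that worked on my machine.

```shell
# Hypothetical helper: run a CUDA binary through Bumblebee with its
# libraries searched first. Names and defaults are assumptions.
CUDA_LIB_DIR="${CUDA_LIB_DIR:-/usr/lib64/nvidia-bumblebee}"
GPU_LAUNCHER="${GPU_LAUNCHER:-primusrun}"   # or optirun

run_on_gpu() {
    # Prepend the Bumblebee directory so its libcuda is found before
    # the 32-bit copy in /usr/lib; keep any existing search path.
    LD_LIBRARY_PATH="${CUDA_LIB_DIR}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
        "$GPU_LAUNCHER" "$@"
}
```

With that in place, run_on_gpu ./nbody should be equivalent to the command above. If a sample still picks up the wrong library, running ldd ./nbody with the same LD_LIBRARY_PATH set is probably the quickest way to see which libcuda the loader resolves.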

[Image: cudaNBody_cropped]
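
Going back to the compiler workaround: editing each sample makefile by hand can be scripted. The following is a rough sketch, assuming GNU sed and assuming the makefile names the host compiler as a standalone g++ token. The function name use_gxx49 is hypothetical, and the configure line in the comment shows only the one option I actually mentioned; the source directory is a placeholder.

```shell
# The GCC build itself is left as a comment because it takes a long time.
# Only --program-suffix=49 comes from my setup; the path is a placeholder:
#
#   ../gcc-4.9-src/configure --program-suffix=49
#   make && sudo make install

# Hypothetical helper: rewrite standalone "g++" tokens in a makefile to
# "g++49", leaving names such as clang++ or an existing g++49 alone.
use_gxx49() {
    # $1: path to a sample makefile (edited in place, GNU sed assumed).
    # The :a / ta loop repeats the substitution until nothing matches,
    # so adjacent tokens are all handled.
    sed -E -i \
        -e ':a' \
        -e 's/(^|[^A-Za-z0-9_+-])g\+\+([^A-Za-z0-9_+-]|$)/\1g++49\2/' \
        -e 'ta' \
        "$1"
}
```

Calling use_gxx49 on a sample's Makefile and then running make in that directory should pick up the suffixed compiler. It may also be possible to skip the edit entirely by passing the compiler to nvcc through its -ccbin option, though I have not tried that with the samples.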