NVIDIA GPU Drivers (Optional)
Skip this section if you have no NVIDIA GPU. CPU-only setups work fine.
💡 Why Add a GPU?
An NVIDIA GPU dramatically accelerates AI inference. A 7B model that takes 15 seconds on CPU can respond in under 1 second with a GPU. Ollama automatically detects and uses NVIDIA GPUs via CUDA.
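To quantify the difference on your own hardware, you can time the same prompt before and after the GPU setup. A minimal sketch, assuming GNU `date` (the `bench` helper name is illustrative, and `llama3.2` is just an example model):

```bash
# Illustrative timing helper: wall-clock milliseconds for any command
bench() {
  start=$(date +%s%N)            # nanoseconds since the epoch (GNU date)
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo "elapsed: $(( (end - start) / 1000000 )) ms"
}

# Run once now (CPU) and again after the driver install (GPU) to compare
bench ollama run llama3.2 "Hello"
```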
Step 1 — Verify GPU is Detected
bash
# Check if your NVIDIA card is visible
lspci | grep -i nvidia
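# Also confirm the open-source nouveau driver is not in the way; it conflicts
# with NVIDIA's proprietary driver (the install below normally blacklists it)
lsmod | grep -i nouveau || echo "nouveau not loaded (good)"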
# Example output: 01:00.0 VGA: NVIDIA Corporation GA106 [RTX 3060] (rev a1)

Step 2 — Clean Any Existing NVIDIA Packages
bash
# Only run if you had a previous failed installation
sudo apt-get remove --purge 'libnvidia-.*'
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
sudo apt clean && sudo apt autoremove

Step 3 — Install NVIDIA Drivers
bash
# Add graphics drivers PPA
sudo add-apt-repository ppa:graphics-drivers/ppa --yes
sudo apt-get update
sudo update-pciids
# Install driver (570 is current stable — check nvidia.com for latest)
sudo apt-get install nvidia-driver-570 -y
sudo apt-get reinstall linux-headers-$(uname -r)
sudo update-initramfs -u
# Validate DKMS modules
sudo dkms status
# Reboot
sudo reboot

Step 4 — Verify GPU Driver
bash
# After reboot — check GPU status
nvidia-smi
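# For scripts, nvidia-smi can emit machine-readable output:
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
# nvidia-smi reports memory in MiB; convert to GiB when comparing model sizes:
echo "12288" | awk '{ printf "%.0f GiB\n", $1 / 1024 }'   # -> 12 GiB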
# Should show your GPU name, VRAM, driver version, and CUDA version

Step 5 — Install CUDA Toolkit
bash
# Download and install CUDA keyring
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
# Install CUDA toolkit
sudo apt-get install cuda-toolkit -y
sudo apt-get install nvidia-gds -y
# Verify and reboot
sudo dkms status
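# nvcc is installed under /usr/local/cuda (the default toolkit location) and
# is not on PATH by default; add it before checking the compiler version
export PATH=/usr/local/cuda/bin:$PATH
command -v nvcc >/dev/null && nvcc --version || echo "nvcc not on PATH - check the toolkit install"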
sudo reboot

Step 6 — Verify Ollama Detects the GPU
bash
# After reboot — test a model with GPU
ollama run llama3.2 "Hello" --verbose
# Look for "gpu layers" in the output — confirms GPU is being used
# Check which GPU Ollama is using
ollama ps
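# Scripted check (exact PROCESSOR text can vary between Ollama versions)
ollama ps | grep -q "GPU" && echo "GPU offload active" || echo "model is on CPU"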
# PROCESSOR column should show: GPU (not CPU)

GPU Compatibility Table
| GPU | VRAM | Models it Handles | Notes |
|---|---|---|---|
| RTX 3060 (12 GB) / RTX 4060 (8 GB) | 8–12 GB | 7B fully in GPU; 13B quantized on 12 GB | Great entry GPUs for AI |
| RTX 3090 / 4090 | 24 GB | 13B–34B models in GPU | Excellent performance |
| RTX 4060 Ti (16 GB) | 16 GB | 13B–20B models in GPU | Recommended mid-range |
| Tesla P100 | 16 GB | 13B–20B via CUDA | Great used/refurb option |
| Tesla V100 (32 GB) | 32 GB | 34B models in GPU | Used servers — great value |
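A rough rule of thumb for sizing a card against this table: a 4-bit-quantized model needs about 0.75 GB of VRAM per billion parameters, plus 1–2 GB for context and CUDA overhead. A quick sketch (the constants are approximations, not exact requirements):

```bash
# Approximate VRAM for a Q4-quantized model of N billion parameters:
# ~0.75 GB per billion parameters + ~1.5 GB context/CUDA overhead
estimate_vram() {
  awk -v p="$1" 'BEGIN { printf "%.2f GB\n", p * 0.75 + 1.5 }'
}

estimate_vram 7    # -> 6.75 GB  (fits an 8 GB card)
estimate_vram 13   # -> 11.25 GB (needs a 12 GB card)
estimate_vram 34   # -> 27.00 GB (needs 24-32 GB)
```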