Now that you have a grasp on how to estimate the amount of Video RAM (VRAM) and system RAM needed to run a specific Large Language Model, the next logical step is to determine the capabilities of your own hardware. Knowing your system's specifications is fundamental before attempting to download or run models, ensuring you have the necessary resources.
This section provides practical guidance on how to find the amount of VRAM available on your Graphics Processing Unit (GPU) and the total system RAM installed on your computer for common operating systems.
The amount of VRAM is often the most immediate limiting factor for running larger LLMs locally. Here’s how to check it:
On Windows:
Task Manager: The easiest way is often through the Task Manager. Press Ctrl+Shift+Esc to open it, switch to the Performance tab, and select your GPU; the "Dedicated GPU memory" value is your VRAM.
DirectX Diagnostic Tool: Press Win+R, type dxdiag, and press Enter. The Display tab lists the detected display memory.
GPU Manufacturer Software: If you have NVIDIA or AMD GPUs, their respective control panels (NVIDIA Control Panel or AMD Radeon Software) usually display hardware information, including VRAM, often under a "System Information" or similar section.
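If you prefer the command line, a PowerShell query can list the detected adapters. This is a quick sketch; note that the AdapterRAM field is a 32-bit value, so it may under-report cards with more than 4 GB of VRAM, making Task Manager the more reliable reading:
PS> Get-CimInstance Win32_VideoController | Select-Object Name, AdapterRAM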
On macOS:
About This Mac: This provides a quick overview. Click the Apple menu in the top-left corner and choose About This Mac; the window lists your chip or graphics card and the installed memory. On Apple Silicon Macs there is no separate VRAM: the unified memory is shared between the CPU and GPU.
System Information: For more detail, open the System Information app (hold the Option key and choose Apple menu > System Information) and select Graphics/Displays under Hardware.
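From a terminal, the system_profiler command prints the same details. On Intel Macs the report includes a VRAM line; on Apple Silicon it shows the chip, whose unified memory serves as VRAM:
~$ system_profiler SPDisplaysDataType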
On Linux:
NVIDIA GPUs (nvidia-smi): If you have an NVIDIA GPU and the appropriate drivers installed, the nvidia-smi command is the standard tool. Open a terminal, type nvidia-smi, and press Enter. The Memory-Usage column shows used and total VRAM:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06 Driver Version: 525.125.06 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 Off | N/A |
| 30% 40C P8 15W / 350W | 10MiB / 16384MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
...
Example output of the nvidia-smi command, showing 16384MiB (16 GiB) of total VRAM.
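If you only want the memory figure, nvidia-smi also has a query mode that prints selected fields as CSV; the output will look something like this:
~$ nvidia-smi --query-gpu=name,memory.total --format=csv
name, memory.total [MiB]
NVIDIA GeForce ..., 16384 MiB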
AMD GPUs (radeontop or lspci + System Info):
Use radeontop (may need installation, e.g., sudo apt install radeontop) for a live view of GPU activity and VRAM usage. If you have the ROCm stack, see the rocm-smi example after this list.
Run lspci | grep -i vga or lspci | grep -i display to identify your AMD GPU model, then look up its VRAM in the manufacturer's specifications.
glxinfo | grep "Video memory" might also work if the Mesa utilities are installed.
General (lspci and System Monitors): The lspci command combined with graphical system monitoring tools available in most desktop environments (like System Monitor on Ubuntu/GNOME) can often provide details about the detected graphics hardware.
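For AMD cards, if the ROCm stack is installed (an assumption; it requires a supported GPU and drivers), the rocm-smi tool can report VRAM directly:
~$ rocm-smi --showmeminfo vram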
System RAM is needed to load the operating system, applications, and potentially parts of the model or data if VRAM is insufficient (though this is much slower).
On Windows:
Task Manager: Open the Performance tab (press Ctrl+Shift+Esc) and select Memory; the total installed RAM is displayed in the top-right corner.
System Information: Press Win+R, type msinfo32, and press Enter. Look for "Installed Physical Memory (RAM)" in the System Summary.
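PowerShell can report this as well; for example, this one-liner sums the capacity of all installed memory modules and converts the result to GB:
PS> (Get-CimInstance Win32_PhysicalMemory | Measure-Object -Property Capacity -Sum).Sum / 1GB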
On macOS:
About This Mac: Click the Apple menu and choose About This Mac; the Memory line shows the total installed RAM.
System Information: For a detailed breakdown, open the System Information app (hold the Option key and choose Apple menu > System Information); the Hardware overview lists the installed memory.
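From a terminal, sysctl reports the total RAM in bytes; for example, 34359738368 bytes corresponds to 32 GiB:
~$ sysctl hw.memsize
hw.memsize: 34359738368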
On Linux:
free command: A standard command-line tool. Open a terminal, type free -h, and press Enter. The -h flag shows the values in human-readable format (e.g., Gi for gibibytes).
~$ free -h
total used free shared buff/cache available
Mem: 31Gi 5.8Gi 18Gi 1.2Gi 7.4Gi 24Gi
Swap: 2.0Gi 0B 2.0Gi
Example output of free -h, showing approximately 31GiB of total system RAM.
htop or similar tools: Tools like htop (often needs installation via sudo apt install htop or sudo yum install htop) provide a more interactive view of system resources, with total RAM typically shown at the top.
Graphical System Monitors: Most Linux desktop environments include a graphical system monitor (e.g., GNOME System Monitor, KSysGuard) that displays total RAM, usually on a "Resources" or "System" tab.
By following these steps, you can quickly determine the VRAM and RAM available on your system. Comparing these specifications against the estimated requirements for an LLM (using the methods discussed earlier in this chapter) will give you a clear idea of whether your hardware is suitable for running that particular model effectively. Remember that besides memory, the GPU's processing power (related to its model and architecture, like CUDA cores or Tensor cores for NVIDIA) also significantly impacts performance, though VRAM is often the first bottleneck you'll encounter.