We've seen that the GPU is a specialized processor optimized for parallel computations, which are common in AI tasks. But just like the CPU needs quick access to data stored in system RAM, the GPU needs its own high-speed memory. This dedicated memory is called Video RAM, or VRAM.
Think of VRAM as memory built directly onto the graphics card, sitting right next to the GPU cores. This physical proximity matters: system RAM, while large, sits farther from the GPU, and accessing it is much slower than accessing the GPU's own VRAM.
VRAM stands for Video Random Access Memory. It's a type of RAM specifically designed to work with a Graphics Processing Unit (GPU). Its primary job is to store data that the GPU needs to access quickly. In traditional graphics applications, this includes things like textures, frame buffers, and complex 3D model data. For AI and specifically Large Language Models, VRAM takes on a new, significant role.
Relationship between CPU, System RAM, GPU, and VRAM. VRAM provides fast, dedicated memory located directly on the graphics card for the GPU.
Large Language Models, as we learned in Chapter 1, are defined by their parameters. These parameters represent the learned knowledge of the model. When you want to use an LLM (a process called inference), these parameters need to be loaded into memory where the processor can access them.
Since GPUs are exceptionally good at the types of calculations needed for LLMs, we want the GPU to do the heavy lifting. For the GPU to work efficiently, the model's parameters need to be stored in its local, high-speed memory: the VRAM.
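If you have a GPU available, you can check its VRAM capacity before trying to load a model. The minimal sketch below assumes PyTorch with CUDA support is installed; it simply reports the total memory of the first GPU it finds.

```python
# Rough check of how much VRAM the GPU offers, assuming PyTorch
# is installed and a CUDA-capable GPU is present.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)  # first GPU
    total_vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {total_vram_gb:.1f} GB")
else:
    print("No CUDA GPU detected; work would fall back to the CPU and system RAM.")
```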
Imagine a chef (the GPU) needing ingredients (model parameters) to cook a complex dish (generate text). VRAM is like a dedicated, well-organized prep station right next to the chef, holding all the immediately needed ingredients. System RAM is like the main pantry down the hall. While the pantry holds more overall, constantly running back and forth slows down the cooking process considerably. If the required ingredients don't fit on the prep station (VRAM), the chef's work becomes inefficient.
The most discussed specification of VRAM is its capacity, usually measured in gigabytes (GB). This capacity directly limits the size of the LLM that can be run efficiently on that GPU.
Here's the basic connection: the memory required to hold a model's parameters must fit within the GPU's VRAM for the model to run entirely on the GPU.
If an LLM requires, for example, 14 GB of memory for its parameters, but your GPU only has 8 GB of VRAM, you generally cannot load the entire model onto the GPU directly. While there are techniques to manage this (like splitting the model or using system RAM), they significantly reduce performance because the GPU is constantly waiting for data to be transferred from slower memory locations.
Therefore, the amount of VRAM available on a GPU is often the primary hardware constraint when deciding if you can run a specific LLM.
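You can estimate a model's parameter memory yourself by multiplying its parameter count by the bytes used per parameter. The sketch below is a rough illustration of that arithmetic; the 7-billion-parameter example and the 2-bytes-per-parameter (FP16) assumption are chosen to match the 14 GB figure above, and the estimate ignores the extra memory needed for activations and framework overhead.

```python
def estimate_parameter_memory_gb(num_parameters, bytes_per_parameter=2):
    """Memory needed just to hold the parameters, in decimal gigabytes.

    bytes_per_parameter: 4 for FP32, 2 for FP16/BF16, 1 for 8-bit formats.
    """
    return num_parameters * bytes_per_parameter / 1e9

# Example: a 7-billion-parameter model stored in FP16 (2 bytes per parameter)
print(f"{estimate_parameter_memory_gb(7e9, 2):.0f} GB")  # 14 GB, more than an 8 GB card holds
```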
Besides capacity, the speed at which data can be moved between the VRAM and the GPU cores is also important. This is called memory bandwidth, typically measured in gigabytes per second (GB/s).
Higher memory bandwidth allows the GPU cores to be fed data more quickly, preventing them from sitting idle waiting for the next piece of information (like parameters or intermediate calculations). For LLMs, where vast amounts of data (parameters) are constantly being accessed, high VRAM bandwidth contributes significantly to faster response times (lower latency) during inference.
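A simple way to see why bandwidth matters: during text generation, the GPU reads essentially the entire set of parameters from VRAM for every token it produces, so memory bandwidth puts a ceiling on token throughput. The sketch below illustrates this simplified upper bound; the bandwidth and model-size numbers are placeholders, and real-world throughput is lower because of other overheads.

```python
def tokens_per_second_upper_bound(bandwidth_gb_per_s, model_size_gb):
    """Simplified ceiling on generation speed: each token requires
    streaming the full parameter set from VRAM once."""
    return bandwidth_gb_per_s / model_size_gb

# Example: ~1000 GB/s of VRAM bandwidth feeding a 14 GB model
print(f"{tokens_per_second_upper_bound(1000, 14):.0f} tokens/second at most")  # about 71
```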
Consumer-grade GPUs typically have lower VRAM capacity and bandwidth compared to professional or data-center GPUs designed for AI workloads.
In summary, VRAM is the GPU's dedicated, high-speed memory. Its capacity is a primary factor determining which LLMs can be run efficiently, as the model's parameters must fit into this space for optimal performance. Its bandwidth influences how quickly the GPU can process these parameters. Understanding VRAM is essential as we move towards estimating the hardware needed for different LLM sizes.