Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems (NeurIPS). DOI: 10.48550/arXiv.1706.03762 - Describes the Transformer architecture, which is fundamental to modern large language models, explaining the self-attention mechanism that underlies their sequence-modeling and pattern-matching capabilities.
Language Models are Few-Shot Learners, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei, 2020. Advances in Neural Information Processing Systems (NeurIPS). DOI: 10.48550/arXiv.2005.14165 - Presents GPT-3, demonstrating how increased scale in parameters and training data enables large language models to perform a wide array of tasks with minimal task-specific data.
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu, 2019. Journal of Machine Learning Research (JMLR). DOI: 10.48550/arXiv.1910.10683 - Details the Text-to-Text Transfer Transformer (T5) model, offering a comprehensive study of transfer learning techniques and showing how various NLP tasks can be framed as text-to-text problems.
On the Opportunities and Risks of Foundation Models, Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Percy Liang, et al., 2021. arXiv (Stanford Institute for Human-Centered Artificial Intelligence (HAI)). DOI: 10.48550/arXiv.2108.07258 - Introduces the concept of foundation models, of which large language models are a prominent type, discussing their shared capabilities and implications across various applications.