While the CPU handles general computing tasks and RAM provides the main workspace, the Graphics Processing Unit (GPU) and its dedicated memory, Video RAM (VRAM), play a significant role in accelerating Large Language Model (LLM) operations. Think of your CPU as a highly skilled manager capable of complex, sequential tasks, and your GPU as a massive team of workers optimized for performing many simple, repetitive calculations simultaneously.
Why GPUs Excel at Running LLMs
LLMs perform vast numbers of mathematical operations, primarily matrix multiplications and related calculations on large multi-dimensional arrays of numbers called tensors. These operations are inherently parallel, meaning many calculations can happen independently at the same time.
- CPUs: Have a few powerful cores designed for complex, varied tasks executed sequentially or with limited parallelism. They are less efficient when faced with the massive parallel workload of an LLM.
- GPUs: Contain thousands of smaller, specialized cores designed explicitly for parallel computation. This architecture makes them exceptionally fast at the kind of math LLMs rely on, often outperforming CPUs by a significant margin for these specific tasks.
Running an LLM is like needing to calculate the trajectories of thousands of falling leaves at the same time. A CPU (the manager) would work through them one by one or a few at a time, taking a long time. A GPU (the large team) assigns each leaf to a different worker, finishing the job much faster.
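To make this concrete, here is a minimal pure-Python sketch showing why matrix multiplication parallelizes so well: each output cell depends only on one row and one column of the inputs, so every cell could, in principle, be computed by a different worker at the same time.

```python
# Each output cell of a matrix multiplication depends only on one row of A
# and one column of B, so all cells can be computed independently -- this is
# the parallelism a GPU exploits with its thousands of cores.

def matmul_cell(A, B, i, j):
    """Compute a single cell (i, j) of the product A @ B."""
    return sum(A[i][k] * B[k][j] for k in range(len(B)))

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]

# On a CPU we loop over the cells one after another; a GPU hands each
# (i, j) pair to a separate core and computes them all at once.
C = [[matmul_cell(A, B, i, j) for j in range(len(B[0]))] for i in range(len(A))]
print(C)  # [[19, 22], [43, 50]]
```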
The Importance of VRAM
Just as your computer needs RAM for the CPU to work with data, the GPU needs its own dedicated, high-speed memory: Video RAM (VRAM).
- What it Stores: VRAM holds the LLM's parameters (often called "weights"), which are the numerical values learned during the model's training that define its behavior. It also stores intermediate calculation results needed during text generation (activations).
- Direct Impact on Performance: For the GPU to work at maximum speed, the entire model and its working data need to fit into VRAM. If the model is too large for the available VRAM, parts of it must be constantly swapped between the slower system RAM and VRAM, or even offloaded to the CPU. This swapping process drastically reduces performance, often making the LLM feel sluggish or unresponsive.
The amount of VRAM your GPU has directly limits the size of the LLM you can run efficiently.
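If you have an NVIDIA GPU and PyTorch installed, one quick way to check how much VRAM you have to work with is sketched below; the nvidia-smi command-line tool reports the same figure.

```python
# Quick VRAM check -- assumes PyTorch is installed. On a machine without a
# CUDA-capable GPU this simply reports that none was found.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No CUDA-capable GPU detected; models would run on the CPU.")
```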
Estimating VRAM Requirements
Model size is typically measured in billions of parameters (e.g., 7B, 13B, 70B). The amount of VRAM needed depends on the model's size and its format, specifically how efficiently it's stored (a concept called quantization, which we'll cover in Chapter 3).
Here are some very rough estimates for running quantized models (which are smaller and commonly used locally) in formats like GGUF:
- Small Models (e.g., ~3B parameters): Might run reasonably on systems with 4GB VRAM, although 6GB+ is better.
- Medium Models (e.g., 7B-8B parameters): Often require at least 6GB to 8GB of VRAM for comfortable performance.
- Larger Models (e.g., 13B parameters): Typically need 10GB to 12GB VRAM or more.
- Very Large Models (e.g., 30B+ parameters): Requirements increase significantly, often needing 24GB, 48GB, or even more VRAM. These usually require high-end consumer or professional GPUs.
These figures approximate the VRAM needed just to load common quantized models (e.g., in the Q4_K_M GGUF format); actual usage varies with quantization level, software, and context size.
Remember, these are estimates for running models (inference), not training them, which requires substantially more resources. Quantization techniques, discussed later, allow larger models to fit into less VRAM by representing the parameters with less precision, often with only a small impact on output quality.
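As a back-of-the-envelope sketch of where these numbers come from, you can approximate the memory needed for the weights from the parameter count and the bits stored per parameter. The function below is illustrative only; the 1.2x overhead factor is an assumption to leave a little room for activations, not a precise rule.

```python
def estimate_vram_gb(params_billions, bits_per_param=4, overhead=1.2):
    """Very rough VRAM estimate (in GB) for holding a quantized model.

    bits_per_param: ~4 for Q4-style quantization, 16 for half precision.
    overhead: loose multiplier to leave room for activations -- an
    assumption for illustration, not a precise rule.
    """
    weight_bytes = params_billions * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1024**3

for size in (3, 7, 13, 30):
    print(f"{size}B at 4-bit: ~{estimate_vram_gb(size):.1f} GB")
# Prints roughly 1.7, 3.9, 7.3, and 16.8 GB respectively.
```

These figures sit a little below the comfortable ranges listed above because those ranges also leave headroom for the context window, the operating system, and any other software using the GPU.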
GPU Types and Software Compatibility
Several types of GPUs exist, and software support can vary (a quick way to check what your own system exposes is sketched after this list):
- NVIDIA GPUs (GeForce, RTX, Quadro): Generally offer the best performance and widest software compatibility for LLMs due to NVIDIA's mature CUDA parallel computing platform. Most LLM tools are optimized for CUDA first. Look for GPUs with higher CUDA core counts and, importantly, sufficient VRAM (e.g., RTX 3060 12GB, RTX 3090/4090 24GB).
- AMD GPUs (Radeon RX): Performance for LLMs is improving, and AMD's ROCm software stack provides an alternative to CUDA. However, compatibility with all LLM tools might sometimes require extra configuration steps or specific software versions compared to NVIDIA. Check the documentation of the tools you plan to use (like Ollama, LM Studio, llama.cpp) for AMD support status.
- Apple Silicon (M1, M2, M3 series): These chips use a unified memory architecture, meaning the CPU and GPU share the same pool of system memory. While you won't see a separate VRAM spec, the system's total RAM effectively acts as VRAM for the GPU. This makes Macs with 16GB or more RAM surprisingly capable of running medium-sized local LLMs using Apple's Metal API. Software support through tools like Ollama and LM Studio is generally excellent.
- Intel Integrated Graphics (Iris Xe, Arc): While Intel is improving its GPU offerings with the Arc series, the integrated graphics found in most laptops lack the dedicated memory and processing power needed to run anything but the smallest or most heavily quantized LLMs. A dedicated GPU is usually required for a good experience.
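If you are unsure which of these backends your own system exposes, the short check below (assuming PyTorch is installed) reports what it can see; note that AMD's ROCm builds of PyTorch report through the same CUDA interface.

```python
# Report which acceleration backend PyTorch can see -- assumes PyTorch is
# installed. AMD's ROCm builds of PyTorch also report through the CUDA API.
import torch

if torch.cuda.is_available():
    print("CUDA backend available:", torch.cuda.get_device_name(0))
elif torch.backends.mps.is_available():
    print("Apple Metal (MPS) backend available.")
else:
    print("No GPU backend detected; inference would fall back to the CPU.")
```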
What If You Don't Have a Powerful GPU?
Don't worry if you don't have a high-end GPU! Many LLM tools, especially those using the GGUF format via libraries like llama.cpp (which powers parts of Ollama and LM Studio), are designed to run effectively on CPUs.
While inference will be noticeably slower on a CPU compared to a capable GPU, you can still run smaller and medium-sized quantized models. Having sufficient system RAM (as discussed previously) becomes even more important in this scenario, as the model will primarily reside there. Chapters 3 and 4 will guide you in selecting appropriate models and tools that work well even without a dedicated GPU.
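As one concrete illustration, the llama-cpp-python bindings (the Python interface to llama.cpp) let you keep every layer on the CPU; the model path below is a placeholder for whatever GGUF file you have downloaded.

```python
# CPU-only inference sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path is a placeholder -- point it at any GGUF file you have downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-7b-q4_k_m.gguf",  # placeholder path
    n_gpu_layers=0,  # 0 = keep every layer on the CPU
    n_ctx=2048,      # context window; larger values need more system RAM
)

output = llm("Q: What does VRAM store when running an LLM? A:", max_tokens=64)
print(output["choices"][0]["text"])
```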
Understanding your GPU and VRAM helps set expectations for performance and guides your model selection later. It's a significant factor in how quickly your local LLM can generate responses.