Imagine you're having a conversation. You naturally remember what was said recently, but details from much earlier might fade. Large Language Models have something similar: a limited working memory called the context window.
The context window is the amount of text the model can "see" or "remember" when generating its next response. This includes both your most recent prompt and the preceding parts of the conversation (or the document you provided). Think of it as the model's short-term memory. Everything you input and everything the model outputs contributes to filling this window.
Remember tokens from Chapter 1? Tokens are the basic units of text that LLMs process, often representing parts of words, whole words, or punctuation. The size of the context window is typically measured in the number of tokens it can hold. A model might have a context window of 2048 tokens, 4096 tokens, or something much larger, such as 32,000 tokens or more. A larger number means the model can consider more text at once.
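If you want a feel for how text breaks into tokens, you can experiment with a tokenizer library. The snippet below is a minimal sketch using the `tiktoken` package; local models ship with their own tokenizers, so the exact counts you get will differ from model to model, and the sample sentence here is just illustrative.

```python
# A minimal sketch of counting tokens with the tiktoken library.
# Different models use different tokenizers, so treat the count as approximate.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # a common general-purpose encoding

text = "Local LLMs keep your data on your own machine."
tokens = encoding.encode(text)

print(f"Characters: {len(text)}")
print(f"Tokens:     {len(tokens)}")
# A rough rule of thumb for English text is about 3-4 characters per token.
```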
The size of the context window directly affects how well the model can handle tasks such as summarizing long documents, answering questions about material you paste in, and staying coherent across extended conversations.
Think of the context window like a fixed-size sliding window moving over your conversation history. As new text (your prompt or the model's response) is added, the window slides forward. If the total text exceeds the window size, the oldest text falls out of view.
Let's illustrate this with a simplified example. Imagine a model with a very small context window, say, enough for only 15 words.
Interaction:

- Turn 1 (You): "My cat is named Pixel."
- Turn 2 (You): "What exactly is a context window?" (the model replies with a short explanation)
- Turn 3 (You): "What is my pet's name?"
- Turn 3 (Model): "I'm sorry, you haven't mentioned a pet in this conversation."
In this simplified case, because "cat" fell out of the small context window, the model couldn't recall it. Real LLMs have much larger windows, but the principle is the same.
A simplified representation showing how, at Turn 3, the information from Turn 1 might fall outside the limited context window, leading the model to only recall information from Turn 2 and Turn 3.
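To see the same trimming logic in code, here is a small Python sketch. It counts words instead of real tokens to keep things simple; actual tools count tokens with the model's tokenizer, but the idea of dropping the oldest messages once the window fills up is the same, and the messages below are just the toy example from above.

```python
# A simplified sketch of context-window trimming.
# Words stand in for tokens here; real tools count tokens with the model's tokenizer.
CONTEXT_WINDOW = 15  # maximum "tokens" (words) the model can see at once


def trim_to_window(messages, window=CONTEXT_WINDOW):
    """Keep only the most recent messages that still fit inside the window."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        length = len(message.split())
        if used + length > window:
            break  # the next-oldest message no longer fits
        kept.append(message)
        used += length
    return list(reversed(kept))  # restore chronological order


history = [
    "My cat is named Pixel.",                         # oldest (Turn 1)
    "What exactly is a context window?",              # Turn 2
    "Can you explain tokens again in simple terms?",  # newest (Turn 3)
]
print(trim_to_window(history))
# The Turn 1 message about the cat no longer fits, so the model never sees it.
```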
Different models come with different context window sizes. When you were learning about finding models (Chapter 3), you might have seen specifications like "4k context" or "32k context". This refers to the approximate number of tokens (4096 or 32768, respectively) the model supports. Larger context windows generally require more computational resources (especially RAM and VRAM, as discussed in Chapter 2).
For basic chats, even a moderately sized context window (e.g., 4096 tokens) is often sufficient. However, if you plan to work with very long documents or have extended, complex dialogues, choosing a model with a larger context window becomes more important.
You generally don't need to manually count tokens. The tools you use (like Ollama or LM Studio, covered in Chapter 4) typically manage the context window automatically, trimming the oldest parts of the conversation as needed. However, being aware of this limit helps you understand why a model might sometimes seem to "forget" things you mentioned earlier in a long chat. If you notice this happening, you might need to remind the model by re-stating significant information in your prompt.
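That said, some tools do let you adjust the window yourself. For example, Ollama exposes a `num_ctx` option that sets the context length for a request; the sketch below assumes Ollama is running locally with a model already pulled, and the model name and window size are only illustrative.

```python
# A sketch of requesting a larger context window from Ollama's local API.
# Assumes Ollama is running locally and the "llama3" model has been pulled.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Summarize the document I pasted above.",
        "stream": False,
        "options": {"num_ctx": 8192},  # ask for an 8k-token context window
    },
)
print(response.json()["response"])
```

Keep in mind that raising `num_ctx` increases memory use, in line with the hardware considerations from Chapter 2.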
Understanding the context window is fundamental to effective prompting. It helps explain the model's conversational memory limits and guides you in structuring interactions, especially as they become longer or more complex.