On the Opportunities and Risks of Foundation Models, Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, et al., 2021, arXiv preprint arXiv:2108.07258 (Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI)) - Defines foundation models, which serve as the base for many specialized LLMs, and discusses their characteristics and societal impact.
Language Models are Few-Shot Learners, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei, 2020, Advances in Neural Information Processing Systems (NeurIPS 2020), DOI: 10.48550/arXiv.2005.14165 - Introduces GPT-3, a prominent example of a general-purpose LLM, demonstrating its ability to perform various tasks with minimal examples.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, 2019, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (Association for Computational Linguistics), DOI: 10.18653/v1/N19-1423 - Introduces BERT and details the pre-training and fine-tuning paradigm, which is central to creating specialized LLMs from foundation models.