Tables store data, with each row representing a single item or record (like a specific customer, product, or order), and each column representing an attribute or characteristic of that item (like name, price, or order date).
Now, imagine a table with thousands, or even millions, of rows. If you need to find, update, or delete one specific row, how can you guarantee you're working with the correct one? What if multiple customers share the same name, or several products have the same price? We need a reliable way to pinpoint exactly one row.
This is where the Primary Key comes in. A primary key is a column, or sometimes a set of columns, designated to uniquely identify every single row within a table. Think of it like a unique identification number for each record. Just as no two citizens should have the same social security number, no two rows in a table can have the same primary key value.
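Primary keys have two essential properties enforced by the Database Management System (DBMS):
Uniqueness: No two rows in the table can have the same primary key value.
Non-nullity: A primary key column cannot contain NULL values. NULL represents an unknown or missing value. Since the primary key's purpose is to uniquely identify a row, it must have a definite value for every single row. You can't identify something with an "unknown" identifier.
Using primary keys offers several significant advantages. They guarantee that each row is unique, give you a reliable way to retrieve one specific record, and make it possible to link related data across tables. For example, updating or deleting a customer by their CustomerID primary key ensures accuracy.
When designing a table, you need to decide which column(s) will serve as the primary key. There are generally two approaches: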
Natural Keys: These are columns that already exist as part of the data and naturally possess the properties of uniqueness and non-nullity (or are expected to). Examples might include:
ISBN (International Standard Book Number) for a Books table.
UserEmail for a Users table (assuming email addresses must be unique).
ProductSKU (Stock Keeping Unit) for a Products table.
While natural keys can be intuitive, they sometimes have drawbacks. What if an organization changes its SKU format? What if a user needs to change their email address? Changing a primary key value can be complex because it might be used to link to other tables. Also, some natural identifiers might contain sensitive information (like a government ID number) or might not be guaranteed to be unique forever.
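As a rough illustration of a natural key, the sketch below uses Python's built-in sqlite3 module with an in-memory database (the choice of SQLite is an assumption, not something this course prescribes). The Books table and ISBN column come from the examples above; the Title column and the sample ISBN value are made up. NOT NULL is written out explicitly because SQLite, for historical reasons, does not imply it for non-integer primary keys, whereas most other systems do.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# ISBN is a natural key: it already exists as part of the data.
# NOT NULL is spelled out because SQLite does not imply it here.
conn.execute(
    "CREATE TABLE Books (ISBN TEXT PRIMARY KEY NOT NULL, Title TEXT)"
)

# Placeholder ISBN and title, used only for illustration.
conn.execute(
    "INSERT INTO Books (ISBN, Title) VALUES (?, ?)",
    ("978-0-00-000000-2", "Example Title"),
)


def try_insert(isbn, title):
    """Return None on success, or the DBMS error message on rejection."""
    try:
        conn.execute(
            "INSERT INTO Books (ISBN, Title) VALUES (?, ?)", (isbn, title)
        )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Same ISBN again: violates uniqueness.
duplicate_error = try_insert("978-0-00-000000-2", "Another Title")

# Missing ISBN: violates non-nullity.
null_error = try_insert(None, "Book Without ISBN")

book_count = conn.execute("SELECT COUNT(*) FROM Books").fetchone()[0]
print(duplicate_error)
print(null_error)
print(book_count)
```

Both rejected inserts leave the table unchanged, so only the first book remains.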
Surrogate Keys (or Artificial Keys): These are columns added to the table specifically to serve as the primary key, having no inherent meaning outside of identifying the row. The most common type is an auto-incrementing integer. When a new row is added, the database automatically assigns the next sequential integer value to this column (e.g., 1, 2, 3, ...). Examples include CustomerID, ProductID, OrderID.
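The auto-incrementing behavior can be sketched with Python's sqlite3 module. This is a minimal example under assumptions: SQLite is used only because it ships with Python, the Name column and the sample names are hypothetical, and other database systems spell auto-increment differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CustomerID is a surrogate key: it exists only to identify the row.
conn.execute(
    "CREATE TABLE Customers ("
    " CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " Name TEXT NOT NULL)"
)

assigned_ids = []
# Two customers share a name; the database still tells them apart.
for name in ["Alice", "Bob", "Alice"]:
    # CustomerID is omitted, so the database assigns the next integer.
    cur = conn.execute("INSERT INTO Customers (Name) VALUES (?)", (name,))
    assigned_ids.append(cur.lastrowid)

print(assigned_ids)  # [1, 2, 3] on a fresh table
```

The application never chooses the key value itself, which is what keeps the key independent of the data it identifies.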
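Surrogate keys are often preferred because they avoid these drawbacks: their values carry no business meaning and so never need to change, they expose no sensitive information, and the database itself guarantees that they stay unique.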
Let's visualize this with a simple Products table example:
Figure: A simple Products table using ProductID as a surrogate primary key (PK). Notice that even though "Wireless Mouse" appears twice, each row is uniquely identifiable by its ProductID.
In this example, ProductID is a surrogate key. It uniquely identifies each product row. Even if we have two different products both named "Wireless Mouse", their ProductID values (101 and 105) are different, allowing us to distinguish between them. Trying to insert another product with ProductID 101 would result in an error from the DBMS.
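The behavior described above can be tried out with a small sketch using Python's sqlite3 module. The ProductID values 101 and 105 and the name "Wireless Mouse" come from the example; the Price column, the prices, and the "USB Keyboard" row are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    " ProductID INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Price REAL)"
)

# Two distinct products with the same name (prices are made up).
conn.executemany(
    "INSERT INTO Products (ProductID, Name, Price) VALUES (?, ?, ?)",
    [(101, "Wireless Mouse", 19.99), (105, "Wireless Mouse", 24.99)],
)

# Searching by name is ambiguous: it matches both rows.
ids_by_name = [
    row[0]
    for row in conn.execute(
        "SELECT ProductID FROM Products WHERE Name = ? ORDER BY ProductID",
        ("Wireless Mouse",),
    )
]

# Searching by primary key matches exactly one row.
rows_by_key = conn.execute(
    "SELECT Name, Price FROM Products WHERE ProductID = ?", (105,)
).fetchall()

# Reusing ProductID 101 is rejected by the database.
try:
    conn.execute(
        "INSERT INTO Products (ProductID, Name, Price) VALUES (?, ?, ?)",
        (101, "USB Keyboard", 29.99),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(ids_by_name, len(rows_by_key), duplicate_rejected)
```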
Sometimes, a single column isn't enough to uniquely identify a row. In such cases, you can define a composite primary key, which consists of two or more columns combined. The combination of values in these columns must be unique for each row.
For example, consider a table storing items within customer orders, perhaps called OrderItems. A single OrderID wouldn't be unique (an order can have multiple items), and a single ProductID wouldn't be unique (a product can be in multiple orders). However, the combination of OrderID and ProductID is likely unique: you typically wouldn't list the exact same product twice within the same order. So, (OrderID, ProductID) could serve as a composite primary key for the OrderItems table.
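A composite key like this might be declared as in the following sketch, again using Python's sqlite3 module as an assumed environment. The Quantity column and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The primary key is the pair (OrderID, ProductID), not either column alone.
conn.execute(
    "CREATE TABLE OrderItems ("
    " OrderID INTEGER NOT NULL,"
    " ProductID INTEGER NOT NULL,"
    " Quantity INTEGER NOT NULL,"
    " PRIMARY KEY (OrderID, ProductID))"
)

# OrderID 1 repeats and ProductID 101 repeats, but every pair is distinct.
conn.executemany(
    "INSERT INTO OrderItems (OrderID, ProductID, Quantity) VALUES (?, ?, ?)",
    [(1, 101, 2), (1, 105, 1), (2, 101, 5)],
)

# The pair (1, 101) already exists, so this insert is rejected.
try:
    conn.execute(
        "INSERT INTO OrderItems (OrderID, ProductID, Quantity) VALUES (?, ?, ?)",
        (1, 101, 3),
    )
    repeated_pair_rejected = False
except sqlite3.IntegrityError:
    repeated_pair_rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM OrderItems").fetchone()[0]
print(repeated_pair_rejected, item_count)
```

Identifying one row in this table requires supplying both values, for example OrderID 1 together with ProductID 105.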
For this introductory course, we'll mostly focus on single-column primary keys, but it's useful to know that composite keys exist for more complex scenarios.
In summary, the primary key is a fundamental concept in relational databases. It enforces uniqueness for each row, provides a reliable way to identify specific records, and plays an essential role in linking related data across tables, which we will explore next when we discuss Foreign Keys.