In the world of data, information isn't always complete. Sometimes, a value for a particular column in a row might be missing, unknown, or simply not applicable. SQL represents this absence of value using a special marker called NULL
. It's important to understand that NULL
is not the same as zero (0) for numbers, an empty string ('') for text, or any other specific value. NULL
specifically means "no value present".
When filtering data using the WHERE
clause, you might intuitively try to use standard comparison operators like =
or <>
(or !=
) to find rows with or without NULL
values. For example, you might try:
-- This does NOT work as expected
SELECT product_name, price
FROM products
WHERE price = NULL;
Or maybe:
-- This also does NOT work as expected
SELECT product_name, price
FROM products
WHERE price <> NULL;
Surprisingly, neither of these queries will typically return the rows you expect. This is because NULL
represents an unknown state. Comparing anything to an unknown value, even another unknown value, results in UNKNOWN
. In the context of a WHERE
clause, conditions that evaluate to UNKNOWN
are treated like FALSE
, meaning the rows are not included in the result set.
So, how do you correctly filter for rows where a value is missing or present? SQL provides specific operators just for this purpose: IS NULL
and IS NOT NULL
.
To select rows where a specific column contains a NULL
value, you use the IS NULL
operator.
Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE column_name IS NULL;
Example:
Imagine we have a customers
table, and some customers haven't provided their email address. To find these customers, you would write:
SELECT customer_id, first_name, last_name
FROM customers
WHERE email IS NULL;
This query will return all rows from the customers
table where the email
column explicitly contains the NULL
marker, indicating a missing email address.
Conversely, to select rows where a specific column does contain a value (i.e., it is not NULL
), you use the IS NOT NULL
operator.
Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE column_name IS NOT NULL;
Example:
If you want to retrieve only the customers who have provided an email address, you would use:
SELECT customer_id, first_name, last_name, email
FROM customers
WHERE email IS NOT NULL;
This query returns only those rows from the customers
table where the email
column has some value present, excluding those marked as NULL
.
You can easily combine IS NULL
and IS NOT NULL
with other filtering conditions using AND
and OR
.
Example:
Let's find all products that are either out of stock (quantity is 0) or whose price hasn't been set (price
is NULL
):
SELECT product_name, price, quantity
FROM products
WHERE quantity = 0 OR price IS NULL;
Example:
Find all customers from 'California' who have provided an email address:
SELECT first_name, last_name, email
FROM customers
WHERE state = 'CA' AND email IS NOT NULL;
Understanding and correctly using IS NULL
and IS NOT NULL
is fundamental when working with real-world datasets, which often contain missing information. Being able to identify or exclude rows based on the presence or absence of data is a necessary step in data cleaning and preparation for analysis. For data scientists, recognizing and handling missing values appropriately is a significant part of ensuring the reliability of analytical results.
© 2025 ApX Machine Learning