Now that we've covered how digital images are structured, let's get hands-on and use Python with the OpenCV library to inspect these properties directly. This practice will solidify your understanding of dimensions, color channels, data types, and pixel values.
First, ensure you have OpenCV installed (pip install opencv-python) and an image file available (let's call it example_image.jpg) in the same directory as your Python script or notebook.
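To verify the installation, you can print the library's version string (optional, but a quick way to confirm the install worked):

import cv2

# Print the installed OpenCV version string, e.g. '4.x.y'
print(cv2.__version__)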
We begin by importing the OpenCV library and loading our image using the imread function, which we saw briefly in the previous section.
import cv2
import numpy as np  # Often used with OpenCV

# Load an image from file
# The '1' flag loads it in color (BGR format). Use '0' for grayscale.
image_path = 'example_image.jpg'  # Replace with your image file path
img = cv2.imread(image_path, 1)

# Check if the image was loaded successfully
if img is None:
    print(f"Error: Could not read image file at {image_path}")
else:
    print("Image loaded successfully!")
    # We'll add more code here in the next steps
If the image loads correctly, the img variable now holds the image data as a NumPy array. If you get the error message, double-check that the image file path is correct and that the file exists.
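As a quick sanity check, you can confirm the type yourself (assuming the load above succeeded):

# cv2.imread returns a numpy.ndarray on success, and None on failure
print(type(img))  # <class 'numpy.ndarray'>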
One of the most fundamental properties is the image's size. In NumPy arrays, this is stored in the shape attribute.
# Make sure the image loaded before proceeding
if img is not None:
    # Get the dimensions of the image
    height, width, channels = img.shape
    print(f"Image Height: {height} pixels")
    print(f"Image Width: {width} pixels")
    print(f"Number of Color Channels: {channels}")

    # Calculate total number of pixels
    total_pixels = height * width
    print(f"Total Pixels: {total_pixels}")
When you run this code with a color image, img.shape returns a tuple with three values: (height, width, channels).
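A related NumPy attribute, ndim, reports the number of dimensions directly; it's an equivalent way to tell color arrays (3 dimensions) from grayscale ones (2 dimensions):

# ndim is equivalent to len(img.shape)
print(img.ndim)  # 3 for a color (BGR) image, 2 for grayscale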
What if it's a grayscale image?
If you load an image in grayscale (cv2.imread(image_path, 0)), the shape attribute will only have two values: (height, width). There's no third dimension for color channels because each pixel has only one intensity value. You'd need to adjust the code slightly to handle this:
# Load as grayscale
img_gray = cv2.imread(image_path, 0)

if img_gray is not None:
    # Check dimensions - unpacking three values would raise an error for a 2D array
    if len(img_gray.shape) == 2:
        height, width = img_gray.shape
        channels = 1  # Grayscale has 1 channel conceptually
        print(f"(Grayscale) Height: {height}, Width: {width}, Channels: {channels}")
    # Handle a color image loaded unexpectedly (shouldn't happen with flag 0)
    elif len(img_gray.shape) == 3:
        height, width, channels = img_gray.shape
        print(f"(Color?) Height: {height}, Width: {width}, Channels: {channels}")
else:
    print(f"Error: Could not read image file at {image_path}")
Digital images store pixel values using a specific numerical data type, which determines the range of possible values for each pixel component. OpenCV commonly uses uint8 (unsigned 8-bit integer) by default.
# Make sure the image loaded before proceeding
if img is not None:
    # Get the data type of the image array
    data_type = img.dtype
    print(f"Image Data Type: {data_type}")
The output will likely be uint8. This means each color component (B, G, R) of each pixel is represented by an integer between 0 and 255, inclusive. Understanding the data type matters when you manipulate images: with uint8, plain NumPy arithmetic wraps around modulo 256, while OpenCV's arithmetic functions saturate at the limits.
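Here's a minimal sketch of that difference, using small hand-made arrays rather than the loaded image:

import cv2
import numpy as np

a = np.array([[250]], dtype=np.uint8)
b = np.array([[10]], dtype=np.uint8)

print(a + b)          # NumPy arithmetic wraps modulo 256: [[4]]
print(cv2.add(a, b))  # OpenCV's add saturates at 255: [[255]]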
We can access the value(s) of a specific pixel using array indexing. Remember that images use a coordinate system where the origin (0, 0) is at the top-left corner. The first index represents the row (y-coordinate, or height), and the second index represents the column (x-coordinate, or width).
# Define coordinates (y, x) - remember y is row, x is column
px_y, px_x = 50, 100  # Example coordinates

# Make sure the image loaded before proceeding
if img is not None:
    # Check if coordinates are within image bounds
    height, width, _ = img.shape
    if 0 <= px_y < height and 0 <= px_x < width:
        # Access the pixel value at (y, x)
        pixel_value = img[px_y, px_x]
        print(f"Pixel value at ({px_y}, {px_x}): {pixel_value}")

        # For a color image (BGR), pixel_value is an array [Blue, Green, Red]
        blue = pixel_value[0]
        green = pixel_value[1]
        red = pixel_value[2]
        print(f"  Blue: {blue}, Green: {green}, Red: {red}")
    else:
        print(f"Coordinates ({px_y}, {px_x}) are out of image bounds ({height}x{width}).")

# Example for grayscale (if img_gray was loaded)
if 'img_gray' in locals() and img_gray is not None:
    height_g, width_g = img_gray.shape
    if 0 <= px_y < height_g and 0 <= px_x < width_g:
        pixel_intensity = img_gray[px_y, px_x]
        print(f"Grayscale intensity at ({px_y}, {px_x}): {pixel_intensity}")
    else:
        print(f"Grayscale coordinates ({px_y}, {px_x}) are out of bounds ({height_g}x{width_g}).")
For a color image, img[y, x] returns a small NumPy array containing the B, G, and R values for that pixel (remember that OpenCV uses BGR order by default, not RGB). For a grayscale image, img_gray[y, x] returns a single integer representing the intensity value.
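The BGR ordering matters whenever you pass images to libraries that expect RGB, such as matplotlib. A short sketch of the standard conversion (assuming img is the color image loaded earlier and the coordinates are in bounds):

# Reverse the channel order from BGR to RGB
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(img[50, 100], img_rgb[50, 100])  # Same pixel, channel order reversed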
Here's a complete script combining these steps:
import cv2
import numpy as np

# --- Configuration ---
image_path = 'example_image.jpg'  # CHANGE THIS to your image file
pixel_coordinate_y = 50   # CHANGE THIS coordinate if needed
pixel_coordinate_x = 100  # CHANGE THIS coordinate if needed

# --- Load Image ---
# Load in color
img_color = cv2.imread(image_path, 1)
# Load in grayscale
img_gray = cv2.imread(image_path, 0)

print(f"--- Properties for Color Image ({image_path}) ---")
if img_color is not None:
    # Dimensions
    height, width, channels = img_color.shape
    print(f"Dimensions (HxWxC): {height} x {width} x {channels}")
    print(f"Total Pixels: {height * width}")

    # Data Type
    print(f"Data Type: {img_color.dtype}")

    # Pixel Value
    if 0 <= pixel_coordinate_y < height and 0 <= pixel_coordinate_x < width:
        bgr_value = img_color[pixel_coordinate_y, pixel_coordinate_x]
        print(f"BGR Value at ({pixel_coordinate_y}, {pixel_coordinate_x}): {bgr_value}")
    else:
        print(f"Coordinates ({pixel_coordinate_y}, {pixel_coordinate_x}) are out of bounds.")
else:
    print("Error: Could not load color image.")

print(f"\n--- Properties for Grayscale Image ({image_path}) ---")
if img_gray is not None:
    # Dimensions
    height_g, width_g = img_gray.shape
    print(f"Dimensions (HxW): {height_g} x {width_g}")
    print(f"Total Pixels: {height_g * width_g}")

    # Data Type
    print(f"Data Type: {img_gray.dtype}")

    # Pixel Value
    if 0 <= pixel_coordinate_y < height_g and 0 <= pixel_coordinate_x < width_g:
        intensity = img_gray[pixel_coordinate_y, pixel_coordinate_x]
        print(f"Intensity Value at ({pixel_coordinate_y}, {pixel_coordinate_x}): {intensity}")
    else:
        print(f"Coordinates ({pixel_coordinate_y}, {pixel_coordinate_x}) are out of bounds.")
else:
    print("Error: Could not load grayscale image.")
Try This:

- Change the pixel_coordinate_y and pixel_coordinate_x values to inspect different parts of the image. Try coordinates near the corners (like 0, 0) and near the center.
- Load a different image file (for example, a .png) to see if the properties remain consistent.

This hands-on exploration gives you a concrete feel for how images are represented numerically, preparing you for the next chapter where we'll start manipulating these pixel values to perform basic image processing.