Now that we understand how an image is represented as a grid of pixels, each with numerical values for color, let's consider how we store these images as files on a computer. If you stored only the raw pixel values, files would become very large very quickly: a 12-megapixel RGB photo (4000 × 3000 pixels, one byte per channel) takes 36 million bytes before any metadata is added. This is where image file formats come in.
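The arithmetic behind raw image sizes can be sketched in a few lines. The function name `raw_image_bytes` and its parameters are illustrative choices, and the calculation assumes tightly packed pixels with no headers or row padding.

```python
def raw_image_bytes(width, height, channels=3, bits_per_channel=8):
    """Return the number of bytes needed to store every pixel value uncompressed."""
    bits = width * height * channels * bits_per_channel
    return (bits + 7) // 8  # round up to whole bytes


# A Full HD RGB frame and a 12-megapixel RGB photo, both at 8 bits per channel.
for name, (w, h) in {"1920x1080": (1920, 1080), "4000x3000": (4000, 3000)}.items():
    size = raw_image_bytes(w, h)
    print(f"{name}: {size:,} bytes ({size / 1_000_000:.1f} MB)")
```

Multiplying these numbers by the thousands of images in a typical dataset shows why compression matters.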
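Image file formats define a standard way to organize and store the pixel data, along with related information such as image dimensions and color space. A key difference between formats is how they handle compression, which reduces file size so that images are easier to store and to transmit over networks. There are two main types of compression relevant here:
Lossless compression: the file is made smaller without discarding anything, so the original pixel values can be reconstructed exactly.
Lossy compression: some image information is permanently discarded to achieve smaller files, so the decoded image only approximates the original.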
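The difference between lossless and lossy compression can be illustrated with Python's standard `zlib` module, which implements DEFLATE, the algorithm PNG builds on. This is only a sketch: the synthetic gradient and the "keep the top 4 bits" step are assumptions made for illustration, and real lossy formats discard information in far more sophisticated ways.

```python
import zlib

# A synthetic 64x64 grayscale gradient stored as raw bytes (one byte per pixel).
width, height = 64, 64
raw = bytes((x * 4 + y) % 256 for y in range(height) for x in range(width))

# Lossless: DEFLATE shrinks the data, and decompressing restores it exactly.
packed = zlib.compress(raw, 9)
restored = zlib.decompress(packed)
print("raw:", len(raw), "bytes, lossless:", len(packed), "bytes")
print("exact round trip:", restored == raw)

# Lossy (toy version): keep only the top 4 bits of each pixel before compressing.
# The discarded low bits cannot be recovered afterwards.
coarse = bytes(value & 0xF0 for value in raw)
packed_coarse = zlib.compress(coarse, 9)
print("after discarding detail:", len(packed_coarse), "bytes")
print("matches original:", coarse == raw)
```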
Let's look at some of the most common file formats you'll encounter:
JPEG is arguably the most popular format for photographic images. Its primary strength is its ability to achieve very high compression ratios, resulting in small file sizes ideal for web pages and sharing. This is achieved through lossy compression. When saving a JPEG, you usually specify a "quality" setting (often a number from 0 to 100). A lower quality setting compresses more aggressively and produces a smaller file, but it discards more information, which can lead to visible distortions called "artifacts," especially around sharp edges or text. A higher quality setting preserves more detail but results in a larger file. The standard JPEG format does not support transparency.
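The link between the quality setting and lost detail can be sketched with a simplified, one-dimensional version of the idea behind JPEG: transform the pixels with a discrete cosine transform (DCT), then round the coefficients to multiples of a step size. Here a larger `step` plays the role of a lower quality setting. This is not a JPEG encoder; real encoders work on 8×8 two-dimensional blocks, use per-frequency quantization tables, and add entropy coding. The pixel values and function names are hypothetical.

```python
import math

N = 8  # JPEG works on 8x8 blocks; this sketch uses a single 8-sample row


def dct(samples):
    """Orthonormal DCT-II of a list of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt((1 if k == 0 else 2) / N)
        out.append(scale * sum(s * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                               for n, s in enumerate(samples)))
    return out


def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    out = []
    for n in range(N):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / N) * c
                       * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                       for k, c in enumerate(coeffs)))
    return out


def lossy_roundtrip(samples, step):
    """Quantize DCT coefficients to multiples of `step`, then reconstruct."""
    quantized = [round(c / step) for c in dct(samples)]
    restored = idct([q * step for q in quantized])
    rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(samples, restored)) / N)
    nonzero = sum(1 for q in quantized if q != 0)
    return nonzero, rms


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel intensities
for step in (1, 10, 40):
    nonzero, rms = lossy_roundtrip(row, step)
    print(f"step={step:>2}: {nonzero} nonzero coefficients, RMS error {rms:.2f}")
```

Coefficients that round to zero cost almost nothing to store, which is where the size savings come from; the reconstruction error they leave behind is what shows up as artifacts.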
PNG was developed as a more capable, patent-free alternative to older formats like GIF. It uses lossless compression, meaning no image data is lost when the file is saved. This makes it perfect for graphics where sharp lines, text clarity, and precise colors are important. Unlike JPEG, PNG supports an alpha channel, allowing for varying levels of transparency. This is why logos or icons often use PNG, so they can be overlaid cleanly onto different backgrounds. The trade-off is that for complex photographic images, PNG files are usually significantly larger than equivalent-quality JPEGs.
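To see what an alpha channel buys you, here is a minimal sketch of how a single RGBA pixel is blended over an opaque background. It assumes 8-bit channels and straight (non-premultiplied) alpha; the function name `composite_over` is illustrative.

```python
def composite_over(foreground_rgba, background_rgb):
    """Blend one RGBA pixel over an opaque RGB pixel (8-bit channels)."""
    r, g, b, alpha = foreground_rgba
    blended = []
    for fg, bg in zip((r, g, b), background_rgb):
        # Weighted average with rounding, using integer arithmetic only.
        blended.append((fg * alpha + bg * (255 - alpha) + 127) // 255)
    return tuple(blended)


blue = (0, 0, 255)
print(composite_over((255, 0, 0, 255), blue))  # fully opaque red
print(composite_over((255, 0, 0, 128), blue))  # roughly half-transparent red
print(composite_over((255, 0, 0, 0), blue))    # fully transparent red
```

Intermediate alpha values are what let the soft, anti-aliased edge of a logo blend smoothly into whatever background it is placed on.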
GIF is an older format known primarily for its ability to store short, looping animations. It uses lossless compression, but with a major limitation: it supports a maximum of 256 distinct colors per frame. This makes it unsuitable for photographs, which typically contain thousands or millions of colors; reducing them to a GIF palette often produces banding or a blotchy appearance. GIF supports only basic transparency: a pixel is either fully transparent or fully opaque, with no partial transparency as in PNG. While GIF is still used for animations, PNG is generally preferred for static images with limited colors because of its better compression and richer features.
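Both GIF limitations are easy to express in code. The sketch below checks whether a frame's pixels fit within a 256-entry palette and shows how partial transparency collapses to on/off. The function names and the `threshold` value are assumptions made for illustration, not part of the GIF format.

```python
def fits_gif_palette(pixels, max_colors=256):
    """Return True if the frame uses no more distinct colors than a GIF palette holds."""
    return len(set(pixels)) <= max_colors


def to_binary_alpha(alpha, threshold=128):
    """Collapse an 8-bit alpha value to GIF-style on/off transparency."""
    return 255 if alpha >= threshold else 0


icon = [(255, 0, 0), (0, 255, 0)] * 10                      # 2 distinct colors
gradient = [(r, g, 0) for r in range(32) for g in range(32)]  # 1024 distinct colors

print(fits_gif_palette(icon))      # a flat-color icon fits
print(fits_gif_palette(gradient))  # a smooth gradient does not
print([to_binary_alpha(a) for a in (0, 40, 128, 200, 255)])
```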
BMP is a simple, older format commonly associated with Windows. It typically stores pixel data directly with little to no compression. This results in very large file sizes compared to formats like JPEG or PNG. Because it's uncompressed (or uses very basic lossless compression), it preserves the exact original image data. However, due to the large file sizes and lack of advanced features, it's not commonly used for web distribution or general image sharing today.
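Because uncompressed BMP stores pixels almost directly, a basic 24-bit file can be assembled by hand with the `struct` module. The sketch below assumes the common layout of a 14-byte file header followed by a 40-byte info header, with rows stored bottom-up in blue-green-red order and padded to a multiple of 4 bytes. The function name and the 2835 pixels-per-meter resolution value are illustrative.

```python
import struct


def encode_bmp_24bit(width, height, pixels):
    """Encode rows of (r, g, b) tuples (top to bottom) as an uncompressed 24-bit BMP."""
    row_size = (width * 3 + 3) // 4 * 4           # each row is padded to 4 bytes
    padding = b"\x00" * (row_size - width * 3)
    image_size = row_size * height
    header_size = 14 + 40                          # file header + info header

    file_header = struct.pack("<2sIHHI", b"BM", header_size + image_size,
                              0, 0, header_size)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height,
                              1, 24,               # 1 color plane, 24 bits per pixel
                              0, image_size,       # compression 0 = none
                              2835, 2835, 0, 0)

    body = bytearray()
    for row in reversed(pixels):                   # bottom row comes first
        for r, g, b in row:
            body += bytes((b, g, r))               # channels are stored as B, G, R
        body += padding
    return file_header + info_header + bytes(body)


# A tiny 5x2 image: a red row on top of a blue row.
tiny = [[(255, 0, 0)] * 5, [(0, 0, 255)] * 5]
data = encode_bmp_24bit(5, 2, tiny)
print(len(data), "bytes for", 5 * 2, "pixels")
```

The file size is essentially the raw pixel data plus a small fixed header, which is why BMP files grow so quickly with resolution.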
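Selecting the appropriate file format depends on the image content and your needs:
Photographs and other complex, natural images where small file size matters: JPEG.
Graphics with sharp lines, text, or precise colors, and anything that needs transparency: PNG.
Short, simple animations: GIF.
Simple uncompressed storage where file size is not a concern: BMP.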
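The trade-offs described for each format can be condensed into a rough rule of thumb. The helper below is a hypothetical sketch, not a definitive policy; its name, parameters, and the order of the checks are assumptions.

```python
def suggest_format(is_photo, needs_transparency=False, is_animated=False):
    """Suggest a file format from a few coarse properties of the image."""
    if is_animated:
        return "GIF"    # the only format covered here that stores animations
    if needs_transparency:
        return "PNG"    # alpha channel with partial transparency
    if is_photo:
        return "JPEG"   # small files for complex, natural images
    return "PNG"        # lossless, good for sharp lines and text


print(suggest_format(is_photo=True))
print(suggest_format(is_photo=False, needs_transparency=True))
print(suggest_format(is_photo=False, is_animated=True))
```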
Understanding these formats helps you appreciate the data you'll be loading in computer vision tasks. Different formats store pixel data differently, and the compression used can sometimes impact the information available for analysis. For most introductory work, you'll frequently load JPEG or PNG files, converting them into the standard numerical array representation we discussed earlier.
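Before decoding a file, it can help to confirm which format it actually is, since file extensions are sometimes wrong. Each of the formats above begins with a characteristic byte signature, and the sketch below checks for it. The function names are illustrative, and this only identifies the format; decoding the pixels into an array is still the job of an image library.

```python
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"BM": "BMP",  # only two bytes, so this is a weak check
}


def detect_format(header):
    """Return the format name matching the first bytes of a file, or None."""
    for signature, name in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return None


def detect_file_format(path):
    """Read just enough of the file at `path` to identify its format."""
    with open(path, "rb") as f:
        return detect_format(f.read(8))


print(detect_format(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
print(detect_format(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
print(detect_format(b"plain text"))
```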
© 2025 ApX Machine Learning