Fréchet Inception Distance (FID) is a widely used metric for assessing the quality and diversity of synthetic images. In this exercise, we calculate the FID score between a set of real images and a set of synthetically generated images using Python.

FID measures the similarity between two image distributions by comparing the statistics (mean and covariance) of features extracted by a pre-trained neural network, typically the Inception v3 model. A lower FID score indicates that the distribution of synthetic images is closer to the distribution of real images, suggesting better quality and diversity.

## Prerequisites and Setup

Before we begin, ensure you have the necessary libraries installed. We'll use TensorFlow for loading the Inception v3 model and computing activations, NumPy for numerical operations, SciPy for the matrix square root in the FID formula, and Pillow for reading image files.

```bash
pip install tensorflow numpy scipy Pillow
```

You will also need two sets of images:

- A directory containing your real images.
- A directory containing the synthetic images generated by your model.

For this example, let's assume you have these images stored in `path/to/real/images` and `path/to/synthetic/images`, respectively.

## Image Loading and Preprocessing

The Inception v3 model expects input images of a specific size (typically 299x299 pixels) and preprocessed in a particular way (pixel values scaled to the range [-1, 1]).
We need to ensure both our real and synthetic images conform to these requirements.

```python
import os
import warnings

import numpy as np
import tensorflow as tf

# Suppress specific TensorFlow warnings for cleaner output
warnings.filterwarnings("ignore", category=FutureWarning)
tf.get_logger().setLevel('ERROR')

# Define image size expected by Inception v3
IMAGE_SIZE = (299, 299)


def preprocess_image(image_path):
    """Loads and preprocesses an image for Inception v3."""
    try:
        # load_img relies on Pillow, resizes to IMAGE_SIZE, and returns RGB by default
        img = tf.keras.preprocessing.image.load_img(
            image_path, target_size=IMAGE_SIZE
        )
        img_array = tf.keras.preprocessing.image.img_to_array(img)
        # Scale pixel values to [-1, 1] as expected by InceptionV3
        img_array = tf.keras.applications.inception_v3.preprocess_input(img_array)
        return img_array
    except Exception as e:
        print(f"Warning: Skipping file {image_path} due to error: {e}")
        return None


def load_and_preprocess_images(dir_path, max_images=None):
    """Loads and preprocesses all images from a directory."""
    # Sort the listing so the selected subset is reproducible across runs
    image_paths = [
        os.path.join(dir_path, fname)
        for fname in sorted(os.listdir(dir_path))
        if fname.lower().endswith(('.png', '.jpg', '.jpeg'))
    ]
    if max_images is not None:
        image_paths = image_paths[:max_images]
        print(f"Processing a maximum of {max_images} images from {dir_path}")

    processed_images = []
    for path in image_paths:
        processed = preprocess_image(path)
        if processed is not None:
            processed_images.append(processed)

    if not processed_images:
        raise ValueError(f"No valid images found or processed in directory: {dir_path}")
    return np.array(processed_images)


# --- Placeholder paths: Replace with your actual directories ---
# It's recommended to use at least a few thousand images for stable FID.
# For demonstration, we might use fewer, but be aware results will vary.
PATH_REAL_IMAGES = 'path/to/real/images'
PATH_SYNTHETIC_IMAGES = 'path/to/synthetic/images'
MAX_IMAGES_PER_SET = 100  # Use a small number for quick demo; increase for real evaluation

print("Loading and preprocessing real images...")
if not os.path.isdir(PATH_REAL_IMAGES):
    print(f"Error: Real image directory not found: {PATH_REAL_IMAGES}")
    print("Please replace 'path/to/real/images' with the correct path.")
    real_images = None  # Later steps check for None and skip the FID calculation
else:
    real_images = load_and_preprocess_images(PATH_REAL_IMAGES, MAX_IMAGES_PER_SET)

print("Loading and preprocessing synthetic images...")
if not os.path.isdir(PATH_SYNTHETIC_IMAGES):
    print(f"Error: Synthetic image directory not found: {PATH_SYNTHETIC_IMAGES}")
    print("Please replace 'path/to/synthetic/images' with the correct path.")
    synthetic_images = None  # Later steps check for None and skip the FID calculation
else:
    synthetic_images = load_and_preprocess_images(PATH_SYNTHETIC_IMAGES, MAX_IMAGES_PER_SET)

# Check if images were loaded successfully before proceeding
if real_images is None or synthetic_images is None:
    # In a standalone script you might instead call sys.exit(1) after importing sys
    print("\nSkipping FID calculation due to missing image data. Please check paths and image files.")
else:
    print(f"Loaded {len(real_images)} real images and {len(synthetic_images)} synthetic images.")
    print("Preprocessing complete.")
```

Make sure to replace `path/to/real/images` and `path/to/synthetic/images` with the actual paths to your image datasets. The `MAX_IMAGES_PER_SET` variable keeps the demonstration quick; for reliable FID scores, you should use a larger number of images (often thousands).

## Calculating Activations with Inception V3

The next step is to feed these preprocessed images into the Inception v3 model (pre-trained on ImageNet) and extract activations from one of the deeper layers.
The layer typically used is the final pooling layer before the classification output, as its features capture high-level image characteristics.

```python
from scipy.linalg import sqrtm  # For matrix square root


def calculate_activations(images, model):
    """Calculates activations for a batch of images using the model."""
    if images is None or len(images) == 0:
        return np.array([])
    activations = model.predict(images)
    return activations


def calculate_fid(act1, act2):
    """Calculates the FID score between two sets of activations."""
    if act1.size == 0 or act2.size == 0:
        print("Warning: One or both activation sets are empty. Cannot calculate FID.")
        return float('inf')  # Or handle as an error

    # Calculate mean and covariance statistics
    mu1, sigma1 = act1.mean(axis=0), np.cov(act1, rowvar=False)
    mu2, sigma2 = act2.mean(axis=0), np.cov(act2, rowvar=False)

    # Numerical stability check for covariance matrices (done before using them)
    if not np.isfinite(sigma1).all() or not np.isfinite(sigma2).all():
        print("Warning: Non-finite values found in covariance matrices. FID might be unstable.")
        return float('inf')  # Indicate instability

    # Calculate sum squared difference between means
    ssdiff = np.sum((mu1 - mu2) ** 2.0)

    # Calculate sqrt of product of cov matrices
    covmean = sqrtm(sigma1.dot(sigma2))

    # If the square root failed numerically, retry with a small diagonal offset
    if not np.isfinite(covmean).all():
        eps = 1e-6
        offset = np.eye(sigma1.shape[0]) * eps
        covmean = sqrtm((sigma1 + offset).dot(sigma2 + offset))

    # Check and correct imaginary numbers from sqrtm
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    # Calculate score
    fid = ssdiff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

    # Clip negative values, which can appear due to numerical instability
    if fid < 0:
        fid = 0.0
    return float(fid)


# Proceed only if images were loaded successfully
if real_images is not None and synthetic_images is not None:
    # Load InceptionV3 model pre-trained on ImageNet, excluding the top classification layer
    # The output will be the features from the global average pooling layer
    print("Loading InceptionV3 model...")
    inception_model = tf.keras.applications.InceptionV3(
        include_top=False,
        pooling='avg',  # Global Average Pooling layer output
        input_shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3),
        weights='imagenet'
    )
    print("Model loaded.")

    print("Calculating activations for real images...")
    activations_real = calculate_activations(real_images, inception_model)
    print("Calculating activations for synthetic images...")
    activations_synthetic = calculate_activations(synthetic_images, inception_model)

    # np.cov needs at least two samples per set to estimate a covariance matrix
    if len(activations_real) >= 2 and len(activations_synthetic) >= 2:
        print("Calculating FID score...")
        fid_score = calculate_fid(activations_real, activations_synthetic)
        print(f"\nCalculated FID Score: {fid_score:.4f}")
    else:
        print("\nFID calculation skipped: each set needs at least two valid images.")
```

This code loads the Inception v3 model, calculates the activations for both image sets using the `predict` method, and then applies the FID formula using our `calculate_fid` function. We use `scipy.linalg.sqrtm` for the matrix square root computation, which is often the most numerically challenging part. We've included basic checks for empty or too-small activation sets and for potential numerical issues like complex numbers or non-finite values.

## Interpretation and Insights

The output `fid_score` represents the Fréchet Inception Distance.
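For reference, the quantity that `calculate_fid` evaluates can be written as follows. The symbols are chosen here for illustration: $\mu_r, \Sigma_r$ are the mean and covariance of the real-image activations (`mu1`, `sigma1` in the code) and $\mu_g, \Sigma_g$ are those of the generated-image activations (`mu2`, `sigma2`).

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
$$

The first term corresponds to `ssdiff`, and the trace term corresponds to `np.trace(sigma1 + sigma2 - 2.0 * covmean)`, where `covmean` is the matrix square root $(\Sigma_r \Sigma_g)^{1/2}$.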
Remember:

- **Lower is better:** A score closer to 0 indicates the distributions of real and synthetic image features are more similar.
- **Context matters:** FID scores are relative. Compare the FID of your generated images to baseline models or previous iterations of your generator. An FID of 50 might be good for a complex dataset where state-of-the-art is 40, but poor if typical scores are below 10.
- **Sample size:** FID is sensitive to the number of images used. Ensure you use a sufficiently large and representative sample from both the real and synthetic datasets (ideally thousands of images) for stable and meaningful results. Using too few images (like our `MAX_IMAGES_PER_SET=100` example) will lead to unreliable scores: with fewer images than the 2048 feature dimensions, the estimated covariance matrices are rank-deficient.
- **Computational cost:** Calculating activations for thousands of images can be time-consuming, especially without GPU acceleration.
- **Feature extractor:** While Inception v3 is standard, other feature extractors can be used, potentially leading to different FID scores. Use the same extractor consistently when comparing scores. Note that the Keras ImageNet weights loaded here differ from the Inception weights used in the original FID reference implementation, so scores from this script may not match published values exactly.

This practical exercise demonstrates the core steps involved in calculating FID. By applying this metric, you gain a quantitative measure of how well your generative model captures the visual characteristics of the real data distribution, which is a significant aspect of evaluating synthetic image quality. Remember to adapt the paths and consider the practical limitations when applying this to your own projects.
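To get a feel for the sample-size sensitivity mentioned above without downloading a model, here is a minimal sketch that applies the same formula to random Gaussian vectors standing in for Inception activations. The names `fid_from_features`, `sample_features`, and `DIM` are introduced for illustration only, and the 16-dimensional features are an assumption made to keep the run fast (the pooling features used earlier have 2048 dimensions). Both sets are drawn from the same distribution, so the ideal score is 0; any positive value reflects estimation error.

```python
import numpy as np
from scipy.linalg import sqrtm


def fid_from_features(feat_a, feat_b):
    """Apply the FID formula to two (n_samples, n_features) arrays."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    sigma_a = np.cov(feat_a, rowvar=False)
    sigma_b = np.cov(feat_b, rowvar=False)
    covmean = sqrtm(sigma_a @ sigma_b)
    if np.iscomplexobj(covmean):
        # Discard small imaginary parts caused by floating-point error
        covmean = covmean.real
    fid = np.sum((mu_a - mu_b) ** 2) + np.trace(sigma_a + sigma_b - 2.0 * covmean)
    return float(max(fid, 0.0))


rng = np.random.default_rng(0)
DIM = 16  # hypothetical feature size, far smaller than Inception's 2048


def sample_features(n):
    """Draw n stand-in feature vectors from a standard normal distribution."""
    return rng.standard_normal((n, DIM))


# Same underlying distribution, increasing sample sizes
scores = {}
for n in (50, 500, 5000):
    scores[n] = fid_from_features(sample_features(n), sample_features(n))
    print(f"n={n:5d}  FID estimate: {scores[n]:.4f}")

# A shifted distribution for contrast: every feature mean moved by 1.0
shifted_score = fid_from_features(sample_features(5000), sample_features(5000) + 1.0)
print(f"Shifted distribution FID estimate: {shifted_score:.4f}")
```

The estimates for identical distributions should shrink toward 0 as `n` grows, while the shifted set scores much higher because of the squared mean difference. The same upward bias at small `n` affects real Inception features, and more strongly given their higher dimensionality.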