Standard transfer learning, involving fine-tuning a pre-trained model, typically assumes a reasonable amount of labeled data for the target task. However, in many practical scenarios, acquiring large labeled datasets is infeasible due to cost, time, or the inherent rarity of certain categories. Imagine needing to identify a newly discovered species of bird from only a handful of photographs or adapting a medical imaging system to recognize a rare condition with just a few patient examples. This is where Few-Shot Learning (FSL) becomes essential.
FSL tackles the problem of learning to recognize new classes given only a very small number of labeled examples, often just one or five per class. Formally, this is often framed as an N-way K-shot classification problem: the model is given K labeled examples (the "support set") for each of N new classes it hasn't seen during initial training, and its goal is to correctly classify new, unlabeled examples (the "query set") belonging to one of these N classes. When K=1, it's called one-shot learning.
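To make the setup concrete, the sketch below lays out the tensors for a single hypothetical 5-way 1-shot episode; the class count, shot count, query size, and image resolution are arbitrary illustrative choices, not values prescribed by any particular benchmark.

```python
import torch

# A single N-way K-shot episode: here N=5 classes, K=1 support image per class,
# and 15 query images per class to evaluate on (all numbers are illustrative).
N, K, Q = 5, 1, 15

# Support set: the N*K labeled images the model is allowed to "look at".
support_images = torch.randn(N * K, 3, 84, 84)          # e.g. 84x84 RGB crops
support_labels = torch.arange(N).repeat_interleave(K)    # [0, 1, 2, 3, 4]

# Query set: images from the same N classes, unlabeled at prediction time.
query_images = torch.randn(N * Q, 3, 84, 84)
query_labels = torch.arange(N).repeat_interleave(Q)      # used only to score accuracy

# The model's job: assign each query image to one of the N episode classes,
# using nothing but the N*K support examples.
print(support_images.shape, query_images.shape)
```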
Directly fine-tuning a large CNN like ResNet or EfficientNet with only K examples per class often leads to severe overfitting. The model's high capacity allows it to simply memorize the few support examples without learning generalizable features for the new classes. Therefore, specialized techniques are required. FSL methods generally fall into a few categories, often building upon strong feature representations learned via pre-training on a larger, related dataset (like ImageNet).
The core idea behind metric learning for FSL is to learn an embedding function that maps images into a feature space where images from the same class are close together, and images from different classes are far apart. Classification can then be performed by comparing the embedding of a query image to the embeddings of the support examples.
Prototypical Networks offer an intuitive and effective metric learning approach. During training and testing, they operate on "episodes" designed to mimic the few-shot scenario. Within each episode, the embeddings of each class's support examples are averaged into a single "prototype", and a query image is classified by a softmax over its (negative) distances to these prototypes.
The episodic training forces the embedding function fϕ to produce representations that generalize well to new classes, since in every episode it must form tight clusters around well-separated prototypes for classes it has not been explicitly trained on before.
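A minimal sketch of one Prototypical Networks episode might look like the following. It assumes a placeholder embedding network `embed` (any CNN that maps images to feature vectors) and uses squared Euclidean distance; the function name and shapes are illustrative rather than taken from a particular implementation.

```python
import torch
import torch.nn.functional as F

def prototypical_loss(embed, support_images, support_labels,
                      query_images, query_labels, n_way):
    """One Prototypical Networks episode: class prototypes are the mean
    support embeddings; queries are classified by distance to each prototype."""
    z_support = embed(support_images)                  # (N*K, D)
    z_query = embed(query_images)                      # (N*Q, D)

    # Prototype for each class = mean of its support embeddings.
    prototypes = torch.stack([
        z_support[support_labels == c].mean(dim=0) for c in range(n_way)
    ])                                                 # (N, D)

    # Squared Euclidean distance between every query and every prototype.
    dists = torch.cdist(z_query, prototypes) ** 2      # (N*Q, N)

    # Softmax over negative distances: closer prototype -> higher probability.
    log_probs = F.log_softmax(-dists, dim=1)
    loss = F.nll_loss(log_probs, query_labels)
    acc = (log_probs.argmax(dim=1) == query_labels).float().mean()
    return loss, acc
```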
Another foundational metric learning technique involves Siamese Networks. These networks process pairs of images using identical CNNs (sharing weights ϕ). The network outputs embeddings for both images, and a distance function (like Euclidean distance or cosine similarity) compares these embeddings. The network is trained to minimize the distance for pairs of images from the same class and maximize it for pairs from different classes, often using a contrastive loss or triplet loss function. For few-shot classification, a query image's embedding can be compared against the embeddings of all support images, and classification is typically done based on the nearest support example's class.
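The pairwise objective and the nearest-support classification step can be sketched as follows, again assuming a placeholder embedding network `embed`. The contrastive-loss formulation shown here is one common variant; a triplet loss could be substituted in the same spirit.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(embed, img_a, img_b, same_class, margin=1.0):
    """Pull same-class pairs together, push different-class pairs apart until
    they are at least `margin` apart. `same_class` is a float tensor of
    1.0 (same class) / 0.0 (different class) per pair."""
    z_a, z_b = embed(img_a), embed(img_b)          # shared weights: one network, two passes
    dist = F.pairwise_distance(z_a, z_b)           # Euclidean distance per pair
    pos = same_class * dist.pow(2)                            # same class: shrink distance
    neg = (1 - same_class) * F.relu(margin - dist).pow(2)     # different class: enforce margin
    return (pos + neg).mean()

def classify_query(embed, query_image, support_images, support_labels):
    """Nearest-support classification: label the query with the class of the
    closest support embedding."""
    z_q = embed(query_image.unsqueeze(0))          # (1, D)
    z_s = embed(support_images)                    # (N*K, D)
    nearest = torch.cdist(z_q, z_s).argmin(dim=1)  # index of the closest support example
    return support_labels[nearest]
```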
Instead of learning a fixed embedding space, optimization-based methods focus on learning an algorithm or model initialization that can quickly adapt to new tasks using only a few examples. This is often referred to as "learning to learn" or meta-learning.
MAML (Model-Agnostic Meta-Learning) is a popular and versatile meta-learning algorithm. Its goal is to find a set of initial model parameters θ such that adapting these parameters to a new task requires only a few gradient steps on the task's small support set, leading to good performance on that task's query set.
The process involves two optimization loops:

- Inner loop: starting from the shared initialization θ, the model takes a few gradient steps on a single task's support set, producing task-adapted parameters.
- Outer loop: the adapted parameters are evaluated on that task's query set, and the resulting loss is used to update the initialization θ itself, averaged over many sampled tasks.
This outer loop update involves differentiating through the inner loop's gradient update, often requiring second-order derivatives (though first-order approximations are common). MAML aims to find an initialization θ that is positioned strategically in the parameter space, making it highly sensitive and adaptable to various few-shot tasks drawn from the task distribution p(task).
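A compact second-order MAML sketch is shown below. It assumes a recent PyTorch with `torch.func.functional_call` available, a classification model, and a list of tasks given as (support_x, support_y, query_x, query_y) batches; the inner learning rate and step count are placeholder hyperparameters.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def maml_outer_step(model, meta_optimizer, tasks, inner_lr=0.01, inner_steps=1):
    """One outer-loop update: adapt a copy of the initialization on each task's
    support set, evaluate on the query set, and backpropagate the query losses
    into the shared initialization (second-order MAML)."""
    meta_optimizer.zero_grad()
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: a few gradient steps away from the shared initialization θ.
        params = dict(model.named_parameters())
        for _ in range(inner_steps):
            support_loss = F.cross_entropy(
                functional_call(model, params, (support_x,)), support_y)
            grads = torch.autograd.grad(
                support_loss, list(params.values()), create_graph=True)
            params = {name: p - inner_lr * g
                      for (name, p), g in zip(params.items(), grads)}

        # Outer loop: query loss of the adapted parameters, accumulated across tasks.
        query_loss = F.cross_entropy(
            functional_call(model, params, (query_x,)), query_y)
        meta_loss = meta_loss + query_loss

    (meta_loss / len(tasks)).backward()   # differentiates through the inner updates
    meta_optimizer.step()
```

Setting `create_graph=False` in the inner loop gives the cheaper first-order approximation mentioned above, at the cost of ignoring the second-order terms.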
Many FSL methods, particularly metric-learning approaches like Prototypical Networks, rely heavily on episodic training. This strategy directly simulates the few-shot problem during the training phase.
The episodic training process samples small N-way K-shot tasks (episodes) from a larger base dataset. The model is trained to perform well on the query examples within each episode, based only on the support examples provided for that episode.
In each training iteration:

1. Randomly sample N classes from the base dataset's class pool.
2. For each sampled class, draw K labeled examples to form the support set and a disjoint batch of examples to form the query set.
3. Use the support set to produce predictions for the query examples (for instance, via prototypes or a quick adaptation step).
4. Compute the loss on the query predictions and update the model's parameters.
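A minimal episode sampler following these steps might look like this, assuming the base dataset is available as `images_by_class`, a hypothetical dict mapping each class label to a tensor of that class's images (with at least K + Q images per class).

```python
import random

import torch

def sample_episode(images_by_class, n_way=5, k_shot=1, q_queries=15):
    """Draw one N-way K-shot episode from a larger labeled base dataset."""
    classes = random.sample(list(images_by_class.keys()), n_way)
    support_x, support_y, query_x, query_y = [], [], [], []
    for episode_label, cls in enumerate(classes):
        # Pick K support and Q query images for this class, without overlap.
        idx = torch.randperm(len(images_by_class[cls]))[: k_shot + q_queries]
        imgs = images_by_class[cls][idx]
        support_x.append(imgs[:k_shot])
        query_x.append(imgs[k_shot:])
        support_y += [episode_label] * k_shot    # labels are re-indexed 0..N-1 per episode
        query_y += [episode_label] * q_queries
    return (torch.cat(support_x), torch.tensor(support_y),
            torch.cat(query_x), torch.tensor(query_y))
```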
By repeatedly training on these diverse, randomly generated few-shot tasks, the model learns representations or adaptation strategies that are effective even for entirely new classes it encounters during meta-testing, provided they come from a similar distribution.
Few-shot learning is intrinsically linked to transfer learning. Most successful FSL approaches do not train the entire CNN from scratch using only the episodic procedure. Instead, they often use a CNN backbone pre-trained on a large dataset (like ImageNet) as a powerful feature extractor. The FSL technique (metric learning, MAML, etc.) is then applied primarily to the final layers or operates within the feature space produced by this pre-trained backbone. The pre-training provides a strong foundation of general visual features, which the FSL method then adapts or utilizes specifically for the task of discriminating between new classes based on few examples.
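As one illustrative way to combine the two, the sketch below freezes an ImageNet-pre-trained ResNet-18 from torchvision and performs nearest-prototype classification directly in its 512-dimensional feature space. This is a simple baseline recipe under those assumptions, not the only or necessarily the strongest choice.

```python
import torch
import torchvision

# Frozen, ImageNet-pre-trained backbone used purely as a feature extractor.
backbone = torchvision.models.resnet18(
    weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()   # drop the 1000-class head, keep the 512-d pooled features
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

@torch.no_grad()
def few_shot_predict(support_images, support_labels, query_images, n_way):
    """Nearest-prototype classification in the frozen pre-trained feature space.
    Inputs are assumed to be resized and normalized the same way as during
    the backbone's ImageNet pre-training."""
    z_s = backbone(support_images)    # (N*K, 512)
    z_q = backbone(query_images)      # (N*Q, 512)
    prototypes = torch.stack(
        [z_s[support_labels == c].mean(dim=0) for c in range(n_way)])
    return torch.cdist(z_q, prototypes).argmin(dim=1)  # predicted episode class per query
```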
Few-shot learning provides a powerful set of tools for adapting CNNs in data-scarce environments, extending the reach of computer vision beyond applications where massive labeled datasets are readily available. It represents a sophisticated form of model adaptation, pushing beyond standard fine-tuning to enable learning under significant data constraints.