Fast R-CNN, Ross Girshick, 2015. Proceedings of the IEEE International Conference on Computer Vision (ICCV). DOI: 10.48550/arXiv.1504.08083 - This paper presents the Fast R-CNN architecture, which significantly improved the training and inference speed of R-CNN by sharing computations across proposals. RPNs were later developed to replace its external proposal generation step.
Rich feature hierarchies for accurate object detection and semantic segmentation, Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik, 2014. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE). DOI: 10.48550/arXiv.1311.2524 - The foundational work that established the two-stage object detection paradigm using convolutional neural networks and external region proposals. It provides essential context for the evolution leading to RPNs.