Overcoming catastrophic forgetting in neural networks, James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell, 2017. Proceedings of the National Academy of Sciences, Vol. 114 (National Academy of Sciences). DOI: 10.1073/pnas.1611835114 - Introduces Elastic Weight Consolidation (EWC), a regularization method that mitigates catastrophic forgetting by identifying the parameters important to previous tasks and penalizing changes to them.
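The core of EWC is a quadratic penalty that anchors each parameter to its old-task value, weighted by the diagonal of the Fisher information. A minimal sketch (plain-Python lists stand in for network parameters; `fisher_diag` is assumed to have been estimated from gradients on the old task, and the function name is illustrative, not from the paper's code):

```python
def ewc_penalty(params, old_params, fisher_diag, lam=1.0):
    """L_EWC = (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2

    Parameters with high Fisher importance F_i are penalized more
    heavily for drifting from their old-task values theta*_i.
    """
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher_diag)
    )

# A large move on an important parameter costs as much as a
# much larger move on an unimportant one.
penalty = ewc_penalty(params=[1.0, 2.0], old_params=[0.9, 1.0],
                      fisher_diag=[10.0, 0.1])
```

In training, this penalty is simply added to the new task's loss, so gradient descent trades off new-task accuracy against movement of the consolidated weights.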
Learning without Forgetting, Zhizhong Li, Derek Hoiem, 2018. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 40 (IEEE). DOI: 10.1109/TPAMI.2017.2756813 - Proposes Learning without Forgetting (LwF), which uses knowledge distillation from the old model to preserve knowledge while learning new tasks.
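LwF's distillation term is a cross-entropy between temperature-softened outputs of the old model (recorded before training on the new task) and the current model. A minimal sketch, assuming per-example logit lists; the function names are illustrative:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the old model's "dark knowledge" about non-target classes.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(old_logits, new_logits, T=2.0):
    """Cross-entropy between softened old-model and current-model
    predictions; minimized when the new model matches the old one."""
    p_old = softmax(old_logits, T)
    p_new = softmax(new_logits, T)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))
```

During training this term is added to the ordinary classification loss on the new task, so the network learns new labels while keeping its outputs on old-task heads close to what they were.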
Parameter-Efficient Transfer Learning for NLP, Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin De Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly, 2019. Proceedings of the 36th International Conference on Machine Learning (ICML), Vol. 97 (PMLR). DOI: 10.48550/arXiv.1902.00751 - Introduces adapter modules, a parameter-efficient fine-tuning method that can serve as a form of parameter isolation to mitigate catastrophic forgetting.
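An adapter is a small bottleneck inserted into each transformer layer: down-project, nonlinearity, up-project, plus a residual connection, with only the adapter weights trained per task. A toy sketch with plain-Python lists (weight shapes and names are illustrative; real adapters operate on batched tensors):

```python
def adapter(hidden, w_down, w_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add.

    hidden: hidden vector of size d.
    w_down: list of m columns (each length d) projecting d -> m, m << d.
    w_up:   list of d columns (each length m) projecting m -> d.
    """
    down = [sum(h * w for h, w in zip(hidden, col)) for col in w_down]
    act = [max(0.0, d) for d in down]
    up = [sum(a * w for a, w in zip(act, col)) for col in w_up]
    # Residual connection: with w_up near zero (the usual init),
    # the adapter starts as an identity map.
    return [h + u for h, u in zip(hidden, up)]

# Zero-initialized up-projection leaves the input unchanged.
out = adapter([1.0, 2.0], w_down=[[0.5, 0.5]], w_up=[[0.0], [0.0]])
```

Because the frozen backbone is shared and each task gets its own small adapter, old tasks' behavior cannot be overwritten, which is why adapters are used for parameter isolation in continual learning.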