A Unified Approach to Interpreting Model Predictions, Scott M. Lundberg and Su-In Lee, 2017. Advances in Neural Information Processing Systems, Vol. 30 (Curran Associates, Inc.). DOI: 10.5555/3295222.3295326 - Introduces SHAP values (SHapley Additive exPlanations) for consistent and locally accurate explanations of individual predictions.
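To illustrate the attribution scheme this entry describes, the sketch below computes exact Shapley values for a toy model by enumerating feature coalitions. It is a minimal illustration, not the paper's SHAP estimators: imputing absent features with a fixed baseline vector is a simplifying assumption standing in for the conditional expectations SHAP uses, and the model, inputs, and baseline are invented for the example.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for the prediction f(x) relative to a baseline.
    Absent features are imputed with the baseline value (a simplifying
    assumption; SHAP proper uses conditional expectations)."""
    n = len(x)

    def v(subset):
        # Value of a coalition: evaluate f with features outside the
        # coalition replaced by their baseline values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(set(S) | {i}) - v(set(S)))
        phi.append(total)
    return phi

# Toy linear model f(z) = w . z; for a linear model the Shapley value of
# feature i reduces to w[i] * (x[i] - baseline[i]).
w = [2.0, -1.0, 0.5]
f = lambda z: sum(wi * zi for wi, zi in zip(w, z))
x = [1.0, 3.0, 2.0]
baseline = [0.0, 1.0, 0.0]
phi = shapley_values(f, x, baseline)

# Local accuracy: the attributions sum to f(x) - f(baseline).
assert abs(sum(phi) - (f(x) - f(baseline))) < 1e-9
```

The exact computation is exponential in the number of features; the paper's contribution is, in part, efficient estimators of these same values.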
"Why Should I Trust You?": Explaining the Predictions of Any Classifier, Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin, 2016Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM)DOI: 10.1145/2939672.2939778 - Presents LIME (Local Interpretable Model-agnostic Explanations), a model-agnostic method for explaining individual predictions by approximating local model behavior.