High-Confidence Off-Policy Evaluation, P. S. Thomas, G. Theocharous, M. Ghavamzadeh, 2015. Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 29(1) (Association for the Advancement of Artificial Intelligence). DOI: 10.1609/aaai.v29i1.9541 - Addresses the major challenge of high variance in off-policy evaluation and proposes a method for providing confidence intervals for importance sampling estimators.