Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, 2008 (Cambridge University Press) - A classic textbook covering fundamental concepts and metrics in information retrieval, including Precision, Recall, MAP, and NDCG, essential for understanding offline evaluation.
Evaluating Online Search Engines with Interleaving, Olivier Chapelle and Ya Zhang, 2009, Proceedings of the Second International Conference on Web Search and Data Mining (WSDM '09) (ACM), DOI: 10.1145/1458082.1458091 - A foundational paper introducing interleaving, an efficient and sensitive method for online evaluation of search engines by combining results from different systems.