Discrete-Time Processing of Speech Signals, John R. Deller, John H. L. Hansen, and John G. Proakis, 2000 (Wiley-IEEE Press) - This book offers a strong foundation in digital signal processing techniques specifically applied to speech, covering the underlying principles of the transforms and filtering used in MFCC computation.
A Scale for the Measurement of the Psychological Magnitude of Pitch, S. S. Stevens, J. Volkmann, and E. B. Newman, 1937, Journal of the Acoustical Society of America, Vol. 8 (The Acoustical Society of America), DOI: 10.1121/1.1915893 - This classic paper introduced the Mel scale, a perceptual pitch scale that is a fundamental component of Mel-frequency cepstral coefficients due to its alignment with human auditory perception.