Understanding Data Modalities: Text, Images, Audio
Multimodal Machine Learning: A Survey and Taxonomy, Tadas Baltrušaitis, Chaitanya Ahuja, and Louis-Philippe Morency, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 41 (IEEE), DOI: 10.1109/TPAMI.2018.2798607 - This survey provides an overview of multimodal machine learning, including definitions and categorization of different data modalities and their processing challenges.
Computer Vision: Algorithms and Applications, Richard Szeliski, 2022 (Springer) - The second edition of this textbook provides a broad introduction to computer vision, detailing the principles and algorithms for processing and understanding image data.