Anonymizing Health Data, Khaled El Emam and Lydia Aysegul Neşeoğlu, 2018 (Springer). DOI: 10.1007/978-3-319-94101-2 - This book provides comprehensive coverage of de-identification methods, including masking, generalization, and redaction, for protecting sensitive information in text and other data types.
Data Augmentation in NLP: A Survey, Sanjana Gupta, R. M. K. R. Namburi, Sai Nikhil, M. Ramakrishna, and P. S. R. N. Praveen, 2023, EAI Endorsed Transactions on AI and Robotics, Vol. 5 (European Alliance for Innovation). DOI: 10.4108/eetair.v5i2.27448 - A recent survey offering a broad overview of data augmentation techniques for Natural Language Processing, including those relevant to generating synthetic text variants.
Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), Erika McCallister, Tim Grance, and Karen Scarfone, 2010, NIST Special Publication 800-122 (National Institute of Standards and Technology (NIST)) - This NIST publication provides essential guidance on safeguarding PII, including principles and techniques for de-identification and masking to reduce privacy risks.