CDI: Copyrighted Data Identification in Diffusion Models
Published in Conference on Computer Vision and Pattern Recognition (CVPR), 2025
We show that existing membership inference attacks are ineffective for large diffusion models. We propose CDI, a dataset inference approach that aggregates signals across many samples to reliably detect copyrighted training data with over 99% confidence.
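The idea of aggregating weak per-sample signals can be illustrated with a minimal sketch. It is not the paper's pipeline: the scores below are synthetic, the function name `aggregate_test` is hypothetical, and the one-sided two-sample test with a normal approximation is an assumption chosen for simplicity. It only shows why a signal too weak to classify any single sample can still become statistically decisive once many samples are pooled.

```python
import random
from statistics import NormalDist, mean, variance


def aggregate_test(suspect_scores, reference_scores):
    """One-sided two-sample test (normal approximation, unequal variances).

    Asks whether the suspect set scores higher on average than a
    reference set assumed to be unseen by the model.
    """
    n_s, n_r = len(suspect_scores), len(reference_scores)
    diff = mean(suspect_scores) - mean(reference_scores)
    std_err = (variance(suspect_scores) / n_s
               + variance(reference_scores) / n_r) ** 0.5
    z = diff / std_err
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Synthetic membership scores (assumption): a mean shift of 0.2 with unit
# standard deviation, far too small to separate individual samples.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
suspect = [rng.gauss(0.2, 1.0) for _ in range(2000)]

z, p = aggregate_test(suspect, reference)
print(f"z = {z:.2f}, p = {p:.2e}")
```

The standard error shrinks with the square root of the set size, so the test statistic grows as more samples from the suspect dataset are included, even though each individual score is nearly uninformative.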