Personally identifiable information has been found in DataComp CommonPool, one of the largest open-source data sets used to train image generation models. Millions of images of passports, credit cards ...
A new tool, Data Provenance Explorer, lets users pick through the questionable provenance of many large data sets used for AI training. A new online tool allows users to identify, track and learn ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results