AIAAIC Alert #29
The weekly update on incidents and issues driven by AI, algorithms and automation.
Keep up with AIAAIC via our website | X/Twitter | LinkedIn
In the crosshairs
Image: Irene Suosalo
UK train stations secretly analyse travellers' emotions
In addition to valid concerns about the opacity of the project as a whole, why were eight major train stations allegedly attempting to predict the emotions of their customers and the general public?
Dream Machine AI video generator makes porn
Dream Machine can be easily tricked into generating explicit videos and nudity, violating the terms of service of its creator and raising questions about its safety.
50 Melbourne school girls targeted using AI nude images
The latest in a spate of AI porn attacks at schools. The mother of one of the targeted students revealed her 16-year-old daughter vomited after seeing the "incredibly graphic" and "mutilated" images online.
US college student Taylor Klein's face is deepfaked onto porn
Check out ‘Another Body’ for a glimpse of the inadequacy of US law for porn deepfakes, and the impact it has on victims.
Study: Top chatbots spread Russian misinformation
Including ChatGPT, Gemini, Copilot and, of course, Grok.
Indian travel company ad uses professional model’s AI likeness without consent
Au revoir personality rights?
Meta attempts to train generative AI on user content
Resulting in an eleven-country legal complaint and a hasty retreat, with Meta claiming the objection to be ‘a step backward for European innovation’.
Visit the AIAAIC Repository for details of these and 1,500+ other AI, algorithmic and automation-driven incidents and controversies
Report an incident or controversy.
Support our work collecting and examining incidents and issues driven by AI, algorithms and automation, and making the case for technology transparency and openness.
System spotlight - Books3 AI training dataset
Image: Shawn Presser
Books3 is a dataset containing 196,640 books in text format by authors including Stephen King, Margaret Atwood, and Zadie Smith. Used to train language models, Books3 was developed in 2020 by open source advocate Shawn Presser and is hosted on The Eye, a website 'dedicated towards archiving and serving publicly available information.'
The yang: Books3 has been touted by open source advocates as democratising access to information and data, thereby levelling the playing field for smaller companies and researchers to take on big technology.
The yin: Books3 is seen to pose significant risks and harms due to its alleged inclusion of pirated and copyrighted books without authorisation from authors or publishers. The dataset was removed from The Pile after a legal complaint by anti-piracy group the Rights Alliance, but remains widely available on the internet.
Incidents associated with Books3:
OpenAI deleted training datasets believed to contain copyrighted books
17 authors sue OpenAI for 'systematic mass-scale copyright infringement'
Mike Huckabee books used to train language models without consent
Books3 dataset shut down after legal notice from Danish anti-piracy group