AIAAIC Alert #26
A monthly round-up of goings-on connected with AI, algorithmic, and automation transparency, openness, and accountability
#26 | May 31, 2024
Keep up with AIAAIC via our Website | X/Twitter | LinkedIn
Is fair use for machines fair?
Author: Alice Poorta, AIAAIC Contributor and former YouTube Strategy Manager
A common piece of advice given to any creative looking to improve their skills is to consume more content. Writers are told to read more, musicians to listen more, and chefs to taste more. We accept that humans wanting to create great works can usefully hone their craft through dedicated study of the work of others. Indeed, many successful creatives will proudly share names of their biggest inspirations without expectation they will share in the profits - Olivia Rodrigo and Taylor Swift aside.
However, in the field of generative AI, where consuming and learning from masses of data is the basis upon which the technology works, a fearsome debate is underway. Lawsuits and warnings abound alleging AI companies are feasting on the data of copyright owners to train their systems without consent.
Notable recent examples include: New York Times vs. OpenAI, Authors Guild vs. OpenAI, Authors Guild vs. Nvidia, Digital Publishers vs. OpenAI, News Publishers vs. OpenAI, Getty vs. Stability, and Sony's warning letter.
OpenAI argues it has a “fair use” exception to train its models on copyrighted works without permission because the use is “transformative”. Transformative use is a key test for fair use in US copyright law, roughly meaning the purpose, expression, or message of the original work is altered; examples include parody, criticism, news reporting, and education. Tools like ChatGPT are not intended to produce exact replicas of copyrighted works (though the New York Times lawsuit alleges this has occurred), but instead to ingest works as raw material and generate entirely different ones.
However, something feels instinctively different about AI systems learning from other people’s creative works. It may be the scale and ability to disrupt the original creative industries: as prolific as Ed Sheeran is, he is not capable of rapidly and cheaply producing the many millions of songs that Suno and other music generators can.
It may be that there is a lack of accountability and transparency from some AI companies in recording their sources of training data. In a much ridiculed interview, OpenAI CTO Mira Murati was unwilling or unable to reveal the sources of Sora’s training data. And in its lawsuit with the Authors Guild OpenAI revealed that it had deleted training datasets believed to contain copyrighted books, thereby preventing further investigation.
It may also be that there is a distinct human touch, added when learning from creative works, that truly transforms them. Reading alone does not make you a good writer; you also need to build skills through years of practice and bring your own life experiences, taste, emotions, and perspective.
It remains to be seen how these lawsuits will be resolved. But just as copyright protection is granted only to works with human authorship, perhaps it is time to consider whether fair use should apply only to human transformations.
Support our work collecting and examining incidents and issues driven by AI, algorithms and automation, and making the case for technology transparency and openness.
In the crosshairs
Image: Craig Doty II
Tesla operating in FSD attempts to drive into passing trains
And the Tesla’s dashcam footage proves it. The owner claimed that the incident was the second of its kind in six months to have occurred, and said he was seeking to take legal action against Tesla.
Google, Microsoft image searches list nonconsensual deepfake porn
Further degradation of search engines, courtesy of AI, made and/or enabled by their owners.
Google AI Overviews gives wrong and misleading answers
Ditto.
OpenAI accused of using Scarlett Johansson's voice without consent
Per Sam Altman: ‘Her’.
Voice actors sue AI start-up for “voice theft”
Now we have a better idea where ‘hyper-realistic’ voice generator Lovo got its real sounds from.
Sony warns AI companies not to misuse its data
700 AI developers and music streaming services are publicly warned by the Japanese giant not to use music by Beyoncé, Harry Styles, Adele and others to train their systems.
Eight newspapers sue OpenAI and Microsoft for copyright infringement
There will surely be many more media companies going down this path.
Singapore writers resist government plan to train AI using their work
By calling out the government’s lack of clarity on how their works would be protected from being used for purposes other than cultural representation. Propaganda, for example.
Hoodline’s use of AI to generate news sparks backlash
Bye bye, real journalism?
Slack forces users to opt out of training its AI models
Sneaking out a contentious new policy overnight makes an entirely foreseeable though still tasteless move taste even worse.
Visit the AIAAIC Repository for details of these and 1,500+ other AI, algorithmic and automation-driven incidents and controversies
Report an incident or controversy.
Research & advocacy citations, mentions
Benbouzid D. et al. Pragmatic auditing: a pilot-driven approach for auditing Machine Learning systems (pdf)
DeVrio A. et al. Building, Shifting, & Employing Power: A Taxonomy of Responses From Below to Algorithmic Harm (pdf)
Adalton Cicero Teim Junior et al. Utilização da inteligência artificial na medicina [Use of artificial intelligence in medicine]
Center for News, Technology & Innovation. Artificial Intelligence in Journalism
Consumers International. Our vision for Fair and Responsible AI for Consumers (pdf)
Explore more research reports citing or mentioning AIAAIC.
AIAAIC news
AIAAIC is pleased to welcome Franz Mornau as a volunteer. Franz has variously worked as a programmer, application designer, solutions architect, and product owner, mostly in the insurance space. Based in Berkeley, California, he recently led a team of data engineers at Olaplex Inc. and is currently exploring new opportunities.