In an August 28th motion to dismiss a lawsuit filed by two individual authors (plaintiffs) in a US District Court in June, OpenAI (the defendant) contends that it isn’t liable for copyright infringement under the Digital Millennium Copyright Act. It also says that it has no direct financial interest in any infringement.
After opening its argument by saying that AI is still in its early days and has the potential to remedy “some of the world’s worst inequities,” OpenAI argued that the plaintiffs misinterpret the scope of copyright. OpenAI says that it isn’t profiting directly or engaging in plagiarism by training its ChatGPT model on copyrighted works.
Fair Use doctrine protects consumers. Can a machine invoke it?
One of OpenAI’s more interesting arguments is that its use of copyrighted material constitutes Fair Use. The motion says that “courts have accepted the use of copyrighted materials by innovators in transformative ways.”
The most familiar application of the Fair Use doctrine is the one that allows consumers to keep personal copies of music, video programming or software – as long as they don’t turn around and re-distribute any copies. This was decided in the 1980s by the famous “Betamax case.” The same decision also held that the manufacturers of recording devices could not be liable for ‘contributory infringement.’ So the doctrine can protect technology providers as well as consumers.
The Copyright Alliance says that “Fair use permits a party to use a copyrighted work without the copyright owner’s permission for purposes such as criticism, comment, news reporting, teaching, scholarship, or research.”
It goes on: “The first factor mostly focuses on whether the use is commercial or non-commercial and whether the use is transformative. If a use is commercial it is less likely to be fair use and if it is non-commercial it is more likely to be fair use.” ‘Transformative’ refers to whether the use creates something new; if it does, it is more likely to fall under Fair Use.
Others do it, why can’t we?
The motion also cites many prior cases in its Fair Use defense. Some of them have to do with technology solutions that use data for reference purposes, such as Google Search, Google Books, and iParadigms (developer of a plagiarism detection tool).
OpenAI says that its use of copyrighted material doesn’t rise to the level of “unfair competition, ‘negligence,’ (or) unjust enrichment.” It also says that ChatGPT’s output is not derivative work.
In OpenAI’s defense, the motion also quotes the US Constitution, which says that the purpose of copyright is “[t]o promote the Progress of Science and useful Arts” (Article I, § 8, cl. 8). Quoting the US Supreme Court, it adds that “[t]he more artistic protection is favored, the more technological innovation may be discouraged; the administration of copyright law is an exercise in managing the tradeoff.”
Next step
The matter will be heard at 2pm on December 7, 2023, at the US District Court in San Francisco, where several of the counts will be discussed. Further details are found in the court document listed under Further reading.
Why it matters
It is good that copyright law is being tested by cases like this.
Contrary to some popular beliefs, generative AI doesn’t provide its content as a result of “thinking.” Instead, it digests and indexes massive amounts of content to discern patterns, and then uses those patterns to generate a result.
Defendant OpenAI contends that its results do not constitute “derivative work,” while the plaintiffs claim that they do. “Derivative work” is an abstract concept, much like the concepts of “ideas” and “creativity.” OpenAI cited a case that challenged Google’s practice of discerning statistical patterns in syntax and word frequency, another form of abstraction, in which such patterns were determined not to be subject to copyright protection.
None of this displaces the fact that many copyrighted works have been found within generative AI data sets. Copyright owners and rights holders are finding success in using the equivalent of a “notification and takedown” process to challenge AI platforms that (knowingly or unknowingly) have protected works in their data sets.
Further reading
Defendants’ Notice of Motion to Dismiss, and Memorandum of Points and Authorities in Support of Motion to Dismiss. Paul Tremblay, an individual, and Mona Awad, an individual, v. OpenAI et al., defendants. Case No. 3:23-cv-03223. US District Court for the Northern District of California, San Francisco Division.
Denmark: Authors battle to stop AI engines from training with stolen data sets. Article. August 14, 2023. by Steven Hawley. Piracy Monitor
OpenAI trains its GPT model using pirated e-books, contends authors’ lawsuit. Article. June 30, 2023. by Steven Hawley. Piracy Monitor [ Note: This article also points to the original Complaint and to some third-party legal analysis ]