Copyright in the Age of AI: Examining Ownership of AI-Generated Works

Jul 10, 2023

The rapid advancement of generative artificial intelligence (AI) has raised intriguing questions about the copyrightability of AI-generated output. As AI systems become increasingly capable of producing what we would commonly consider original creative works, it becomes essential to examine the copyright implications and ask ourselves who, if anyone, can claim copyright to the output of AI systems such as ChatGPT.

There are currently several lawsuits pending from copyright holders of some of the training data. Under the concept of “derivative works,” we will consider whether such authors could make a copyright claim to the output generated by the AI trained on their works. We also examine whether the AI developers could be considered copyright holders of the output of the model they built, or whether the individuals who selected the training data and performed the training of the AI could lay such a claim to the generated output. Lastly, we will consider arguments for and against whether the user, if different from the personas listed before, can be the author of a copyrighted work that the AI produced based on their prompt.

What does Copyright Protect and Enable?

Copyright law is a fundamental pillar of intellectual property law that seeks to achieve several key goals and protect the diverse interests of creators, innovators, and society as a whole. At its core, copyright law aims to strike a balance between fostering creativity and ensuring the fair and equitable treatment of creators, on the one hand, and the public’s access to creative works on the other.

By granting creators exclusive rights for a limited time, copyright law incentivizes the production of original works, encouraging innovation, artistic expression, and the advancement of knowledge. Copyright legislation thereby seeks to safeguard the economic interests of creators, allowing them to reap the rewards of their creative efforts by controlling the reproduction, distribution, and public display of their works. Others may generally only reproduce the work of an author with permission, e.g., by purchasing a licence to do so.

Simultaneously, copyright law also recognizes the broader societal interest in access to and the dissemination of creative works, promoting cultural enrichment, education, and the public’s right to enjoy and benefit from artistic and intellectual creations. Hence, copyright laws commonly make an exception to the prohibition on reproducing an author’s work for purposes such as private study, criticism, news reporting, and education.

Training Data Copyright Holders as Authors or Owners of ChatGPT’s Output

AI now has the ability to mimic the style of an author, painter, and even singer or composer. Trained on vast amounts of copyrighted work, ChatGPT could produce a work that closely resembles that of a human author and write, say, a sequel almost indistinguishable from Harry Potter. It does not seem unreasonable at all that people would buy such content, and given the incredible speed of output generation, no human author could ever compete with that. This poses a real threat to copyright holders, financially and reputationally, and it is in stark conflict with copyright’s intention to incentivise the creation of original works.

One possibility for protecting the interests of the copyright holders of works the LLM has been trained on could be to grant those authors copyright on the output. Full disclosure: this proposition is a real stretch given existing legal frameworks and the serious limitations regarding the explainability of how generative AI systems produce their output. The US Copyright Act’s concept of “derivative works” protects authors against the production of a work that is based on or derived from an existing, copyrighted work. Without the author’s permission, no one may produce such a derivative work. The US Copyright Office says: “where a copyrighted work is used without the permission of the copyright owner, copyright protection will not extend to any part of the work in which such material has been used unlawfully. The unauthorized adaptation of a work may constitute copyright infringement.” The last sentence here is important, though: the original author will not be granted copyright on the work that is based on theirs; rather, the person who produced the unauthorized derivative work will be found to have infringed the author’s copyright.

AI System as Copyright Holder

When we have no information regarding the genesis of the output, we will often not be able to tell whether a particular text was written by a human author or by ChatGPT. There is no denying that AI produces original content. Nevertheless, many copyright laws require, either explicitly or implicitly, a creative act by a human author in order to extend copyright protection to a work. This has a principled reason: as explained above, copyright law aims to provide an incentive to produce original works, and no such incentive could ever motivate an AI.

An argument before a court or copyright authority that ChatGPT should hold copyright on its output will therefore likely fail. For example, the US Copyright Office explicitly requires a human to have created the work. Similarly, in Canada, the Copyright Act requires the copyright holder to be “a citizen or subject of, or a person ordinarily resident in, a treaty country,” thus implying that the author must be a natural person. Hence, our second contestant must be disqualified as a potential copyright owner.

Furthermore, as we have seen above, if the output is based on copyrighted input and constitutes a derivative work, no copyright protection will be granted to any part of the new work. This argument will likely only matter if the output obviously mimics the style, or uses the characters, of an author who holds copyright to their work.

Developers as Copyright Owners

Considering their pivotal role in designing and building the AI system itself, we can argue that developers should be recognized as the authors or owners of the AI-generated output. Developers create the framework, algorithms, and training methods that enable the AI system’s creative abilities. Perhaps copyright protection should be granted to developers because they are responsible for the AI system’s capabilities and its ability to generate original works. Conversely, developers as such have no direct creative control over the AI system’s output, and their role is primarily technical, which limits their claim to copyright ownership.

LLM Users as Copyright Owners

An argument could be made that LLM users who purposefully and carefully curate the training data of an LLM to ensure that it learns only a particular style are comparable to other creators of copyrighted works who use their tools and skills when performing their craft. There may even be instances where someone trains an AI model exclusively on data they generated themselves; in that case, ownership or authorship under copyright law may be achievable. At the other end of the spectrum there is, of course, the more common user who is not engaged in selecting or generating the training data at all: they merely write a few lines of prompt, and ChatGPT provides an entire article. Everything in between is also possible. Since an LLM is trained on a vast and almost indiscriminate amount of data, but is also able to learn from a user’s input, it is conceivable that a user makes a very skillful and creative effort to tailor the prompt and then iterates on it, asking the LLM to change its output to their liking. In such a scenario, the user may well be able to meet the requirements for copyright on the output. Given this broad range of effort that users may exhibit, it is difficult to make a sweeping judgment on whether the AI user should be granted copyright on the output.

But then there is also the contractual aspect of copyright law. Looking at OpenAI’s Terms of Use, we can see that if there was any uncertainty as to whether OpenAI or the user of their models is entitled to the generated output, OpenAI assigns its rights, but also the obligations that come with them, to the user:

“Subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms. OpenAI may use Content to provide and maintain the Services, comply with applicable law, and enforce our policies. You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms.”

Conclusion

The question of whether the output of generative AI can be copyrighted remains complex and subject to ongoing debate. As AI systems continue to evolve and push the boundaries of creative expression, it is essential to carefully evaluate the perspectives of different personas involved in the creation process. Achieving a consensus on copyrightability will require legal frameworks that adapt to technological advancements, recognizing both the contributions of AI systems and the roles of human creators and developers. Striking the right balance will promote innovation, protect the rights of creators, and foster responsible AI development in a rapidly changing digital landscape.
