Copyright Protection and Enforcement Mechanisms for AI-Managed Vietnamese Literary Databases
📌 1. Authors Guild, Inc. v. HathiTrust (U.S. Fair Use Decision)
Jurisdiction: United States (Second Circuit)
Year: 2012–2014
Key Legal Issue: Whether a large digital repository of scanned books infringed copyright by making them searchable and accessible.
Facts: The HathiTrust Digital Library used millions of scanned books from libraries. Authors claimed this “digitized database” infringed their exclusive rights by reproducing and distributing works without permission.
Court’s Reasoning:
The court found full-text search to be a transformative use, and found that providing access to users with print disabilities, though not transformative, was also fair use.
The decision emphasized that search and access to facts in texts can be non-infringing if the use doesn’t substitute for the original market.
Outcome:
The Second Circuit held that making books full-text searchable through a digital database was fair use.
The case predates modern AI systems, but it is one of the earliest decisions on how large-scale indexing and search interact with copyright, a core issue for AI-managed archives.
Relevance: Useful for discussing transformative use vs. direct reproduction in AI systems that index or curate literature.
📌 2. Hachette v. Internet Archive (Controlled Digital Lending)
Jurisdiction: United States (S.D.N.Y.; affirmed by the Second Circuit)
Year: 2023–2024
Key Legal Issue: Whether digitizing and lending full books from a scanned database infringed copyright.
Facts: The Internet Archive, a major online library, scanned copyrighted books and lent full digital copies through its controlled digital lending (CDL) system. Publishers (Hachette, Penguin Random House, etc.) sued for infringement.
Court’s Reasoning:
The court rejected the Archive’s argument that CDL qualified as fair use.
Lending full copyrighted works without permission was not “fair use” merely because the library held a physical copy of each book it lent.
Outcome:
Permanent injunction against CDL distribution of certain copyrighted books.
Decision reinforced that merely having a database of works does not immunize an operator from liability if full works are shared without permission.
Relevance: Highlights limits on digital archives and AI databases distributing complete works.
📌 3. China: Ultraman AI Copyright Case
Jurisdiction: China (Guangzhou Internet Court)
Year: 2024 (widely reported as China’s first ruling on the liability of a generative AI service provider)
Key Legal Issue: Whether AI generation of derivative images infringed copyright without permission.
Facts: An AI website allowed users to generate images based on prompts like “Ultraman Dyna”, producing outputs substantially similar to copyrighted characters. Rights-holder sued for reproduction and derivative infringement.
Court’s Reasoning:
The court found AI outputs could infringe when they are substantially similar to protected works.
The defendant’s website bore responsibility because it enabled users to generate infringing outputs and had not prevented them.
Outcome:
Infringement found; damages awarded (though modest).
Set a key precedent in China for AI service provider liability, especially where outputs infringe.
Relevance: Valuable for comparative analysis. It shows how another Asian jurisdiction is developing AI copyright enforcement, and it is analogous to disputes Vietnam’s courts might face.
📌 4. Meta Fair Use Rulings — Kadrey v. Meta Platforms & Related Cases
Jurisdiction: United States (N.D. Cal.)
Year: 2025
Key Legal Issue: Whether training a large language model (LLM) on copyrighted books without permission is infringement or fair use.
Facts: Multiple authors sued Meta, claiming its LLM (LLaMA) was trained on copyrighted books without licenses.
Court’s Reasoning:
Judge Chhabria granted summary judgment for Meta on fair use grounds.
Found that the plaintiffs had not presented meaningful evidence of market harm, or that the model reproduced substantial passages of their books.
Outcome:
Meta won the case, but the judge stressed that the decision does not universally legalize AI training on copyrighted works. The result turned on the evidence and arguments these plaintiffs presented.
Relevance: Shows how U.S. courts are beginning to apply fair use to AI training — a core defense mechanism for AI databases and literature archives.
📌 5. Anthropic Settlement with Authors (Largest AI Copyright Resolution)
Jurisdiction: United States (N.D. Cal.)
Year: 2025
Key Legal Issue: Settlement arising from claims that AI training used unauthorized copyrighted books.
Facts: Authors alleged Anthropic used hundreds of thousands of copyrighted books without permission to train its models.
Outcome:
The judge preliminarily approved a $1.5 billion settlement between Anthropic and the authors.
Authors may receive monetary compensation per book included in the training dataset.
Relevance: One of the largest monetary resolutions tied to AI’s use of copyrighted content — illustrates compensation and enforcement beyond fair use arguments.
📌 6. European and Japanese AI Copyright Litigation
Other disputes around the world, not yet fully adjudicated, are also shaping the law:
🇩🇪 Germany — GEMA v OpenAI
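A Munich court ruled that using song lyrics to train an AI model violated German copyright law, in a case brought by the collecting society GEMA. The ruling signals that AI operators must respect the rights administered by local collecting societies.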
🇯🇵 Japan — Media groups sue Perplexity
Nikkei and Asahi alleged an AI search engine stored and reused articles without permission, seeking damages and deletion of infringing content.
🔍 Key Legal Themes These Cases Illustrate
🧠 1. Transformative Use & Fair Use Doctrine
Courts often balance whether AI training or database operations transform works sufficiently not to compete with or replace the original. U.S. courts emphasize this in fair use analyses (Meta, Anthropic).
📚 2. Distribution vs. Search vs. Full Access
Making works searchable or indexable might be permissible (HathiTrust), but distributing full works without permission (Hachette v. Internet Archive) can be infringing.
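For an AI-managed archive, the distinction can be built into the search interface itself. The following is a minimal sketch, not a description of any real system's code: it assumes a work is stored as a mapping of page numbers to text, and the names `build_index`, `search`, and `pages` are hypothetical. The search function returns only page numbers and hit counts, never the page text, which mirrors the kind of limited output at issue in HathiTrust.

```python
from collections import defaultdict
import re


def build_index(pages):
    """Map each lowercased word to {page_number: occurrence_count}."""
    index = defaultdict(lambda: defaultdict(int))
    for page_number, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word][page_number] += 1
    return index


def search(index, term):
    """Return sorted (page_number, hit_count) pairs; never any page text."""
    hits = index.get(term.lower(), {})
    return sorted(hits.items())


# Hypothetical two-page work, for illustration only.
pages = {
    1: "The river runs past the old village.",
    2: "By the river, the village market opens at dawn by the river bank.",
}
index = build_index(pages)
print(search(index, "river"))  # [(1, 1), (2, 2)]
```

Whether such a design would qualify for any exception under a given country's law is a legal question the code cannot answer. The sketch only shows that "search without full access" is a concrete, enforceable interface choice.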
🤖 3. AI Output Liability
AI vendors can be responsible if their systems produce outputs substantially similar to protected works (China Ultraman case), stressing that liability may attach to service providers.
💵 4. Settlements & Monetization
Anthropic’s settlement signals that authors and rights holders can negotiate compensation outside or after litigation, shaping future enforcement mechanisms.
📜 5. International Divergence
Different jurisdictions (Germany vs. U.S. vs. China) apply different standards — global AI copyright enforcement isn’t uniform yet.
📌 Conclusion
These cases provide concrete judicial responses to the complexities of AI-trained databases, copyright, and enforcement. They show:
Courts are willing to apply traditional copyright doctrines (like fair use) to digital archives and AI training.
Unauthorized distribution of full works nonetheless remains actionable.
Liability can extend to AI service providers if their systems enable infringement.
Global differences in enforcement show the need for clear statutory frameworks, especially in emerging markets like Vietnam.
