Copyright Implications for Genetic Sequencing Outputs and Identity-Driven Databases
I. Introduction
Genetic sequencing outputs and identity-driven databases involve:
Raw DNA/RNA/protein sequences.
Annotated variants, phenotypic correlations, and curated genomic profiles.
Identity-driven databases linking genetic data to individuals or populations.
Legal questions include:
Can raw genetic sequences be copyrighted?
Are curated genomic databases protected under copyright or database rights?
Can AI-assisted analyses or visualizations of sequences be protected?
What privacy and ethical obligations intersect with copyright?
II. Key Case Laws
1. Association for Molecular Pathology v. Myriad Genetics – Gene Patent / Copyright Analogy
Background
Myriad Genetics discovered BRCA1 and BRCA2 gene sequences and obtained patents for isolated DNA.
Holding
A naturally occurring DNA segment is a product of nature and is not patent eligible merely because it has been isolated.
Synthetic cDNA is patent eligible because it does not occur naturally.
Application
Myriad is a patent decision, so it applies to copyright only by analogy: raw DNA sequences are treated as facts of nature → cannot be copyrighted.
Database operators cannot claim copyright in unaltered genomic sequences.
Protection is available only for original annotations, arrangements, or curated analyses.
2. Feist Publications, Inc. v. Rural Telephone Service Co. – Original Selection/Arrangement
Background
Feist copied listings from Rural’s telephone directory; Rural claimed copyright based on the effort of compiling them (“sweat of the brow”).
Holding
Facts are not copyrightable.
Copyright in a compilation protects only an original selection or arrangement, never the underlying facts.
Application
Raw sequences are like facts.
Human curation, variant annotation, or functional grouping of sequences can be protected if it shows creativity.
3. British Horseracing Board Ltd v William Hill Organization Ltd – EU Database Rights
Background
The British Horseracing Board claimed database rights over racing data.
Holding
Investment in creating the data itself does not count toward EU sui generis database protection.
The sui generis right requires substantial investment in obtaining, verifying, or presenting existing data.
Application
Identity-driven databases that collect and validate genetic data may be protected under EU law if they invest substantial resources in data verification or organization.
Raw sequences alone, even if costly to generate, may not qualify.
4. Atlas v. Nixdorf Computer – Map and Data Compilation Protection
Background
Atlas created a digital map database; Nixdorf copied it.
Holding
Courts recognized copyright in creative selection and arrangement.
Application to Genetic Databases
A genomic database may be protected if it features creative arrangement, such as:
Grouping genes by functional pathways,
Annotating variants with phenotypic data,
Integrating multi-omic layers.
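The distinction between factual sequences and a curator's arrangement can be mirrored in how a database is structured. The following is a minimal Python sketch, not a legal test: it assumes a hypothetical schema (`RawSequence`, `CuratedEntry`, `GenomicDatabase`, and all field names are illustrative) in which unmodified sequences are stored apart from the human-authored layer of pathway groupings and phenotype notes, so each layer can be licensed or exported separately. The accessions and sequences are placeholders, not real data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class RawSequence:
    """Unmodified sequence data, stored without any curator input."""
    accession: str   # hypothetical identifier
    residues: str


@dataclass
class CuratedEntry:
    """Curator-authored layer describing one raw sequence."""
    accession: str
    pathway: str          # curator's functional grouping
    phenotype_note: str   # curator's annotation
    curator: str          # records who made the selection


@dataclass
class GenomicDatabase:
    raw: Dict[str, RawSequence] = field(default_factory=dict)
    curated: List[CuratedEntry] = field(default_factory=list)

    def add_raw(self, seq: RawSequence) -> None:
        self.raw[seq.accession] = seq

    def annotate(self, entry: CuratedEntry) -> None:
        # Annotations must point at a stored sequence.
        if entry.accession not in self.raw:
            raise KeyError(f"unknown accession: {entry.accession}")
        self.curated.append(entry)

    def by_pathway(self) -> Dict[str, List[str]]:
        """Return the curator's arrangement: accessions grouped by pathway."""
        groups: Dict[str, List[str]] = {}
        for entry in self.curated:
            groups.setdefault(entry.pathway, []).append(entry.accession)
        return groups

    def export_raw_only(self) -> Dict[str, str]:
        """Return sequences with every curated field left out."""
        return {acc: seq.residues for acc, seq in self.raw.items()}


if __name__ == "__main__":
    db = GenomicDatabase()
    db.add_raw(RawSequence("SEQ-001", "ACGT"))  # placeholder data
    db.annotate(CuratedEntry("SEQ-001", "DNA repair", "placeholder note", "curator-a"))
    print(db.by_pathway())
    print(db.export_raw_only())
```

Keeping the two layers apart does not decide whether the curated layer is original enough to be protected; it only makes it easier to show which parts reflect human selection and to release the raw layer on different terms.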
5. Baker v. Selden – Idea-Expression Dichotomy
Background
Selden’s book described a bookkeeping system and included ruled forms for using it; Baker sold account books built on a similar system.
Holding
Copyright protects expression, not methods or systems.
Application
Computational pipelines, sequence alignment algorithms, and analytical workflows are functional → not copyrightable.
Only the specific curated output, presentation, or report may be protected.
6. Oracle America, Inc. v. Google LLC – APIs and Fair Use
Background
Google copied Java APIs for Android; Oracle sued.
Holding
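The Supreme Court (2021) assumed, without deciding, that the copied API declarations were copyrightable.
It held that Google’s copying of the declaring code, taken so programmers could use familiar calls on a new platform, was fair use.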
Application
Software used to analyze genetic sequences (visualization, annotation, or reporting) may be protected.
Use of APIs from bioinformatics platforms may qualify as fair use for research or interoperability purposes.
7. Authors Guild v. Google – Transformative Use (Analogy for Training Data)
Background
Google scanned books to build a searchable index that displays short snippets; authors sued.
Holding
Transformative use of copyrighted works can qualify as fair use.
Application
By analogy, AI trained on proprietary genomic annotations or literature may be usable for analysis, reporting, or predictive modeling if:
The use is transformative,
No wholesale copying of protected expression appears in the output.
This matters most for identity-driven databases that integrate public and proprietary datasets.
III. Key Legal Principles
Raw Genetic Data = Facts
Unmodified sequences cannot be copyrighted (Myriad + Feist logic).
Database Protection Depends on Creativity / Investment
US: Creative selection or arrangement required (Feist, Atlas).
EU: Sui generis right protects investment in obtaining, verifying, or presenting data (BHB v. William Hill).
Software and Pipelines
Algorithms and methods are functional → not copyrightable (Baker v. Selden).
Code implementing pipelines or visualization may be protected.
Derivative Work & Substantial Similarity
Using existing curated databases or annotations may require licensing.
Fair use may apply if the use is transformative (Authors Guild v. Google).
Human Curation is Key
Original selection, annotation, organization, and reporting of sequences are what can supply copyrightable expression.
IV. Practical Implications for Genetic Sequencing Databases
Raw sequences: Not protected by copyright; sharing may still be restricted by privacy law, consent terms, or contract.
Curated datasets: Copyrightable if creatively organized or annotated.
Software tools: Code protected; workflows and algorithms are functional.
AI-assisted analysis: Protectable if human curation shapes outputs.
EU considerations: Investment in data verification and presentation may provide additional sui generis rights.
Privacy & Ethical Compliance: Identity-linked genetic data requires strict data protection; copyright must not override ethical obligations.
V. Summary Table
| Aspect | Copyright Status | Key Cases | Notes |
|---|---|---|---|
| Raw DNA/RNA/protein sequences | Not copyrightable | Myriad, Feist | Considered facts |
| Curated databases | Protected if creative | Atlas, Feist | Creative selection, annotation, functional grouping |
| Identity-driven datasets | EU: sui generis right if substantial investment in obtaining, verifying, or presenting data | BHB v. William Hill | US: protection only if creative |
| Software tools / pipelines | Code protected; methods not | Baker v. Selden, Oracle v. Google | Functional workflows not protected |
| AI-assisted analysis | Protectable if human-directed | Authors Guild v. Google (fair use of inputs, by analogy) | Use of source material must be transformative; outputs need human curation |
Genetic sequencing outputs and identity-driven databases sit on the boundary between fact and expression. Raw sequences are not protected by copyright, but creative curation, annotation, and organization can be. AI-assisted workflows must be human-directed for their outputs to qualify, and EU database rights add a layer of protection based on substantial investment in obtaining, verifying, or presenting data.
