Copyright Implications for Genetic Sequencing Outputs and Identity-Driven Databases

I. Introduction

Genetic sequencing outputs and identity-driven databases involve:

Raw DNA/RNA/protein sequences.

Annotated variants, phenotypic correlations, and curated genomic profiles.

Identity-driven databases linking genetic data to individuals or populations.

Legal questions include:

Can raw genetic sequences be copyrighted?

Are curated genomic databases protected under copyright or database rights?

Can AI-assisted analyses or visualizations of sequences be protected?

What privacy and ethical obligations intersect with copyright?

II. Key Case Laws

1. Association for Molecular Pathology v. Myriad Genetics – Gene Patent / Copyright Analogy

Background

Myriad Genetics identified the location and sequence of the BRCA1 and BRCA2 genes and obtained patents on isolated DNA.

Holding

Naturally occurring DNA sequences are products of nature and are not patent-eligible merely because they have been isolated.

Synthetic cDNA is patent-eligible because it does not occur naturally.

Application

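Myriad is a patent case, but its product-of-nature reasoning parallels copyright's treatment of facts: raw DNA sequences are discovered, not authored → cannot be copyrighted.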

Database operators cannot claim copyright on unaltered genomic sequences.

Protection exists only for original annotations, arrangements, or curated analyses.

2. Feist Publications, Inc. v. Rural Telephone Service Co. – Original Selection/Arrangement

Background

Feist copied listings from Rural's telephone directory; Rural claimed copyright based on effort (“sweat of the brow”).

Holding

Facts are not copyrightable.

In a compilation of facts, copyright protects only the original selection or arrangement, which needs at least a minimal degree of creativity.

Application

Raw sequences are like facts.

Human curation, variant annotation, or functional grouping of sequences can be protected if it shows creativity.

3. British Horseracing Board Ltd v William Hill Organization Ltd – EU Database Rights

Background

The British Horseracing Board claimed database rights over racing data.

Holding

Investment in creating data does not confer EU database protection.

The sui generis right protects substantial investment in obtaining, verifying, or presenting existing data.

Application

Identity-driven databases that collect and validate genetic data may be protected under EU law if they invest substantial resources in data verification or organization.

Raw sequences alone, even if costly to generate, may not qualify.

4. Atlas v. Nixdorf Computer – Map and Data Compilation Protection

Background

Atlas created a digital map database; Nixdorf copied it.

Holding

Courts recognized copyright in creative selection and arrangement.

Application to Genetic Databases

A genomic database may be protected if it features creative arrangement, such as:

Grouping genes by functional pathways,

Annotating variants with phenotypic data,

Integrating multi-omic layers.
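The distinction between the factual layer and the curated layer can be made concrete with a small data-model sketch. This is only an illustration under stated assumptions: the class names `RawSequence` and `CuratedAnnotation`, the helper `group_by_pathway`, and all gene names, pathways, and notes are hypothetical placeholders, not real sequences or any actual database schema. Whether a given arrangement is creative enough to be protected remains a legal question the code cannot answer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RawSequence:
    """Factual layer: the sequence as reported by the sequencer."""
    gene: str
    bases: str


@dataclass(frozen=True)
class CuratedAnnotation:
    """Curated layer: choices made by a human curator."""
    gene: str
    pathway: str         # curator-chosen functional grouping
    phenotype_note: str  # curator-written annotation


def group_by_pathway(annotations):
    """Arrange genes under the pathways the curator assigned them to."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann.pathway].append(ann.gene)
    return {pathway: sorted(genes) for pathway, genes in grouped.items()}


# Hypothetical placeholder data, not real sequences or clinical findings.
raw = [
    RawSequence("GENE_A", "ATGC"),
    RawSequence("GENE_B", "GGCA"),
    RawSequence("GENE_C", "TTAG"),
]
curated = [
    CuratedAnnotation("GENE_A", "dna_repair", "example note"),
    CuratedAnnotation("GENE_C", "dna_repair", "example note"),
    CuratedAnnotation("GENE_B", "cell_cycle", "example note"),
]

# The arrangement is derived from the curator's choices alone;
# the raw sequences are left untouched.
arrangement = group_by_pathway(curated)
print(arrangement)
```

Keeping the two layers in separate structures makes it easier to release the raw layer openly while licensing the curated layer on different terms.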

5. Baker v. Selden – Idea-Expression Dichotomy

Background

Baker copied bookkeeping methods from Selden’s book.

Holding

Copyright protects expression, not methods or systems.

Application

Computational pipelines, sequence alignment algorithms, and analytical workflows are methods → not copyrightable as such.

The source code that implements them, and the specific curated output, presentation, or report, may be protected as expression.

6. Oracle America, Inc. v. Google LLC – APIs and Fair Use

Background

Google copied Java APIs for Android; Oracle sued.

Holding

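The Federal Circuit held the Java API declarations copyrightable; the Supreme Court (2021) assumed copyrightability without deciding it.

The Supreme Court held that Google's copying of the declaring code was fair use, stressing that Google took only what programmers needed to apply their existing skills on a new, transformative platform.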

Application

Software used to analyze genetic sequences (visualization, annotation, or reporting) may be protected.

Use of APIs from bioinformatics platforms may qualify as fair use for research or interoperability purposes.

7. Authors Guild v. Google – Book Scanning & Fair Use (Analogy for Training Data)

Background

Google scanned millions of books to build a full-text search index that displays short snippets; authors sued.

Holding

The Second Circuit held that digitizing the books for search and displaying snippets was a transformative fair use.

Application

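The case did not address AI training, so applying it here is an argument by analogy. Training AI on proprietary genomic annotations or literature for analysis, storytelling, or predictive modeling is more likely to be fair use if:

The output is transformative,

No wholesale copying of protected expression appears in the output.

This matters most for identity-driven databases that integrate public and proprietary datasets.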

III. Key Legal Principles

Raw Genetic Data = Facts

Unmodified sequences cannot be copyrighted (Feist, with Myriad as a patent-law analogy).

Database Protection Depends on Creativity / Investment

US: Creative selection or arrangement required (Feist, Atlas).

EU: Sui generis right protects investment in obtaining, verifying, or presenting data (BHB v. William Hill).

Software and Pipelines

Algorithms and methods are functional → not copyrightable (Baker v. Selden).

Code implementing pipelines or visualization may be protected.

Derivative Work & Substantial Similarity

Using existing curated databases or annotations may require licensing.

Fair use is more likely where the use is transformative (Authors Guild v. Google), but courts weigh all four statutory factors.

Human Curation is Key

Selection, annotation, organization, and reporting of sequences give copyrightable expression.

IV. Practical Implications for Genetic Sequencing Databases

Raw sequences: Not protected by copyright; sharing is still constrained by privacy law and consent where data is identity-linked.

Curated datasets: Copyrightable if creatively organized or annotated.

Software tools: Code protected; workflows and algorithms are functional.

AI-assisted analysis: Protectable if human curation shapes outputs.

EU considerations: Investment in data verification and presentation may provide additional sui generis rights.

Privacy & Ethical Compliance: Identity-linked genetic data requires strict data protection; copyright must not override ethical obligations.

V. Summary Table

Aspect | Copyright Status | Key Cases | Notes
Raw DNA/RNA/protein sequences | Not copyrightable | Myriad, Feist | Considered facts
Curated databases | Protected if creative | Atlas, Feist | Creative selection, annotation, functional grouping
Identity-driven datasets | EU: protected if investment | BHB v. William Hill | US: protection only if creative
Software tools / pipelines | Code protected; methods not | Baker v. Selden, Oracle v. Google | Functional workflows not protected
AI-assisted analysis | Protected if human-directed | Authors Guild v. Google | Transformative, curated outputs

Genetic sequencing outputs and identity-driven databases sit on the line between fact and expression. Raw sequences are unprotected, but creative curation, annotation, and organization can receive copyright protection. AI-assisted outputs need human direction to qualify, and EU database rights add a separate layer of protection based on substantial investment in obtaining, verifying, or presenting data.
