Summary

Argues that training AI on copyrighted works is learning, not copying, and therefore falls under fair use. Defends LLMs against copyright infringement claims.

Key Concepts:

The Confusion: Conflating two fundamentally different acts, copying and learning

Copyright’s Purpose:

  • Prevents unauthorized copying, reproduction, distribution
  • Incentivizes creation by protecting against exact/near-exact copies
  • Does NOT prohibit learning from protected materials

Learning Is Protected: Examples of legal learning:

  • Student internalizing textbook concepts
  • Critic analyzing film
  • Engineer studying patented inventions

All generate new, transformative work from protected materials.

AI Training as Learning:

  • LLMs statistically encode relationships and patterns
  • Extract generalized understandings, linguistic structures
  • Don’t store or reproduce verbatim; they “learn,” not “copy”
  • Feeding text to model ≈ student reading book, not photocopying
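
As a rough illustration of what "statistically encode relationships and patterns" can mean, here is a minimal toy sketch in Python. It is not how an LLM is built: real models use neural networks with learned weights, and the function names `train_bigram` and `next_word_probs` and the sample corpus are hypothetical, chosen only for this example. The sketch shows a model whose stored state is a table of word-to-word frequencies rather than the source text itself.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows another (hypothetical toy model)."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    # Pair each word with its successor and tally the transition.
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, word):
    """Turn the raw follow-counts for `word` into probabilities."""
    followers = counts[word]
    total = sum(followers.values())
    # An unseen word has no followers, so this yields an empty dict.
    return {w: c / total for w, c in followers.items()}

# Illustrative corpus (an assumption, not from the post).
corpus = "the cat sat on the mat the cat ate"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
```

After training, `model` holds only transition counts (for example, that "cat" followed "the" twice), and `probs` expresses them as relative frequencies. On a corpus this small the counts stay close to the original text; the sketch is only meant to make the phrase "statistical patterns" concrete.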

Legal Precedent:

  • Authors Guild v. Google affirms transformative uses (indexing, summarization, analysis) as fair use
  • AI training is highly transformative

The Misconception: People treat AI systems as storage devices rather than learners, but models are fundamentally abstract/transformative

Central Argument: Copyright protects against duplication, not learning or transformative innovation. AI embodies this distinction.

Note: The post includes a disclaimer that it was composed with LLM assistance, which serves as meta-commentary on the argument itself.

Tags

Cross-References

  • Related: Fair use doctrine
  • Related: Transformative use
  • Related: AI training methodology

Notes

  • Timely intervention in active AI copyright debates
  • Takes clear pro-AI, pro-innovation stance
  • Self-referential: written with LLM to defend LLM training
  • Published June 8, the day after the June 7 burst
  • Demonstrates application of legal/ethical analysis to concrete tech policy