LLM
Legal and IP
Meta’s Llama 3.1 can recall 42 percent of the first Harry Potter book
6/16/2025 • understandingai.org

New research could have big implications for copyright lawsuits against generative AI.
C4AIL Commentary
For any language model, the probability of generating any given 50-token sequence “by accident” is vanishingly small. If a model generates 50 tokens from a copyrighted work, that is strong evidence the tokens “came from” the training data. This is true even if it only generates those tokens 10 percent, 1 percent, or 0.01 percent of the time.
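The arithmetic behind "vanishingly small" can be sketched in a few lines. The sketch assumes that a model assigns each token a conditional probability and that the probability of a whole sequence is the product of those values. The per-token probabilities below are hypothetical illustration values, not measurements from the research, and the function names are invented for this example.

```python
import math


def sequence_probability(token_probs):
    """Probability of a full sequence: the product of per-token probabilities.

    Summing logs instead of multiplying directly avoids losing precision
    when many small numbers are combined.
    """
    if any(p <= 0.0 for p in token_probs):
        return 0.0
    return math.exp(sum(math.log(p) for p in token_probs))


def uniform_sequence_probability(per_token_prob, length=50):
    """Simplified case: every token has the same probability (an assumption)."""
    return sequence_probability([per_token_prob] * length)


if __name__ == "__main__":
    # Hypothetical per-token probabilities, from "guessing" to "near certain".
    for p in (0.1, 0.5, 0.9, 0.99):
        prob = uniform_sequence_probability(p)
        print(f"per-token {p:>4}: 50-token sequence probability = {prob:.3e}")
```

Under these assumptions, a model that is only moderately confident about each token (say 0.5) reproduces a specific 50-token passage with probability around one in a quadrillion. A sequence-level probability of even 0.01 percent therefore implies that the model is far more confident about each token than chance would allow, which is why such output points back to the training data.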