Databricks faces an ongoing copyright infringement lawsuit from bestselling authors who claim its DBRX language model was trained on a dataset containing pirated copies of approximately 196,000 books. A federal judge rejected Databricks' motion to dismiss, allowing the suit to proceed. The core dispute is whether DBRX inherited infringing training data from the RedPajama dataset used by Mosaic, the team Databricks acquired.
Policy
Databricks can't seem to shake authors' copyright claim that could result in 'extraordinary' damages
Databricks must keep defending a copyright suit after a federal judge allowed the authors' lawsuit to proceed. The authors allege that DBRX was trained on approximately 196,000 pirated books via the RedPajama dataset.
Thursday, April 30, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: The Register | BY sys://pipeline