To help train its AI models, Meta (like others) has been using pirated versions of copyrighted books without the consent of authors or publishers. The company behind Facebook and Instagram faces an ongoing class-action lawsuit brought by authors including Richard Kadrey, Sarah Silverman, and Christopher Golden, one in which it has already scored a major (and surprising) victory: the Californian court concluded last year that using pirated books to train its Llama LLM did qualify as fair use.

You’d think this case would be as open-and-shut as it gets, but never underestimate an army of high-priced lawyers. Meta has now come up with the striking defense that uploading pirated books to strangers via BitTorrent qualifies as fair use. It further goes on to claim that this is double good, because it has helped establish the United States’ leading position in the AI field.

Meta further argues that every author involved in the class-action has admitted they are unaware of any Llama LLM output that directly reproduces content from their books. It says if the authors cannot provide evidence of such infringing output or damage to sales, then this lawsuit is not about protecting their books but arguing against the training process itself (which the court has ruled is fair use).

Judge Vince Chhabria now has to decide whether to allow this defense, a decision that will have consequences not only for this case but for many other AI lawsuits involving things like shadow libraries. The BitTorrent uploading and distribution claims are the last element to be settled in this particular lawsuit, which has been rumbling on for three years now.

  • Snot Flickerman@lemmy.blahaj.zone
    3 months ago
    1. Shorter and more reasonable copyright lengths would make this a moot point because then there would be sufficient literature in the public domain to pull from.

    2. These kinds of charges are what put the Pirate Bay admins in prison and caused Aaron Swartz to kill himself because of a threat of a lifetime in prison. The claim was that they did this either with the goal of profit or with actual profit, and that this was a serious crime. Neither TPB nor Swartz at that point in time had ever moved as much data as Meta has for these claims, nor did they ever have the profit or possibility of profit Meta aims to make from their AI offerings.

    3. Now Meta is claiming they’ve profited so hard you can’t possibly hold them accountable.

    It will be the biggest “fuck you” in history to anyone ever hit with civil charges for piracy in the early 2000s, let alone the TPB admins and Swartz, if they let this go. Which means they probably will because in America, apparently if you crime hard enough and big enough they stop putting you in prison and start patting you on the back and calling it good business sense.

    • Airfried@piefed.social
      3 months ago

      in America, apparently if you crime hard enough and big enough they stop putting you in prison and start patting you on the back and calling it good business sense.

      There’s a story about Alexander the Great capturing a pirate and scolding him for raiding villages along the coastline. Alexander asked if the pirate felt ashamed and wanted to beg for forgiveness. However, the pirate had something else to say. He said that Alexander was doing the same thing, but infinitely worse. The only difference was that Alexander called himself king and plundered entire lands while the pirate only raided small villages. The pirate reminded Alexander of the many lives he had destroyed in his conquest. So the pirate’s only crime was not to be the biggest baddie in the hood, so to speak.

      Alexander replied by stating that the title of king forced his hand and that he couldn’t just stop what he was doing. The pirate, on the other hand, was just an individual who could easily change course. And so Alexander set the pirate free, stating that he himself would start changing his own ways right there and then if the pirate made a fresh start first.

      I don’t know if there is any truth to this but it’s a fable often used to explain how legitimacy changes the perception people have of wrongdoing and heroism on a fundamental level. Alexander’s reply sounds like an excuse and I think that’s on purpose. The pirate outwitted him in the end by stating a basic truth.

  • ryathal@sh.itjust.works
    3 months ago

    Arguing that training models isn’t fair use is going to be a massive uphill battle; it’s basically reading the book but with a computer. It’s not actually a big deal to people, unless you hold the copyright to a ton of works and want to get a percentage of all the AI income these companies have made.

    Torrenting the books is likely absolutely copyright infringement, but that has relatively low payout compared to the money these companies are getting for their models. The training being fair use means that rights holders can’t try to take any money from the model’s use. The statutory limits for infringement even at per work levels aren’t significant compared to the legal cost of proving it happened.

    • OfCourseNot@fedia.io
      3 months ago

      There’s an argument to be made that it is, in fact, not ‘reading’. The training of the model could be considered a lossy compression of the data. And streaming movies in a lossy compression format is not fair use, is it?

      • Fatal@piefed.social
        3 months ago

        It’s not the storage of the information that matters as much as the presentation. Google’s search index stores a huge amount of copyrighted material, even losslessly. But they only present small snippets at a time, which is not considered copyright infringement. The question really is whether or not the information being presented by the models is in a format which is considered copyright infringement. So far, courts have not found that it is.

  • Iconoclast@feddit.uk
    3 months ago

    I’m getting the feeling that the average Lemming is a pro-piracy advocate only for as long as it’s them financially benefiting from it, but the script interestingly flips when a company they don’t like does the same thing.

    If money wasn’t an issue, there’d be no reason to pirate anything. It’s a financial decision. There’s no practical difference between earning fifty bucks and saving that much - in both cases you’re left with 50 more bucks to spend.

    • kossa@feddit.org
      3 months ago

      I feel you have it the wrong way around. The “average Lemming” is pissed, because private piracy is prosecuted and punished while Meta’s is not.

      I, for one, couldn’t care less whether Meta pirates the shit out of all the books if I am allowed to do the same ¯\_(ツ)_/¯

    • sonofearth@lemmy.world
      3 months ago

      A person downloading a pirated copy of a book w/o any DRM for their own leisure use on their own device is different from a multi-trillion-dollar corporation that is using those books to train an LLM to make AI Slop and make money from it w/o even crediting the authors for their work.

      • Iconoclast@feddit.uk
        3 months ago

        The difference is only in scale. Stealing is stealing independent of if it’s for personal use or not.

        • CoolCat@lemmy.world
          3 months ago

          Scale is not the only difference. The companies who do this end up making money with something trained on someone else’s work. If a regular Joe Shmoe pirates a book, they don’t earn anything with it.

          • Iconoclast@feddit.uk
            3 months ago

            they don’t earn anything with it.

            That’s not entirely true either. There’s no practical difference between saving 50 bucks and earning 50 bucks. In both cases you’re left with more money to spend. Piracy is equally a financial decision even if it’s just for personal use. You’re saving whatever amount it would’ve cost to buy that media.

        • sonofearth@lemmy.world
          3 months ago

          Nothing is being stolen here. It’s just an illegal copy. Copies are made for varying reasons and have different moral aspects.

          • Iconoclast@feddit.uk
            3 months ago

            I’m using theft as an example due to it being the closest equivalent. The point still stands: if it’s wrong for a company to do it at scale, then it’s wrong when an individual does it too.

  • Entertainmeonly@lemmy.blahaj.zone
    3 months ago

    By this logic I should be able to copy paste Moby Dick and change all instances of the name to Mopy Dick, and now its output no longer matches the input. I’m about to be the next Stefani King.

    • lmmarsano@group.lt
      3 months ago

      Moby Dick

      Public domain.

      You could also try understanding the law

      §107. Limitations on exclusive rights: Fair use

      Notwithstanding the provisions of sections 106 and 106A, the fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright. In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include-

      1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
      2. the nature of the copyrighted work;
      3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
      4. the effect of the use upon the potential market for or value of the copyrighted work.

      with particular attention to factors 1 (especially transformation) & 4.

      If that’s not for you, though, then you should definitely try that with a copyrighted work (Disney?) & report back on how that went.

      • ThomasWilliams@lemmy.world
        3 months ago

        Meta have paid the copyright fee but uploaded material from Anna’s Archive because it wasn’t financially feasible to scan in each book individually.

        Fair use is irrelevant.

        • ChunkMcHorkle@lemmy.world
          3 months ago

          Meta have paid the copyright fee

          Lol, no. “Copyright fees” are what you pay your government in order to register your copyright or keep your copyright registration active.

          Or to put it another way, copyright fees have fuck all to do with fair use.

          You’re trying to make it sound as though Meta obtained consent and paid authors for their own work when in fact, Meta obtained consent from no one, and paid nothing at all to anyone, in exchange for the use of their works.

          Even a light skim of the attached article would have told you that much. What do you think a copyright suit is about?

          “Meta have paid the copyright fee,” lol. That’s some r/ConfidentlyIncorrect shit right there. Why did you even bother?

        • lmmarsano@group.lt
          3 months ago

          Don’t need to: their lawyers understood the law & lawyered successfully so far.

            • lmmarsano@group.lt
              3 months ago

              Are you referring to yourself by claiming your ignorance somehow matches legal expertise? Cool ad hominem, by the way: fallacies (including strawman of the transformative use argument), blame-shifting when you can’t back claims with credible evidence, & self-indulgent vanity are the hallmarks of trolls. Way to out yourself, buddy. 😄