• Sam Clemente@allthingstech.social
    15 days ago

    @zbyte64 Data quality, again, was outside the scope of what I was originally talking about.

    Which, again, was that legal precedent suggests the *how* is largely irrelevant in copyright cases; they mostly focus on the *why* and the *scale of the operation*.

    I’m not getting sued for copyright infringement by the NYT because I used inspect element to delete content and read behind their paywall; OpenAI is.

    • zbyte64@awful.systems
      15 days ago

      I was narrowly taking issue with the comparison to how humans learn; I really don’t care about copyright.

      • Sam Clemente@allthingstech.social
        14 days ago

        @zbyte64 Where am I wrong? The process is effectively the same: you get a set of training data (a textbook) and a set of validation data (a test) and voilà, I’m trained.

        To learn how to draw an image of a thing, you look at the thing a lot (training data) and try sketching it out (validation data) until it’s right.

        How the data is acquired is irrelevant: I can pirate the textbook or trespass to find a particular flower, but that doesn’t mean I’m learning differently than someone who paid for it.