• zbyte64@awful.systems
    2 months ago

    I don’t understand how, when I say “agency” or “an aspect of the process”, anyone would think I’m talking about the volume of information rather than its quality.

    • Sam Clemente@allthingstech.social
      2 months ago

      @zbyte64 1) In no way is quality a part of that equation and 2) In what other contexts is quality ever a part of the equation? I mean, I can go look at some Monets and paint some shitty water lilies; is that somehow problematic?

      • zbyte64@awful.systems
        2 months ago

        I can go look at some Monets and paint some shitty water lilies, is that somehow problematic?

        If we’re using your paintings as training data for a Monet copy, then it could be.

        Are we even talking about AI if we’re saying data quality doesn’t matter?

        • Sam Clemente@allthingstech.social
          2 months ago

          @zbyte64 data quality, again, was out of the scope of what I was talking about originally

          Which, again, was that legal precedent would suggest that the *how* is largely irrelevant in copyright cases; they’re mostly focused on the *why* and the *scale of the operation*

          I’m not the one getting sued for copyright infringement by the NYT for using inspect element to delete content and read behind their paywall; OpenAI is

          • zbyte64@awful.systems
            2 months ago

            I was narrowly taking issue with the comparison to how humans learn; I really don’t care about copyright.

            • Sam Clemente@allthingstech.social
              2 months ago

              @zbyte64 Where am I wrong? The process is effectively the same: you get a set of training data (a textbook) and a set of validation data (a test) and voilà, I’m trained

              To learn how to draw an image of a thing, you look at the thing a lot (training data) and try sketching it out (validation data) until it’s right

              How the data is acquired is irrelevant: I can pirate the textbook or trespass to find a particular flower, and that doesn’t mean I’m learning differently than someone who paid for it