• BetaDoggo_@lemmy.world
    24 days ago

    How many times is this same article going to be written? Model collapse from synthetic data is not a concern at any scale when human data is in the mix. We now have entire series of models trained mostly on synthetic data: https://huggingface.co/docs/transformers/main/model_doc/phi3. When training only on unassisted model outputs, error does accumulate with each generation, but that isn't a concern in any real scenario.
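
The accumulation claim can be illustrated with a toy sketch. This is not how any real model is trained: the "model" here is just a Gaussian refit to its own samples each generation, and the function name `simulate` and every parameter value are assumptions made up for illustration. It only shows the qualitative effect of keeping some fixed human data in each training set.

```python
import random
import statistics


def simulate(generations, sample_size, human_fraction, seed=0):
    """Refit a Gaussian on its own samples, optionally mixed with fixed 'human' data.

    Returns the fitted standard deviation after each generation
    (index 0 is the fit on the human data alone).
    """
    rng = random.Random(seed)
    # Hypothetical stand-in for a fixed human-written corpus.
    human = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    mu, sigma = statistics.fmean(human), statistics.pstdev(human)
    history = [sigma]
    n_human = int(round(human_fraction * sample_size))
    for _ in range(generations):
        # Synthetic data: samples drawn from the previous generation's fit.
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - n_human)]
        # Training set: synthetic samples plus a random subset of the human data.
        train = synthetic + rng.sample(human, n_human)
        mu, sigma = statistics.fmean(train), statistics.pstdev(train)
        history.append(sigma)
    return history


pure = simulate(generations=500, sample_size=20, human_fraction=0.0)
mixed = simulate(generations=500, sample_size=20, human_fraction=0.5)
print("synthetic only, final std:", pure[-1])
print("half human data, final std:", mixed[-1])
```

With no human data the fitted spread shrinks toward zero over the generations, while the run that keeps human samples in every training set stays anchored to them. The small sample size is chosen to make the effect show up quickly.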

    • Something Burger 🍔@jlai.lu
      24 days ago

      As the number of articles about this exact subject increases, so does the likelihood of AI only being able to write about this very subject.