Reddit says Microsoft’s Bing, Anthropic, and Perplexity have scraped its data without permission. “It has been a real pain in the ass to block these companies.”

  • Vipsu@lemmy.world
    3 months ago

    Well, Reddit should just sue these companies and see if they are actually breaking any laws. Holding a sizeable chunk of the internet hostage also sounds like something the EU and US might want to look into, as it very much sounds like anti-competitive conduct or market manipulation.

    Also, if these companies want greater ownership over the content generated by their users, they should also be much more liable for the content posted to their sites. I mean, when something like Section 230 was written, they probably did not take this into account. If these companies want to start selling user-generated content, then they should simply lose the immunity from liability.

    • Dr. Moose@lemmy.world
      3 months ago

      Reddit would lose badly, that’s why they don’t sue. The US 9th Circuit ruled that scraping LinkedIn is legal, and Bing is not even scraping but indexing the data. Easiest case ever.

      It’s almost impossible to block web scraping, especially by someone with Microsoft’s or Perplexity’s resources.

      It’s clearly an attempt to blackmail indexers into a license deal, as paying something to Reddit could actually be cheaper than battling anti-bot measures.

    • commie@lemmy.dbzer0.com
      3 months ago

      “they should also be much more liable for the content posted to their sites.”

      why do people insist on making me defend reddit.

    • mint_tamas@lemmy.world
      3 months ago

      While I don’t disagree with the general idea, removing Section 230 protection would introduce an uncontrollable risk into running any website with user-generated content and would essentially shut them down.

      • Passerby6497@lemmy.world
        3 months ago

        If the site isn’t selling data, they wouldn’t lose 230 protection. So that would only be a risk for the companies selling their users’ data, not your regular forum or something.

        • sugar_in_your_tea@sh.itjust.works
          3 months ago

          That gets really murky though. For example:

          • news sites w/ comment sections - they’re profiting from ads and subscriptions, so how much of that has to do with the comments?
          • ecommerce - reviews on Amazon and eBay could be considered advertising for the product. Who’s liable, the ecommerce site, the merchant, or the poster?
          • product websites - how much are posted “reviews” considered advertising for the product? There may not be direct sales on the website, but surely someone’s review would impact sales elsewhere
          • for-profit services with a discussion forum - these would be on a separate site from the revenue-generating service, but still associated with the brand and thus likely contributing to advertisements for the product

          It’s a lot more obvious for social media sites like Facebook since user-generated content is the service, but there are a lot of for-profit entities where user-generated content is highly relevant, but not the core service. Would those sites be essentially forced to either moderate or eliminate user interaction?

          There’s a lot of complexity here.