Social media platforms like Twitter and Reddit are increasingly infested with bots and fake accounts, leading to significant manipulation of public discourse. These bots don’t just annoy users—they skew visibility through vote manipulation. Fake accounts and automated scripts systematically downvote posts opposing certain viewpoints, distorting the content that surfaces and amplifying specific agendas.
Before coming to Lemmy, I was systematically downvoted by bots on Reddit for completely normal comments that were relatively neutral and not controversial at all. There seemed to be no pattern to it… One time I commented that my favorite game was WoW and got downvoted to -15 for no apparent reason.
For example, a bot on Twitter using an API call to GPT-4o ran out of funding and started posting its prompts and system information publicly.
https://www.dailydot.com/debug/chatgpt-bot-x-russian-campaign-meme/
There are probably tens or hundreds of thousands of bots like these. Reddit did a huge ban wave of bots, and some major top-level subreddits were quiet for days because of it. Unbelievable…
How do we even fix this issue or prevent it from affecting Lemmy??
Isn’t there some code, or a magic incantation of prompt text, that we can deploy to get bots to reveal themselves? Even if it takes more than one response?
Create a bot that reports bot activity to the Lemmy developers.
You’re basically using bots to fight bots.
Love that name too. Rock 'Em Sock 'Em Robots.
While a good solution in principle, it could (and likely will) false flag accounts. Such a system should be a first line with a review as a second.
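To make the "first line, then review" idea concrete, here is a rough Python sketch. Everything in it (`ReviewQueue`, `min_reports`, the account names) is made up for illustration and is not anything Lemmy actually ships. The point is only that automated reports pile up in a queue and nothing gets banned until a human signs off.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ReviewQueue:
    """First line: automated reports. Second line: a human decision."""
    min_reports: int = 3  # hypothetical threshold before a human looks
    reports: dict = field(default_factory=lambda: defaultdict(set))
    banned: set = field(default_factory=set)

    def report(self, reporter: str, account: str) -> None:
        # Reporters are kept in a set, so one reporter cannot stack reports.
        self.reports[account].add(reporter)

    def pending_review(self) -> list:
        # Only accounts with enough distinct reporters reach a human.
        return sorted(
            account
            for account, reporters in self.reports.items()
            if len(reporters) >= self.min_reports and account not in self.banned
        )

    def review(self, account: str, human_says_bot: bool) -> None:
        # Nothing is banned without an explicit human decision.
        if human_says_bot:
            self.banned.add(account)
        self.reports.pop(account, None)


queue = ReviewQueue(min_reports=2)
queue.report("detector_a", "suspect")
queue.report("detector_a", "suspect")       # duplicate, ignored
queue.report("detector_b", "suspect")
queue.report("detector_a", "regular_user")  # a single report, stays below threshold
pending = queue.pending_review()
```

With these inputs only `suspect` ends up in `pending`; the false flag on `regular_user` never reaches a reviewer unless a second, independent reporter agrees.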
It’s reporting activity, not banning people (or bots)
Are you willing to sift through all the reports?
Cause that’s gunna be A LOT of work
Let AI do it! See? Easy!
Fundamentally the problem only has temporary solutions unless you have some kind of system that makes using bots expensive.
One solution might be to use something like FIDO2 USB security tokens, assuming those tokens cost like 5€. Instead of using an email you can create an account that is anonymous (assuming the tokens are sold anonymously) and requires a small cost investment. If you get banned you need to buy a new FIDO2 token.
PS: FIDO tokens still cost too much, but you can also make your own with a Raspberry Pi Pico 2 and just overwrite it to make a new key. So this is no solution either without some trust network.
Make your own bot account that randomly (or not randomly) posts something bots will reply to, preferably something that triggers a system-level response. Last I was looking at bots they were simply programs, with dev commands that can return information on things like system resources or OS version. Your bot posts the commands built in by the bot app’s dev, the bots reply like bots do with their version, system resources, or whatever they have built in. Boom - banned instantly.
Some sort of “report as bot” --> required captcha pipeline would be useful
Captcha is already mostly machine breakable, I’ve seen some new interesting pattern-based stuff but nothing that you couldn’t do image training against.
At some point not too far in the future you won’t be able to use captcha to stop bots from posting. It simply won’t even be a hurdle, a couple extra pennies of computational power.
There’s probably some power in detecting accounts that are blocked by many people. The problem is no matter what we do we’re heading towards blocking them with an algorithm or AI. And I’d hate to see that for Lemmy.
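For what it's worth, the "blocked by many people" signal is easy to sketch. This is a toy Python version under my own assumptions: the `(blocker, blocked)` pair format and the `min_fraction` cutoff are invented here, and the output is meant as a list of candidates for someone to look at, not an automatic ban list.

```python
from collections import defaultdict


def widely_blocked(blocks, active_users, min_fraction=0.2):
    """Return accounts blocked by at least min_fraction of active users.

    blocks: iterable of (blocker, blocked) pairs.
    active_users: set of account names whose blocks are counted.
    """
    blockers = defaultdict(set)
    for blocker, blocked in blocks:
        # Ignore self-blocks and blocks from accounts outside the active set.
        if blocker in active_users and blocker != blocked:
            blockers[blocked].add(blocker)
    cutoff = min_fraction * len(active_users)
    return {account for account, who in blockers.items() if len(who) >= cutoff}


active = {"ann", "bob", "carl", "dana", "eve"}
blocks = [
    ("ann", "spammy"),
    ("bob", "spammy"),
    ("ann", "spammy"),       # repeat block from the same user counts once
    ("carl", "grumpy_guy"),  # one personal block is not a signal
]
candidates = widely_blocked(blocks, active, min_fraction=0.4)
```

Counting distinct blockers (a set, not a tally) matters: otherwise one person, or one sockpuppet, could push anyone over the cutoff.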
This place is just the stuff you follow with the raw up and down votes. We don’t hide unpopular posts making brigading less useful.
I feel like the real answer is and has been for a long time some sort of distributed moderation system. Any individual user can take moderation actions. These actions produce visible effects for themself, and to anyone who subscribes to their actions. Create bot users who auto-detect certain types of behavior (horrible stuff like cp or gore) and take actions against it. Auto-subscribe users to the moderation actions of the global bots and community leaders (mods/admins) and allow them to unsubscribe.
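A minimal sketch of how that subscription model could work, in Python. The `User` class and `visible_posts` function are hypothetical names for illustration only; "hide a post" stands in for any moderation action.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    hidden: set = field(default_factory=set)         # post ids this user has hidden
    subscriptions: set = field(default_factory=set)  # names of moderators they follow


def visible_posts(viewer, users, post_ids):
    """Posts the viewer sees: anything not hidden by the viewer
    or by an account whose moderation actions they subscribe to."""
    hidden = set(viewer.hidden)
    for name in viewer.subscriptions:
        hidden |= users[name].hidden
    return [post for post in post_ids if post not in hidden]


users = {
    "global_bot": User("global_bot", hidden={2}),  # auto-detector hid post 2
    "alice": User("alice", hidden={3}, subscriptions={"global_bot"}),
    "bob": User("bob"),                            # unsubscribed from everything
}
alice_view = visible_posts(users["alice"], users, [1, 2, 3])
bob_view = visible_posts(users["bob"], users, [1, 2, 3])
```

Here alice sees only post 1 (her own action plus the bot's), while bob, who opted out, still sees all three. Nothing is deleted globally; each action only affects the people who chose to follow it.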
We’d probably still need some moderation actions to be absolute and global, though, like banning illegal content.
The indieweb already has an answer for this: Web of Trust. Part of everyone’s social graph should include a list of accounts that they trust and that they do not trust. With this you can easily create some form of ranking system where bots get silenced or ignored.
A system like that sounds like it could be easily abused/manipulated into creating echo chambers of nothing but agreed-to right-think.
That would be only true if people only marked that they trust people that conform with their worldview.
which already happens with the stupid up/downvote system.
Where popular things, not right things, frequently get uplifted.
Well, I am on record saying that we should get rid of one-dimensional voting systems so I see your point.
But if anything, there is nothing stopping us from using both metrics (and potentially more) to build our feed.
Yeah, the up/down system is what prompted lots of bots to get created in the first place, because it leads to super easy post manipulation.
Get rid of it and go back to how web forums used to be. No upvotes, No downvotes, no stickers, no coins, no awards. Just the content of your post and nothing more. So people have to actually think and reply, rather than joining the mindless mob and feeling like they did something.
I was thinking about something like this but I think it’s ultimately not enough. You have essentially just two possible end stages for this:
- you only trust people that you have personally met and whose private key you verified directly, and then you will see only posts/interactions from like 15 people. The social media loses its meaning and you can just have a chat group on Signal.
- you allow some length of chains (you trust people [that are trusted by the people]^n that you know), but if you include enough people for social media to make sense then you will eventually end up with someone poisoning your network by trusting a bot (which can trust other bots…), so that wouldn’t work unless you keep doing moderation similar to now.
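The second case is easy to see with a toy graph. This Python sketch (the graph and all account names are invented) does a breadth-first walk over "trusts" edges, stopping after `max_hops`, and shows how one careless link lets a whole bot cluster in once the chain gets long enough.

```python
from collections import deque


def trusted_within(trust, me, max_hops):
    """Accounts reachable from `me` by following at most max_hops trust edges."""
    hops = {me: 0}
    queue = deque([me])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand past the allowed chain length
        for other in trust.get(current, ()):
            if other not in hops:
                hops[other] = hops[current] + 1
                queue.append(other)
    hops.pop(me)
    return set(hops)


# Hypothetical network: carl trusted one bot, and bots trust each other.
trust = {
    "me": ["ann", "bob"],
    "bob": ["carl"],
    "carl": ["bot1"],
    "bot1": ["bot2", "bot3"],
}
circle_n1 = trusted_within(trust, "me", 1)
circle_n3 = trusted_within(trust, "me", 3)
circle_n4 = trusted_within(trust, "me", 4)
```

With n=1 the circle is just ann and bob. At n=3 the first bot is in, and at n=4 it has pulled in every bot it trusts.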
i would be willing to buy a wearable physical device (like a yubikey) that could be connected to my computer via a bluetooth interface and act as a fido2 second factor needed for every post, but instead of having just a button (like on the yubikey) it would only work if monitoring of my heart rate or brainwaves checked out.
The way I imagine it working is if I notice a bot in my web, I flag it, and then everyone involved in approving the bot loses some credibility. So a bad actor will get flushed out. And so will your idiot friend that keeps trusting bots, so their recommendations are then mostly ignored.
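Roughly what I mean, as a Python sketch. The numbers (`penalty=0.5`, halving per step back, two levels deep) are arbitrary assumptions, and `flag_bot` is just a name I picked; the idea is only that flagging a bot costs everyone who vouched for it, with the direct vouchers paying the most.

```python
def flag_bot(bot, vouchers, credibility, penalty=0.5, max_depth=2):
    """Lower the credibility of accounts that vouched for `bot`.

    vouchers: dict mapping an account to the set of accounts that vouched for it.
    credibility: dict of account -> score in [0, 1], updated in place.
    Accounts missing from `credibility` are assumed to start at 1.0.
    """
    credibility[bot] = 0.0
    frontier, seen = {bot}, {bot}
    for depth in range(1, max_depth + 1):
        # Step one level back up the vouching chain.
        frontier = {v for account in frontier for v in vouchers.get(account, ())} - seen
        # Direct vouchers lose `penalty`; each step further back loses half as much.
        loss = penalty / (2 ** (depth - 1))
        for voucher in frontier:
            credibility[voucher] = credibility.get(voucher, 1.0) * (1 - loss)
        seen |= frontier
    return credibility


vouchers = {
    "bot_a": {"careless_friend"},
    "bot_b": {"careless_friend"},
    "careless_friend": {"ann"},
}
scores = {}
flag_bot("bot_a", vouchers, scores)
after_first_flag = dict(scores)
flag_bot("bot_b", vouchers, scores)
```

After the first flag the careless friend is at 0.5 and ann at 0.75. After the second, the friend drops to 0.25, so repeat offenders fade out quickly while people further up the chain take a smaller hit.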
that is an interesting idea. still… you can create an account (or have a troll farm of such accounts) that will mainly be used to trust bots and when their reputation goes down you throw them away and create new ones. same as you would do with traditional troll accounts… you made it one step more complicated but since the cost of creating bot accounts is essentially zero it doesn’t help much.
Just add “account age” to the list of metrics when evaluating their trust rank. Any account that is less than a week old has a default score of zero.
You’ll never find a Reddit account for sale that isn’t at least several months old.
Ok, which part of “multiple metrics” is not clear here?
Every risk analysis will have multiple factors. The idea is not to always have an absolute perfect ranking system, but to build a classifier that is accurate enough to filter most of the crap.
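To show what "multiple factors" could look like, here is a deliberately crude Python sketch. The three signals, the weights (0.3 / 0.4 / 0.3) and the one-week cutoff are all hand-picked assumptions for illustration; a real filter would learn its weights from labeled data the way spam filters do.

```python
from dataclasses import dataclass


@dataclass
class AccountSignals:
    age_days: float          # how old the account is
    trust: float             # 0..1, e.g. a score coming out of a web of trust
    blocked_fraction: float  # 0..1, share of active users who block the account


def bot_risk(signals, min_age_days=7.0):
    """Combine several weak signals into one risk score in [0, 1]."""
    if signals.age_days < min_age_days:
        age_risk = 1.0  # accounts under a week old get no benefit of the doubt
    else:
        age_risk = min_age_days / signals.age_days  # fades as the account ages
    return (
        0.3 * age_risk
        + 0.4 * (1.0 - signals.trust)
        + 0.3 * signals.blocked_fraction
    )


regular = AccountSignals(age_days=300, trust=0.8, blocked_fraction=0.0)
bought = AccountSignals(age_days=300, trust=0.0, blocked_fraction=0.5)
brand_new = AccountSignals(age_days=1, trust=0.0, blocked_fraction=0.0)
risk = {name: bot_risk(s) for name, s in
        [("regular", regular), ("bought", bought), ("brand_new", brand_new)]}
```

Note that the purchased months-old account still scores far worse than the regular user: age alone buys it almost nothing, because nobody trusts it and lots of people block it. No single metric has to be unbeatable.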
Email spam filters are not perfect, but no one’s inbox is drowning in useless crap like we used to have 20 years ago. Social media bots are presenting the same type of challenge, why can’t we solve it in the same way?
I didn’t read very far up into the thread. Sorry.
Automated filters will just drive determined botters to play the system and perfect their craft until they can no longer be automatically identified, in my opinion. I’m more of the stance that accounts should be reviewed manually so that a leap into convincing bot accounts will need to be much more dramatic, and therefore difficult. If it’s done the hard way from the start with staff who know how to identify these accounts, it may keep it from growing into an issue to begin with.
Any threshold to be automatically flagged for review should be relatively low, but the process should also be quick and efficient. Adding more metrics to the flagging process only means botters will have a narrower gaze to avoid. Once they start crunching the numbers and streamline mimicking real user accounts it’s game over.
But those bots don’t have any intersection with my network, so their trust score is low.
If they do connect via one of my idiot friends, that friend loses credit, too, and the system can trust his connections less.
The trust level is from my perspective, not global.
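Here is a small Python sketch of "from my perspective, not global". It assumes each trust edge has a weight between 0 and 1 and that the trust of a chain is the product of its weights, which is my simplification, not an established standard. It is a max-product variant of Dijkstra's algorithm; names and weights are invented.

```python
import heapq


def personal_trust(edges, me):
    """Best-path trust from `me` to every account reachable from `me`.

    edges: dict of account -> {other_account: weight in (0, 1]}.
    """
    best = {me: 1.0}
    heap = [(-1.0, me)]  # negate scores so the highest trust pops first
    while heap:
        negative_score, current = heapq.heappop(heap)
        score = -negative_score
        if score < best.get(current, 0.0):
            continue  # stale heap entry, a better path was already found
        for other, weight in edges.get(current, {}).items():
            candidate = score * weight
            if candidate > best.get(other, 0.0):
                best[other] = candidate
                heapq.heappush(heap, (-candidate, other))
    return best


edges = {
    "me": {"ann": 0.9, "careless_friend": 0.5},
    "careless_friend": {"bot": 0.5},
    "troll_farm": {"bot": 1.0},  # no path from me, so it counts for nothing
}
before = personal_trust(edges, "me")

# After I catch the bot, I trust my careless friend's judgment less.
edges["me"]["careless_friend"] = 0.1
after = personal_trust(edges, "me")
```

The troll farm vouching for the bot at full weight changes nothing, because no chain leads from me to the troll farm. The bot only shows up at all through my careless friend, and once I downgrade that friend its score in my view drops from 0.25 to 0.05.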
Why does it have to be one or the other?
Why not use all these different metrics to build a recommendation system?
Every time I see this implemented, it always seems like screwing over the end user who is trying to join for the first time. Platforms like reddit and Tumblr benefit from a friction-free sign up system.
Imagine how challenging it is for someone joining Lemmy for the first time and suddenly having to provide trust elements like answering a few questions, or getting someone to vouch for them.
They’ll run away and call Lemmy a walled garden.
Platforms like Reddit and Tumblr need to optimize for growth. We need to have growth, but we do not need to be optimized for it.
Yeah, things will work like a little elitist club, but all newcomers need to do is find someone who is willing to vouch for them.
My instance requires that users say a little about why they want to join. Works just fine.
If someone isn’t willing to introduce themselves, why would they even want to register? If they just want to lurk, they can do so anonymously.
EDIT I just noticed we’re from the same instance lol, so you definitely know what I’m talking about 😆
lol reddit isn’t friction free anymore, most subs want you to wait weeks or months before you post.
Same story, no experience, need work for experience, can’t get work without experience.
Platforms like reddit and Tumblr benefit from a friction-free sign up system.
Even on Reddit new accounts are often barred from participating in discussion, or even shadowbanned in some subs, until they’ve grinded enough karma elsewhere (and consequently, that’s why you have karmafarming bots).
How would I join a community without knowing anyone with that setup?
No current social network can be bot-proof. And Lemmy is in the most unprotected situation here, saved only by its low fame. On Twitter, I personally have already banned about 15000 Russian bots, but that’s less than 1% of the existing ones. I’ve seen the heads of bots with 165000 followers. Just imagine all 165000 registering accounts on Lemmy: there is nothing to oppose them. I used to develop a theory for a new social network, where bots could exist as much as they want, but could not influence your circle of subscriptions and subscribers. But it’s complicated…
Also, the “bot”/“human” distinction doesn’t have to be binary. Say one has an account that mostly has a bot post generated text, but then if it receives a message, hands it off to a human to handle. Or has a certain percentage of content be human-crafted. That may potentially defeat a lot of approaches for detecting a bot.
This is another reason why a lack of transparency with user votes is bad.
As to why it is seemingly done randomly in reddit, it is to decrease your global karma score to make you less influential and to discourage you from making new comments. You probably pissed off someone’s troll farm in what they considered an influential subreddit. It might also interest you that reddit was explicitly named as part of a Russian influence effort here: https://www.justice.gov/opa/media/1366201/dl - maybe some day we will see something similar for other obvious troll farms operating in Reddit.
Keep the user base small and fragmented
If bots have to go to thousands of websites/instances to reach their targets then they lose their effectiveness
Thankfully we can federate bot posts to make that easier :P
To help fight bot disinformation, I think there needs to be an international treaty that requires all AI models/bots to disclose themselves as AI when prompted using a set keyphrase in every language, and that API access to the model be contingent on passing regular tests of the phrase (to keep bad actors from simply filtering out that phrase in their requests to the API).
It wouldn’t stop the nation-state level bad actors, but it would help prevent people without access to their own private LLMs from being able to use them as effectively for disinformation.
Considering you can run LLMs on off-the-shelf hardware, that’s going to be as enforceable as piracy is…
I can download a decent size LLM such as Llama 3.1 in under 20 seconds then immediately start using it. No terminal, no complicated git commands, just pressing download in a GUI.
They’re trivial to run yourself. And most are open source.
I don’t think this would be enforceable at all.
How do we even fix this issue or prevent it from affecting Lemmy??
Simple. Just scream that everyone whose opinion you dislike is a bot.
I disagree with this statement, so Ensign_Crab must be a bot. Reported.
I admit I’ve been guilty of this in the past, so sarcasm aside I cannot recommend this as a strategy for detecting actual bots … even though if you’re parroting the opinion those who have power & control bots wish you to believe, expressing that opinion makes one’s post functionally equivalent to that of a bot. I KNOW, SUE ME 🤷‍♂️
I cannot recommend this as a strategy for detecting actual bots
That’s because it isn’t one. It’s a means by which people attempt to impose orthodoxy.
Which is expensive right? Like it’s not cheap to call that API
You can’t get rid of bots, nor spammers. The only thing you can do is have a more aggressive automated punishment system, which will inevitably also punish good users along with the bad users.
As others said, you can’t prevent them completely. Only partially. You do it in four steps:
- Make it unattractive for bots.
- Prevent them from joining.
- Prevent them from posting/commenting.
- Detect them and kick them out.
The sad part is that, if you go too hard with bot eradication, it’ll eventually inconvenience real people too. (Cue to Captcha. That shit is great against bots, but it’s cancer if you’re a human.) Or it’ll be laborious/expensive and not scale well. (Cue to “why do you want to join our instance?”).
Actual human content will never be undesirable for bots who must vacuum up content to produce profit. It’ll always be attractive to come here. The rest sound legit strategies though
Bots can view content without being able to post, which is what people are aiming to cut down. I don’t super care if bots are vacuuming up my shitposts (even my shit posts), but I don’t particularly want to be in a community that’s overrun with bots posting.
Yeah, after all, we post on the internet for it to be visible by everyone, and that includes bots. If we didn’t want bots to find our content, then other humans couldn’t find them either; that’s my stance on this.
You’re right that it won’t be completely undesirable for bots, ever. However, you can make it less desirable, to the point that the botters say “meh, who cares? That other site is better to bot”.
I’ll give you an example. Suppose the following two social platforms:
- Orange Alien: large userbase, overexcited about consumption, people get banned for mocking brands, the typical user is tech-illiterate enough to confuse your bot with a human.
- White Rat: Small userbase, full of communists, even the non-communists tend to outright mock consumption, the typical user is extremely tech-savvy so they spot and report your bot all the time.
If you’re a botter advertising some junk, you’ll probably want to bot in both platforms, but that is not always viable - coding the framework for the bots takes time, you don’t have infinite bandwidth and processing power, etc. So you’re likely going to prioritise Orange Alien, you’ll only bot White Rat if you can spare it some effort+resources.
The main issue with point #1 is that there’s only so much room to make the environment unattractive to bots before doing it for humans too. Like, you don’t want to shrink your userbase on purpose, right? You can still do things like promoting people to hold a more critical view, teaching them how to detect bots, asking them to report them (that also helps with #4), but it only goes so far.
[Sorry for the wall of text.]
This is the sort of thoughtful reasoning that I’m glad to see here, so a wall of text was warranted! Thanks for taking the time to add to the discussion 👍🙏