OpenAI has developed technology to reliably detect AI-generated text, according to inside sources and documents reported by the Wall Street Journal. However, the company is reluctant to release it, likely due to concerns about its own business model.
The issue with that is: Releasing nothing is even worse than releasing something that could be circumvented. I don’t see this as a valid argument.
I’m not an expert on text watermarking or on how much it degrades output. But if they want some stealthy solution that isn’t known to the public… Maybe they could attach two watermarks: a simple one that is known to everyone, and an additional, secret one only they know about. It’d be similar to what we do with banknotes. There are some characteristics everyone knows and can use to judge if it’s fake money. And there are some additional secret markings in banknotes that only the central bank knows about.
I’m pretty sure a similar thing could be done here. Maybe not for a 280-character tweet. But certainly for other use cases with longer texts. And if it has a 0% false positive rate, every match helps someone. Even if it’s circumventable. I think even a non-perfect solution that helps several thousand people is better than helping no one.
I agree with not releasing it, but I do find that it defeats the purpose of talking about it. If you have it but aren’t sharing it, what’s the point of having it?