After much debate, the new policy is in effect: Wikipedia authors are not allowed to use LLMs for generating or rewriting article content. There are two primary exceptions, though.
First, editors can use LLMs to suggest refinements to their own writing, as long as the edits are checked for accuracy. In other words, it’s being treated like any other grammar checker or writing assistance tool. The policy says, “LLMs can go beyond what you ask of them and change the meaning of the text such that it is not supported by the sources cited.”
The second exception is translation assistance. Editors can use AI tools for a first pass at translating text, but they still need to be fluent enough in both languages to catch errors. As with writing refinements, anyone using LLMs also has to check that incorrect information hasn’t been injected.
Saved you a click:
Treating it like a tool instead of treating it like a God. What a novel idea!
AIbros: we’re creating God!!!
AI users: it can do translation & reformatting pretty well, but you’ve got to check it’s not chatting shit
The takeaway from all LLM-based AI is that the user needs to be smart enough to do whatever they’re asking anyway. All output needs to be verified before being used or relied upon.
The “AI” is just streamlining the process to save time.
Relying on it for more than that is stupid and instantly proves that you are incompetent.
I’m gonna say that’s ideal but not strictly necessary. What’s needed is that the user is capable of properly verifying the output. Anyone who could do the task themselves certainly can, but the pool is broader than that: verifying a result is an easier skill than obtaining it. Think of how film critics don’t need to be filmmakers, or of the P vs NP question in computer science, where checking a proposed solution is believed to be easier than finding one.
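To make that P vs NP analogy concrete, here’s a minimal Python sketch (mine, not from the policy discussion) using subset-sum: checking a claimed answer takes linear time, while finding one by brute force takes exponential time.

```python
# Verification vs. creation, in the P vs NP sense: checking a claimed
# subset-sum answer is cheap; searching for one from scratch is not.
from itertools import chain, combinations

def verify(numbers, target, claimed):
    """Cheap: confirm `claimed` is drawn from `numbers` and sums to `target`."""
    pool = list(numbers)
    for x in claimed:
        if x not in pool:
            return False
        pool.remove(x)
    return sum(claimed) == target

def solve(numbers, target):
    """Expensive: brute-force every subset (exponential in len(numbers))."""
    for subset in chain.from_iterable(
        combinations(numbers, r) for r in range(len(numbers) + 1)
    ):
        if sum(subset) == target:
            return list(subset)
    return None

nums = [3, 34, 4, 12, 5, 2]
answer = solve(nums, 9)                 # the "creator" does the search
print(answer, verify(nums, 9, answer))  # the "critic" just checks: [4, 5] True
```

The film critic is running `verify`; the filmmaker had to run `solve`.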
This is where domain expertise would come in, no? The LLM speeds up the work, but it usually outputs generic content, plus whatever else it injects while hallucinating. So the validation requirement holds up, I’d say.
But if the output has issues, what’re you going to do, prompt it again? If you are only able to verify but not do the task, you cannot correct the AI’s mistakes yourself.
If you’re unable to brute-force verification (research, testing, consulting the ancient texts), that’s where you stop what you’re doing and take a breath. Then consult an expert. Just like in the film critic analogy, it’s easier to verify than to create, so you’re saving the expert time and effort while learning about something you were obviously already passionate enough about to have started this endeavor.
Speaking as someone who codes: it’s not always easier to verify than to create.
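A quick illustration of that point (a sketch of mine, not the commenter’s): a binary search with a one-character bug that sails past casual spot checks. Writing it takes a minute; actually verifying it means reasoning about loop invariants, which is the harder part.

```python
# Verifying code can be harder than writing it: this looks right and
# passes casual spot checks, but a one-character bug remains.
def binary_search(items, target):
    """Return an index of target in sorted items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo < hi:                        # bug: should be `lo <= hi`
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

print(binary_search([1, 3, 5, 7], 5))   # 2  -- looks correct
print(binary_search([1, 3, 5, 7], 7))   # -1 -- wrong: 7 is at index 3
```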
At the risk of sounding like an overly obsequious AI… You know what, you’re completely right. I’m honestly not sure what use case I was imagining when I wrote that last comment.
You were thinking logically about a normal production chain. In that case, QA or whoever says “This is wrong, rework it and correct the issue,” and that’s that. With AI, it redoes the whole thing and may come back with the same issue, or an entirely new one.
To save you another few clicks: this is the discussion (RfC) that implemented the changes, and the policy is linked at the top.
Seems pretty reasonable to use it as a grammar checker. As long as it’s not changing content, just form or readability, that’s a decent use for it, at least for a purely educational resource like Wikipedia.
So, it should be used reasonably, as it should have always been.
Seems like there should be a third exception for occasions where the article is about LLM-generated text. Editors should be able to quote it when it’s appropriate for the article.
That is a reasonable exception to no-AI policies in research papers and newspaper articles, but not for Wikipedia. As a tertiary source, Wikipedia has a strict “no original research” policy. Using AI to provide examples of AI output would be original research, and should not be done.
Quoting AI output that’s already published in primary and secondary sources should be allowed, though, for that same reason.
Eh, that’s not quite original research. There are plenty of other examples of images and sound files created for Wikipedia. A representative example isn’t research, it’s just indicating what something is.
The Wikipedia articles on AI slop and generative AI include a few instances of representative content used to illustrate a sourced statement, as opposed to serving as evidence.
It’s similar to the various charts and animations.