Thanks for doing the groundwork to get these filters up and running. With limited volunteer time, we need automated tools like these to help address an automated problem. — Newslingertalk12:47, 17 July 2025 (UTC)
Idea lab: New CSD criteria for LLM content
RfC launched--thanks for the help revising the criterion!
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
There have been multiple proposals for a new CSD criterion for patently LLM-generated articles [1], but they failed to gain much traction due to understandable concerns about enforceability and redundancy with WP:G3.
This time, I am thinking of limiting the scope to LLM-generated text that was obviously not reviewed by a human. The criterion could include some of the more surefire WP:AITELLS, such as collaborative communication and non-existent references, which would have been weeded out if reviewed by a human. I think it would help to lower the high bar set by the WP:G3 (hoax) criterion and provide guidance on valid ways of detecting LLM generations and on what is and is not valid use of LLMs.
Here is my rough draft of the above idea; feedback is welcome.
A12. LLM-generated without human review
This applies to any article that obviously indicates that it was generated by a large language model (LLM) and no human review was done on the output. Indicators of such content include collaborative communication (e.g. "I hope this helps!"), non-existent references, and implausible citations (e.g. a source from 2020 being cited for a 2022 event). The criterion should not be invoked merely because the article was written with LLM assistance or because it has reparable tone issues.
Oppose. This is very vague and would see a lot of disagreement based on differing subjective opinions about what is and isn't LLM-generated, what constitutes a "human review" and what "tone issues" are repairable. Secondly, what about repairable issues that are not related to tone?
I could perhaps support focused, objective criteria that cover specific, identifiable issues, e.g. "non-existent or implausible citations" rather than being based on nebulous guesses about the origin (which will be used to assume bad faith of the contributor, even if the guess was wrong). Thryduulf (talk) 01:21, 18 July 2025 (UTC)
If it's limited to only cases where there is obvious WP:AITELLS#Accidental disclosure or implausible sources, it could be fine. Otherwise I agree with Thryduulf about the vagueness; an editor skimming through the content but not checking any of the sources counts as a "human review". And sources that may seem non-existent at first glance might in fact exist. I think the "because it has reparable tone issues" part should go as well, since if it's pure LLM output, we don't want it even if the tone is fine. Jumpytoo Talk 04:33, 18 July 2025 (UTC)
Ca, I am very supportive of anything that helps reduce precious editor time wasted on content generated by LLMs that cannot be trusted. For a speedy deletion criterion, I think that we would need a specific list of obvious signs of bad LLM generation, something like:
collaborative communication
for example, "I hope this helps!"
knowledge-cutoff disclaimers
for example, "Up to my last training update"
prompt refusal
for example, "As a large language model, I can't..."
non-existent / invented references
for example, books whose ISBNs raise a checksum error (see the checksum sketch just after this list), unlisted DOIs
implausible citations
for example, a source from 2020 being cited for a 2022 event
And only those signs may be used to nominate for speedy deletion. Are there others? Maybe those very obvious criteria that are to be used could be listed at the top of WP:AISIGNS rather than within the CSD documentation, to allow for future updating. The other thing that comes to mind with made-up sources or implausible citations is, how many of them must there be to qualify for speedy deletion? What if only one out of ten sources was made up? Cheers, SunloungerFrog (talk) 09:48, 18 July 2025 (UTC)
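On the ISBN point above: here is a minimal sketch, in Python, of the arithmetic behind "raise a checksum error". It is illustrative only, assumes a bare ISBN-13, and is not a tool this project actually runs; note also that roughly one in ten invented ISBN-13s will still pass the check by chance, so a passing checksum proves nothing on its own.

def isbn13_checksum_ok(isbn: str) -> bool:
    # Illustrative helper only: keep the digits, then apply the standard
    # ISBN-13 weighting of 1, 3, 1, 3, ... to the first twelve of them.
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False  # not an ISBN-13 (ISBN-10 uses a different, mod-11 check)
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]

# A well-known valid example passes; flipping its final digit fails.
print(isbn13_checksum_ok("978-0-306-40615-7"))  # True
print(isbn13_checksum_ok("978-0-306-40615-8"))  # False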
Regarding the number of sources, I don't think it matters – editors are expected to have checked all the sources they cite, and using AI shouldn't be an excuse to make up sources. If even one source is made up, we can't guarantee that the other sources, even if they do exist, support all the claims they are used for. Chaotic Enby (talk · contribs) 10:06, 18 July 2025 (UTC)
I'd be very happy with that. I only mentioned it because I imagine there might be a school of thought that would prefer more than one source to be made up, to cement the supposition that the article is an untrustworthy LLM generation. Cheers, SunloungerFrog (talk) 11:21, 18 July 2025 (UTC)
If someone deliberately makes up an entire source, that's just as much of an issue in my opinion. In both cases, all the sources will need to be double-checked, as there's no guarantee anymore that the content is in any way consistent with the sources. I wouldn't be opposed to expanding G3 (or the new proposed criterion) to include all cases of clear source fabrication by the author, AI or not. Chaotic Enby (talk · contribs) 11:42, 18 July 2025 (UTC)
I would also support it, but only for issues that can only plausibly be generated by LLMs and would have been removed by any reasonable human review. So, stylistic tells (em-dashes, word choices, curly apostrophes, Markdown) shouldn't be included. It is reasonably plausible that an editor unfamiliar with the MOS would try to type Markdown syntax or curly apostrophes, or keep them in an AI output they double-checked. It is implausible that they would keep "Up to my last training update". I would also tend to exclude ISBN issues from the list of valid reasons, as it is possible that an ISBN might be mistyped by an honest editor, or refer to a different edition. However, if the source plainly doesn't exist at all, it should count. Editors should cross-check any AI-generated output against the sources it claims to have used. Chaotic Enby (talk · contribs) 10:04, 18 July 2025 (UTC)
The main issue with strict tells is that they may change over time as LLMs update. They'll probably change at a slow enough rate, and within other constraints, that editors would be able to stay mostly abreast of them, but I'm not sure CSD criteria could keep up. What may help, with or without a CSD, is perhaps a bit of expansion at the WP:TNT essay on why LLM-generated articles often need to be TNTed, which helps make clear the rationale behind any PROD, CSD, or normal MFD. CMD (talk) 10:20, 18 July 2025 (UTC)
I think a lot of the WP:TNT-worthy AI issues (dead-on-arrival citations, generic truthy content attached to unrelated citations, malformed markup, etc.) can be addressed by just removing the AI content, then seeing if the remaining content is enough to save the article from WP:A3/WP:A7/etc. -- LWG talk 16:16, 18 July 2025 (UTC)
If the article is generated by AI, then it is all AI content. Removing the AI content would be TNT. CMD (talk) 16:57, 18 July 2025 (UTC)
The ideal procedure on discovering something like this is:
Remove all the actively problematic content that can only be fixed by removal (e.g. non-existent and/or irrelevant citations)
Fix and/or remove any non-MediaWiki markup
Evaluate what remains:
If it is speedily deletable under an existing criterion (A1, A3, A7/A9, A11 and G3 are likely to be the most common), then tag it for speedy deletion under the relevant criterion
If it would be of benefit to the project if cleaned up, then either clean it up or mark it for someone else to clean up.
If it isn't speedily deletable but would have no value to the project even if cleaned up, or TNT is required then PROD or AfD.
If there are a lot of articles going to PROD or AfD despite this, then propose one or more new or expanded CSD criteria at WT:CSD that meet all four of the requirements at WP:NEWCSD. In all of this it is important to remember that whether it was written by AI or not is irrelevant - what matters is whether it is encyclopaedic content or not. Thryduulf (talk) 18:58, 18 July 2025 (UTC)
But I think that whether it's written by AI is relevant. On an article written by a human, it's reasonable to assume good faith. On an article written by an AI, one cannot assume good faith, because they are so good at writing convincing sounding rubbish, and so, e.g., the job of an NPP reviewer is hugely disproportionately more work, to winkle out the lies, than it took the creating editor in the first place to type a prompt into their LLM of choice. And that's the insidious bit, and why we need a less burdensome way to deal with such articles. Cheers, SunloungerFrog (talk) 19:16, 18 July 2025 (UTC)
If you are assuming anything other than good faith then you shouldn't be editing Wikipedia. If the user is writing in bad faith there will be evidence of that (and using an LLM is not evidence of any faith, good or bad) and so no assumptions are needed. Once text has been submitted there are exactly three possibilities:
The text is good and encyclopaedic how it is. In this situation it's irrelevant who or what wrote it because it's good and encyclopaedic.
The text needs some cleanup or other improvement but it is fundamentally encyclopaedic. In this situation it's irrelevant who or what wrote it because, when the cleanup is done (by you or someone else, it doesn't matter) it is good and encyclopaedic.
The text, even if it were cleaned up, would not be encyclopaedic. In this situation it's irrelevant who wrote it because it isn't suitable for Wikipedia either way. Thryduulf (talk) 19:38, 18 July 2025 (UTC)
I agree with your core point that content problems, not content sources, are what we should be concerned about, and my general approach to LLM content is what you described as the ideal approach above, but I would point out that assumption of good faith can only be applied to a human. In the context of content that appears to be LLM-generated, AGF means assuming that the human editor who used the LLM reviewed the LLM content for accuracy (including actually reading the cited sources) before inserting it in the article. If the LLM text has problems that any human satisfying WP:CIR would reasonably be expected to notice (such as the cited sources not existing or being irrelevant to the claims), then the fact that those problems weren't noticed tells me that the human didn't actually review the LLM content. Once I no longer have reason to believe that a human has reviewed a particular piece of LLM content, I have no reason to apply AGF to that content, and my presumption is that such content fails WP:V, especially if I am seeing this as a pattern across multiple edits for a given article or user. -- LWGtalk20:05, 18 July 2025 (UTC)
assumption of good faith can only be applied to a human - exactly, and I'm always delighted to apply AGF to fellow human editors. But not to ChatGPT or Copilot, etc. Cheers, SunloungerFrog (talk) 20:18, 18 July 2025 (UTC)
We have seen plenty of instances of good faith users generating extremely poor content. Good faith isn't relevant to the content, it's relevant to how the content creator (behind the llm, not the llm itself) is addressed. CMD (talk) 14:41, 19 July 2025 (UTC)
You should not be applying faith of any sort (good, bad, indifferent it doesn't matter) to LLMs because they are incapable of contributing in any faith. The human who prompts the LLM and the human who copies the output to Wikipedia (which doesn't have to be the same human) have faith, but that faith can be good or bad. Good content can be added in good or bad faith, bad content can be added in good or bad faith. Thryduulf (talk) 18:36, 19 July 2025 (UTC)
Support for articles composed of edits with indicators that are very strongly associated with LLM-generated content, such as the ones listed in WP:AISIGNS § Accidental disclosure and WP:AISIGNS § Markup. I would also apply the criterion to less obvious hoax articles that cite nonexistent sources or sources that do not support the article content, if the articles also contain indicators that are at least moderately associated with LLM-generated content, such as the ones listed in WP:AISIGNS § Style. — Newslingertalk21:34, 18 July 2025 (UTC)
Support: Using a model to generate articles is fast; reviewing and cleaning it up is slow. This asymmetry in effort is a genuine problem which this proposal would help address. There is also a policy hole of sorts: an unreviewed generated edit with fatal flaws made to an existing article can be reverted, placing the burden to carefully review and fix the content back on the original editor. An unreviewed generated edit with fatal flaws made to a new page cannot. Promo gets G11; I don't see why this shouldn't get a criterion also.
Also adding that assessing whether an article's prose is repairable or not, in the context of G11, is also a judgement call to some extent. So I don't believe that deciding whether issues are repairable should be a complete hurdle to a new criterion, although I still prefer to play it safe and restrict it to my stricter distinction above. Chaotic Enby (talk · contribs) 23:36, 18 July 2025 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
RfC workshop
Thanks for all the feedback! I have created a revised criterion with the areas of vagueness ironed out, incorporating wording proposed by User:Chaotic Enby and User:SunloungerFrog. I hope to finalize the criterion wording before I launch a formal RfC.
A12. LLM-generated without human review
This applies to any article that exhibits one or more of the following signs, which indicate that the article could only plausibly have been generated by a large language model (LLM)[1] and would have been removed by any reasonable human review:[2]
Communication intended for the user: This may include collaborative communication (e.g., "Here is your Wikipedia article on..."), knowledge-cutoff disclaimers (e.g., "Up to my last training update ..."), self-insertion (e.g., "as a large language model"), and phrasal templates (e.g., "Smith was born on [Birth Date].")
Implausible non-existent references: This may include external links that are dead on arrival, ISBNs with invalid checksums, and unresolvable DOIs. Since humans can make typos and links may suffer from link rot, a single example should not be considered definitive. Editors should use additional methods to verify whether a reference truly does not exist.
Nonsensical citations: This may include citations of incorrect temporality (e.g., a source from 2020 being cited for a 2022 event), DOIs that resolve to completely unrelated content (e.g., a paper on a beetle species being cited for a computer science article), and citations that attribute the wrong author or publication.
In addition to the clear-cut signs listed above, there are other signs of LLM writing that are more subjective and may also plausibly result from human error or unfamiliarity with Wikipedia's policies and guidelines. While these indicators can be used in conjunction with more clear-cut indicators listed above, they should not, on their own, serve as the sole basis for applying this criterion.
This criterion only applies to articles that would need to be fundamentally rewritten to remove the issues associated with unreviewed LLM-generated content. If only a small portion of the article exhibits the above indicators, it is preferable to delete the offending portion only.
^Here, "reasonable human review" means that a human editor has 1) thoroughly read and edited the LLM-generated text and 2) verified that the generated citations exist and verify corresponding content. For example, even a brand new editor would recognize that a user-aimed message like "I hope this helps!" is wholly inappropriate for inclusion if they had read the article carefully. See also Wikipedia:Large language models.
I don't agree with the last section requiring that articles need to be "fundamentally rewritten to remove the issues associated with unreviewed LLM-generated content"; it largely negates the utility of the criterion. If there are strong signs that the edits which introduced the content were not reviewed, that should be enough; otherwise it is again shifting the burden to other editors to perform review and fixes on what is raw LLM output. A rough alternate suggestion:
"This criterion only applies to articles where, according to the above indicators, a supermajority of their content is unreviewed LLM-generated output. If only a small portion of the article indicates it was unreviewed, it is preferable to delete the offending portion only." (struck as redundant and possibly confusing) fifteen thousand two hundred twenty four (talk) 16:46, 19 July 2025 (UTC)
I agree that if content shows the fatal signs of unreviewed LLM use listed above then we shouldn't put the onus on human editors to wade through it to see if any of the content is potentially salvageable. If the content is that bad, it's likely more efficient to delete the offending content and rewrite quality content from scratch. So we lose nothing by immediate deletion, and by requiring a larger burden of work prior to nomination we increase the amount of time this bad content is online, potentially being mirrored and contributing to citogenesis. LLM content is already much easier to create and insert than it is to review, and that asymmetry threatens to overwhelm our human review capacity. As one recent example, it took me hours to examine and reverse the damage done by this now-blocked LLM-using editor even after I stopped making any effort to salvage text from them that had LLM indicators. Even though that user wasn't creating articles and therefore wouldn't be touched by this RFC, that situation illustrates the asymmetry of effort between LLM damage and LLM damage control that necessitates this kind of policy action. -- LWGtalk17:21, 19 July 2025 (UTC)
I would also like to suggest an indicator for usage of references that, when read, clearly do not support their accompanying text. I've often found model output can contain references to real sources that are broadly relevant to the topic, but which obviously do not support the information given. Pervasive use of these in an article, not just of the other common signs, is a very strong indicator of unreviewed model-generated text. Review requires reading sources, after all. fifteen thousand two hundred twenty four (talk) 17:27, 19 July 2025 (UTC)
However, I also have an issue with the proposal of only deleting the blatantly unreviewed portions. If the whole article was written at once, and some parts show clear signs of not having been reviewed, there isn't any reason to believe that the rest of the article saw a thorough review. In that case, the most plausible option is that the indicators aren't uniformly distributed, instead of the more convoluted scenario where part of the AI output was well-reviewed and the rest was left completely unreviewed. Chaotic Enby (talk · contribs) 19:06, 19 July 2025 (UTC)
"I also have an issue with the proposal of only deleting the blatantly unreviewed portions ... " – Agree with this completely. I attempted to address this with my suggestion that "This criterion only applies to articles where, according to the above indicators, a supermajority of their content is unreviewed LLM-generated output." (I've now struck the second maladapted sentence as redundant and possibly confusing.)
It deliberately doesn't ask that indicators be thoroughly distributed or have wide coverage, just that they exist and indicate a majority of the article is unreviewed, aka "the most plausible option" you mention. But the clarity is absolutely lacking and I'm not happy with the wording. Hopefully other editors can find better ways to phrase it. fifteen thousand two hundred twenty four (talk) 19:37, 19 July 2025 (UTC)
How about we simply remove the paragraph? I agree with the concerns raised here, and situations where it would apply would be extremely rare. I think that such exceptional circumstances can be left to common sense judgment. Catalk to me!08:19, 20 July 2025 (UTC)
It should be removed as CSD is for deletion. This CSD would not stop another editor coming in and rewriting the article, just as other CSDs do not. CMD (talk) 08:31, 20 July 2025 (UTC)
I was working on the unblock wizard, and on the preloads as fallbacks in case the unblock wizard does not work. If I knew all the links that use the help me preloads, I could reinstate my change and update them all to the new format. Alternatively, I can create a second preload template with parameters that can be filled in. Aasim (話す) 03:58, 20 July 2025 (UTC)
(This is just a note, not a chastisement) If you are changing a preload/script/etc, always do an insource: search for the page name. Always interesting the places people use things. Primefac (talk) 09:38, 20 July 2025 (UTC)
Oh my God. That is linked dozens of times. This should be protected as high risk. I will create a separate template to allow for use by the unblock wizard. Aasim (話す) 13:00, 20 July 2025 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Wikipedia:Article_wizard/CommonMistakes has a list of practices to avoid when creating articles. I wonder whether we might add another bullet to discourage LLM use. Something like:
Using AI to write articles Although large language models like ChatGPT might create articles that look OK on the surface, the content they generate is untrustworthy. Ideally, don't use them at all.
That would definitely help, especially with the amount of AI content we've been seeing at AfC. The "look OK" part might be a bit too informal, maybe it could be replaced by might create articles that appear well-written? Chaotic Enby (talk · contribs) 14:22, 20 July 2025 (UTC)
With look OK I had intended to encompass both nice prose and decent sourcing, and I wonder whether your wording, Chaotic Enby, leans towards the former rather than the latter? But that is maybe dancing on the head of a pin, and I'm happy enough with the suggested amendment. Cheers, SunloungerFrog (talk) 15:00, 20 July 2025 (UTC)
Seems a good idea in general to have some sort of advice, many are not aware of the potential problems in llm output. CMD (talk) 15:03, 20 July 2025 (UTC)
There is a danger associated with this type of general warning. Some new editors have not used AI to write articles because they have not thought of this possibility. So the warning could have the opposite effect on some of them by making them aware. Phlsph7 (talk) 17:09, 20 July 2025 (UTC)
That is a fair point. I suppose I'd rebut it by noting that the Article Wizard page in question has warnings about several undesirable article creation practices (COI, copyvio, puffery, poor sourcing) and all I'm proposing is that we add another to those. If we were concerned that such warnings would cause editors to exhibit such behaviours, I imagine we would not have the page at all? My sense - not objective - is that a warning about using LLMs would deter well-meaning editors, who might have used them thoughtlessly, from using them at all, and that that would be a net benefit. Cheers, SunloungerFrog (talk) 06:17, 21 July 2025 (UTC)
Support this point looks good. I do think we should discourage the use of generative AI for article creation. There are good uses for copyediting and wording fixes but otherwise generating something from scratch is not a good idea. Aasim (話す) 17:28, 21 July 2025 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
User:EditorSage42 popped up in March of this year. Initially, they didn't seem to be using AI to create articles and talk comments. The first contribution that seems suspicious is here 1; usually you would see someone write a paragraph, but this seems to be AI-generated. Sure enough, when you run it through ZeroGPT it comes out as 35% AI-generated. Again, we see [another contribution] with lists that it doesn't seem like a person would incorporate. "Alternative: If retention is preferred, consider merger with Probability theory or related articles per WP:MERGE." clearly hasn't read the Probability theory article; there aren't many mentions of researchers in it. Here you see the same person badgering some poor person who rightly voted keep. There are many signs of AI use here: "fundamental issue", "Most tellingly", "You're right to call out my errors, and I apologize for the repetitive approach. However". AI often doubles down after apologizing. Most importantly, you see EditorSage submitting two of the same articles about AI book generation here and here. The second one wasn't even submitted, they just created it outright. Basically, the third edit on this account was done using AI. Then, almost every edit except the ones that were just adding links was AI-generated; you can run any of their responses through GPTZero, Turnitin, or whatever and you will see that the response is completely AI-generated at worst or polished with AI at best. The editing pattern for both of the AI literature articles also seems suspicious, because they were both just created in one fell swoop. The various articles that the user often cites suggest either AI generation or a very poor grasp of the reliable sources guideline. I think the correct action is to block this single-purpose account that continues to use AI, hallucinating various things and possibly violating copyright. Easternsahara (talk) 14:35, 21 July 2025 (UTC)
Request to review articles for AI hallucination issues
I work for Arabella Advisors, a D.C.-based consulting company, and I just posted a long message on the AA Talk page outlining glaring errors on the AA and New Venture Fund articles. Some of these errors seem to be indicative of AI hallucinations, as there are numerous instances where the cited sources don't support the footnoted claim. Is this something that experienced editors here could review? Any help would be appreciated. JJ for Arabella (talk) 19:25, 30 July 2025 (UTC)
I see no evidence that the concerns raised at Talk:Arabella Advisors stem from LLM usage. The 990 claim currently in the article is from 2020 [2] and appears based on an earlier incarnation of the claim that existed in the initial version of the page [3], which was removed [4] after a different coi editreq brought attention to the unreliable sources supporting it [5].
That is one raised issue, a review of the others does not indicate LLM usage either. Stating that the use of the term "Subsidiaries" vs "Clients" may be a hallucination is quite a leap, and the New Venture Fund lede sourcing problem can be easily attributed to WP:SYNTH (see the entry for Eric Kessler [6]). I see that you asked about the latter at the Teahouse without providing specific examples, but Cullen328 still advised that such errors can stem from original research[7].
Thank you for the quick response, Fifteen thousand two hundred twenty four. Your argument that the factual inaccuracies and citation errors that I flagged don't stem from LLM usage makes sense and is honestly reassuring. If these issues are simply a reflection of sloppy research then hopefully they can be addressed by reviewing editors. Thank you again for your response and sorry for the false alarm! JJ for Arabella (talk) 14:31, 31 July 2025 (UTC)
If there was one thing with overwhelming consensus in that RFC, it was that AI generated images should not be used to depict actual real-world people under any circumstances. -- LWGtalk20:09, 2 August 2025 (UTC)
I am concerned about the edits by User:EncycloSphere, and left a talk page message for them. That editor's response stated that AI was used in the drafting, but "content quality and sources matter more than method". I also discussed this here with User:Chaotic Enby. My concern is the unencyclopedic tone of the enormous edits being made. Thank you! --Magnolia677 (talk) 10:14, 2 August 2025 (UTC)
Could someone look at the contributions of MaineMax04843 and confirm/infirm my suspicions?
I found clear evidence of AI slop in their recent contributions, and I suspect many if not most of their older ones also include AI slop.
Could someone look at that contribution history and sanity-check me here? I don't want to escalate prematurely. Headbomb {t · c · p · b} 00:04, 4 August 2025 (UTC)
I looked at one of their earliest edits, to third culture kid. The doi for new reference Tan, Koh, & Lim 2021 goes to a different paper by different authors on a related topic, and the reference appears not to exist. New reference "The global nomad experience: Living in liminality" exists offline but is dated 2009 when the actual publication date appears to be 1999. New reference Doyen, Dhaene, et al 2016 is given with a doi that goes to an unrelated paper and has a title from a paper by different authors with a different publication year. New reference Lee & Bain 2007 has a doi that goes to an unrelated paper and does not appear to exist. New reference Cottrell 2002 duplicates an existing reference but with a different book title that does not appear to exist, different page numbers, and malformatted citation template. New reference Cariola 2020 has a doi that goes to an unrelated paper and does not appear to exist. At this point I gave up checking the rest, as I was already convinced that this is unchecked AI slop. —David Eppstein (talk) 00:47, 4 August 2025 (UTC)
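(A side note for anyone scripting this kind of triage: the "does not appear to exist" half of the check can be partly automated. The sketch below, which assumes the Python requests library and is purely illustrative rather than anything used here, asks doi.org whether a DOI is registered at all. It cannot catch the other failure mode described above, a real DOI that resolves to an unrelated paper; that still needs a human to read the landing page.)

import requests

def doi_is_registered(doi: str) -> bool:
    # doi.org answers a registered DOI with a redirect (3xx) to the publisher;
    # an unregistered DOI comes back as 404.
    resp = requests.head(f"https://doi.org/{doi}", allow_redirects=False, timeout=10)
    return 300 <= resp.status_code < 400

print(doi_is_registered("10.1000/182"))          # True: the DOI Handbook's own DOI
print(doi_is_registered("10.1234/made-up-doi"))  # a fabricated suffix should return False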
I reviewed the one article they've created, Midcoast Villager, and did not find any signs of LLM use via faulty references like above. However the History section was copied closely from provided sources, and so I have removed and tagged it for revdel. There is also a WP:CRYSTAL issue since the article relies heavily on a source that predates events that are asserted to have happened, I've elected to draftify it to allow for corrections before reintroduction into articlespace. fifteen thousand two hundred twenty four (talk) 01:14, 4 August 2025 (UTC)
I think this might be of interest to this WikiProject?
While checking if there was an article for Citizen developer, I found this draft that was declined at AfC for being LLM-generated. I think it's salvageable so I'm rewriting it. I'd appreciate some help with this! Rosaece ♡ talk ♡ contributions22:20, 8 August 2025 (UTC)
Hi -- Priyanshu.sage has made a lot of edits to various articles in 2024 that are almost certainly AI generated, based on this diff. A lot of these contributions have been revdelled so I can't check them all, but that in and of itself is a warning sign.
Kind of piggybacking off this, how would I do this in general for editors who have created dozens/hundreds of AI edits in the past? I am finding a great deal of users who appear to be serial adders of AI content, but who have been inactive for a year or so, so contacting them for confirmation is unlikely to work. User:Vallee is the most recent one whose (massive amounts of) edits I am working through -- see the userpage in this case.
I've been tagging these edits and adding a talk page message, but there are a lot of them to tag, and the pages and talk pages don't seem to be super active. I have not been deleting these sections since I have no real proof besides the userpage and AI writing tells.
Let me know if I should be doing something else. I am sorry for creating more work for people -- although arguably these editors are the ones who created the work and I am just flagging it. Gnomingstuff (talk) 20:26, 9 August 2025 (UTC)
Expectation setting: There's probably going to be a lot. The way I'm doing this is searching for combinations of AI tell phrases and then checking the sources/contribution history on the diffs. The current search I am working through has 260 results. And obviously a lot of these will be false positives or inconclusive, but that's just one search. Gnomingstuff (talk) 14:42, 10 August 2025 (UTC)
Possible new indicator of LLM usage? (broken markup)
A draft article that I nominated for G15 speedy deletion has a very strange markup feature in it. The draft, Draft:Aleftina Evdokimova, was obviously generated by ChatGPT because of the "oai_citation" and "utm_source=chatgpt.com" codes, but it also has this strange markup in it attached to every reference, like the other codes:
({"attribution":{"attributableIndex":"1009-1"}})
The four-digit index increases going down the page. Are there any editors that are able to tell what this is? It seems like a possible sign of LLM output, but I'm not so sure of it yet. SuperPianoMan9167 (talk) 22:55, 10 August 2025 (UTC)
I know Reddit isn't a reliable source since it is user-generated, but this post gives a pretty strong confirmation that this is another strange ChatGPT bug: [8] (Also, I realize now that this is just JSON.) SuperPianoMan9167 (talk) 23:00, 10 August 2025 (UTC)
I searched for it (as a "find this exact text" search) on Google and pretty much every result that has it also has Markdown and/or "oai_citation" in it. It definitely appears to be a ChatGPT quirk. SuperPianoMan9167 (talk) 23:08, 10 August 2025 (UTC)
Hi -- for that you'll want to include quotes in the search query, "complex and multifaceted." Skimming the ~100 search results from that, I don't see anything that immediately jumps out to investigate, but it's always good to have people looking for this stuff! Gnomingstuff (talk) 20:29, 11 August 2025 (UTC)
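(In case it helps anyone automate these sweeps: below is a minimal sketch of the same quoted-phrase search run through the standard MediaWiki search API rather than the search box. The phrase is just the example above, and every hit is only a lead; a human still needs to check the diff and the sources.)

import requests

params = {
    "action": "query",
    "list": "search",
    "srsearch": 'insource:"complex and multifaceted"',  # quoted, per the advice above
    "srnamespace": 0,   # article space only
    "srlimit": 50,
    "format": "json",
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params, timeout=30)
for hit in resp.json()["query"]["search"]:
    print(hit["title"])  # a lead to investigate, not proof of AI use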
Project design
I've been trying out a small revamp of the top menu's design at User:Chaotic Enby/AI Cleanup, to give it a more polished style. Beyond that, I've been wondering if it could be a good thing to work on a common design language to give the project's pages a cleaner look, if anyone is interested. It will probably just be a color palette and maybe a few templates and page layouts, but the current project pages are a bit of a mess and it could be really worth it to make them visually cleaner. Chaotic Enby (talk · contribs) 14:19, 12 August 2025 (UTC)
An onslaught of seeming AI-generated additions to species articles
Apologies if you see this elsewhere; I'm also crossposting it to species-related projects.
I've been tagging a huge amount of seemingly AI-generated additions to articles about species. Some are "sourced," some not. I suspect that they are due to AI tools that will "write" an article based on provided sources and/or search results provided in a prompt. What seems to happen is that the AI, unable to generate text on a topic, speculates on what may be "likely." At first I considered that maybe it's a copy-paste template because there are a few users who are prolific with these, or perhaps a sockpuppet situation, but I've noticed similar text pattern in other topics as well.
Some examples:
Diff 1: While specific distribution data for *Amethysa basalis* is limited, members of the genus are generally found in tropical and subtropical zones. This user has added many many edits like this (though they're not the only one). The asterisks indicate markdown formatting, a common AI tell.
Diff 2: Shell Characteristics: While specific morphological details are limited, as a member of the Modiolus genus, it likely.... A separate AI tell here is this puffery: "This inclusion highlights its relevance in studies of marine biodiversity in the South Atlantic region." Several drafts by the user have been declined for sources not matching text.
Diff 3: Although specific conservation assessments for Halystina globulus are not available, deep-sea species in general are considered.... Note also the "is essential" editorializing.
Diff 4: this remains unconfirmed without direct access to the original description and Specific details about its depth range or precise localities within the Philippine region are not well-documented in available literature, suggesting a need for further research. The "further research" editorializing is common. Note that this user's userpage also shows AI signs, like markdown link formatting.
Diff 5: Specific morphological details about C. bialata are limited in the provided sources.
Diff 6: While specific measurements are not widely detailed, it shares general characteristics with other species.... This user was blocked for ongoing LLM use. Note the plaintext "footnotes," also.
I could list a lot more but I really don't want to be here all day. Basically, we've been getting swamped with these edits for almost a year, it's worse than we thought, it shows no signs of stopping, and it is way too big for one person. And I'm not a biology expert by any means, so I am of limited help doing anything but finding this stuff and flagging it to experts.
Anyway, wanted to bring this to your attention, hopefully people have bandwidth to help take it on. Please tag me if you have questions or remarks, or else I won't see it (because I am busy excavating slop).
Thanks for bringing this up. I've also seen it in a few articles recently, and it is very good that you flagged it for attention. I'm guessing we should add Specific details are limited/not available to WP:AISIGNS, and maybe to Special:AbuseFilter/1325 (although the sentence structure might be a bit too variable for that). If the syntax is too vague for the edit filter, we could unironically train a (very small) language model to learn these sentence structures and run it on recent changes. That could possibly be a more flexible tool than an edit filter to look for "tells" of AI-generated content, assuming we train it on specific tells like this (rather than something like GPTZero, which compares AI-generated editorial prose with human-generated editorial prose and completely misses the baseline of Wikipedia's writing style). Chaotic Enby (talk · contribs) 19:32, 12 August 2025 (UTC)
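To make the middle ground between an edit filter and a trained model concrete, here is a rough sketch of the simplest possible version: a weighted list of tell phrases scored over added text. The phrases, weights, and threshold below are all made up for illustration; a real tool would need a curated phrase list (e.g. drawn from WP:AISIGNS) and should only ever queue edits for human review, never act on them automatically.

import re

# Hypothetical tells and weights, for illustration only.
TELLS = [
    (re.compile(r"specific (details|measurements|data) (about|for|on)?[^.]{0,60}\b(are|is) (limited|not (widely )?(available|detailed))", re.I), 3),
    (re.compile(r"\b(up to|as of) my last (training|knowledge) update\b", re.I), 5),
    (re.compile(r"\bI hope this helps\b", re.I), 5),
    (re.compile(r"\bit is (important|essential) to note\b", re.I), 1),
]

def tell_score(added_text: str) -> int:
    # Sum the weights of every tell phrase that appears in the added text.
    return sum(weight for pattern, weight in TELLS if pattern.search(added_text))

sample = "While specific details about its range are limited, it is important to note..."
if tell_score(sample) >= 3:   # arbitrary threshold
    print("queue for human review")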
Yep, that's one of the AI prose tells I've noted. There are a few more non-species examples at that link, like this ("provided search results") and this (contains a chatbot response)
Regarding the edit filter part, it's about spotting them, not blocking them, so it should be fine – especially since this is bad prose either way. Chaotic Enby (talk · contribs) 19:48, 12 August 2025 (UTC)
Another distinct pattern of possible chatbot output. Sorry for spam, I can take this to a separate page.
I recently spent many hours cleaning up this kind of stuff on mosquito species articles. The pattern I saw was that the AI generates citations to legitimate publications, but those publications don't contain the claims made in the AI text, which appears to take characteristics that are generally true of all mosquitos and phrase them as though they were specific distinguishing features of a specific species ("A. Mosquito is distinguished by its biting behavior, making it a nuisance to humans and pets."). -- LWGtalk19:38, 12 August 2025 (UTC)
The main page for the project seems to have a new logo (one resembling a brain), but I checked the wikitext and it still seems to be using the same file: File:WikiProject AI Cleanup.svg (the logo with a robot and a magnifying glass). The new logo even appears in old page revisions for some reason. I can't find the new logo image here or on Commons. What is going on? SuperPianoMan9167 (talk) 15:25, 10 August 2025 (UTC)
I thought it looked better, what do you think? The other icon is from Codex, and it represents bots in current and future Wikimedia UI production. So better not to overlap. waddie96 ★ (talk) 15:49, 10 August 2025 (UTC)
I like it as well, tho I feel like it might give the wrong vibes for some (if you look at it long enough it feels like it is encouraging Cyborg behavior, not necessarily stopping it -- Maybe we need a mop somewhere in the mix?) Sohom (talk) 16:30, 10 August 2025 (UTC)
None, I prefer the old robot. A brain is a symbol of intelligence, while the current state of "AI" is unintelligent predictive models. I don't think we should conflate the two and further feed into the misconception that these models are intelligent systems. fifteen thousand two hundred twenty four (talk) 21:15, 10 August 2025 (UTC)
My main issue with the new logo is that it doesn't convey the idea of cleanup, only the "AI" part, and that adding a magnifying glass on top would make it look more crowded. In terms of colors, going for a blue color scheme could lead to confusion with blue links, although having something a bit more vibrant than the current black-and-grey tones would be neat. Maybe purple/magenta? Chaotic Enby (talk · contribs) 21:25, 10 August 2025 (UTC)
The details in that image makes it a bit too much in my opinion, especially with the proposed logo above. Maybe a magnifying glass like before? (Even with a magnifying glass, we might need something like a color distinction between them to make it visually readable) Chaotic Enby (talk · contribs) 21:25, 10 August 2025 (UTC)
I'm thinking, if we decide on a color palette for the whole project, assuming the base color is used for the title text around the logo, then we could have the main part of the logo (either the brain or robot) use the highlight color, and the magnifying glass use the base color for contrast. Chaotic Enby (talk · contribs) 21:40, 10 August 2025 (UTC)
@Chaotic Enby Please be aware that I spent 10 minutes fixing up the file's licensing. It was about to be tagged for copyright infringement deletion. Please read Commons:Licensing before making any other derivative works of images licensed with free-use tags that still require attribution (as they are not public domain). It's tricky wording, and frustrating, I know. waddie96 ★ (talk) 13:26, 15 August 2025 (UTC)
Thanks. To clarify, I did not upload the file or write the license myself (it was done by @Queen of Hearts). Additionally, the changes you made to the license were incorrect. Both File:Codex icon robot.svg and File:Codex_icon_search.svg were licensed under CC BY-SA 4.0, and so was File:WikiProject AI Cleanup.svg, so there was no need to switch it to MIT. If the license on the original files was incorrect, please change it there instead of just making changes to derivative files. Finally, this is not how copyright infringement works. If the file has the wrong license, then it will usually be tagged with something like {{Wrong license}} and fixed. There is no speedy deletion criterion for "forgot to properly give attribution". The closest are F3 (for derivative works of non-free content, which is obviously not the case here), and F5 (if the content is missing a source entirely, and with a warning and a grace period of seven days). Chaotic Enby (talk · contribs) 13:50, 15 August 2025 (UTC)
@Chaotic Enby, @Queen of Hearts, technically speaking @Waddie96 is not wrong: the onwiki files are marked under the wrong license (not sure why), and the original files of the Codex icons are indeed under the MIT license as per the LICENSE file for the source code. However, I see this as a simple fixable mistake and not an issue to ask for a copyright infringement deletion over. That being said, @Waddie96, please mind your tone; at the moment you are coming off as condescending and combative in suggesting copyright deletion and implying an inability to understand copyright. Sohom (talk) 14:03, 15 August 2025 (UTC)
Thanks for the additional explanation. This is a bit of a confusing situation, as the README file indicates that the icons are under CC BY-SA 4.0. Should we conclude that they are automatically dual licensed? As I mentioned above, if that is the case, it could have been helpful to also make the change on the original icons to clarify the situation. Chaotic Enby (talk · contribs) 14:10, 15 August 2025 (UTC)
Hmm, I did some digging around, and found phab:T383077#10433947, I think dual licensing it on wiki is the best way forward (since the package containing the icons is MIT, but the icons are also under CC-BY-SA (fun and confusing)). There is part of me that is freaking out about TheDJ's last comment since I agree, that by not linking to the icons (as they are used in our interfaces) we are kinda-sorta violating CC-BY-SA, but that's for the Codex team to figure out :) Sohom (talk) 14:23, 15 August 2025 (UTC)
Hmmm, listen it wouldn't be the end times at all if we reverted back. I made a WP:BOLD decision. If anyone independent wants to close the discussion with the outcome when it's reached its end, I'm happy either way per WP:BRD. waddie96 ★ (talk)15:38, 11 August 2025 (UTC)
Prefer the old logo. Sorry - but the magnifying glass over a robot perfectly encapsulates the point of this Project, which is to scrutinise AI-generated text. qcne(talk)09:00, 11 August 2025 (UTC)
Emphatically prefer old logo. I won't repeat my last rant, but as others have said, we should absolutely not be giving in to identifying these speculative autocomplete technologies as intelligent, because that misunderstanding being made by other users is why this project has to exist in the first place.
Also, at an aesthetic level, I'm not a huge fan of this half-brain half-positronics design. Not to be rude, but does this logo describe linear algebra algorithms that emit text, or would it be a good Cyborg t-shirt? Altoids0 (talk) 19:32, 14 August 2025 (UTC)
You are invited to join the discussion at Talk:Mark Karpelès § AI-generated frog portrait, which is within the scope of this WikiProject. An editor has added an AI-generated cartoon portrait of a BLP, sourced from the website of the subject's current employer. I removed the image from the article citing WP:AIIMGBLP and WP:AIGI; the editor restored it and defended its inclusion on the basis that AI-generated images should be used when the subject is known to use them (i.e. not generated for the sake of having a photo in an article). Requesting input from uninvolved editors as to whether this constitutes a marginal case in which AI-generated imagery depicting BLPs is permissible. DefaultFree (talk) 04:01, 17 August 2025 (UTC)
Signs of AI writing but with a little more "je ne sais quoi"
Hi folks,
I am one of those folks who keeps wanting to participate in Wikipedia but I struggle to feel like I make meaningful contributions, stuff gets done wrong, I spiral and disappear. I'm not giving up though and I feel like I can genuinely help bring a French adaptation to this: https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing
Earlier this year there was a sockpuppet investigation into a couple of users. The investigation brought up some suspected AI use and at least one hallucination. Turns out that combined, these users have made hundreds of edits, most of which are to extremely high profile articles (up to WP:VITAL level 3), all of them seeming to be LLM generated. Some of them are article text, a lot are image captions.
I've gone through and done a quick scan of the most important-seeming edits, and have tagged a lot of articles as a result, but I haven't reviewed every single edit because there are just so many of them. So if anyone else has time to take a look feel free (the ones I have reviewed are mainly the large diffs). In some cases the edits are pretty small, but I feel like having (justified) AI tags at the top of major articles is maybe not the worst idea in the world for awareness raising.... Gnomingstuff (talk) 18:54, 13 August 2025 (UTC)
Because they still contain facts that need verification, and can contain hallucinations (the "especially in 1946" that is inserted out of nowhere). Here, with the first image, several factual assertions are made in a short space: the image is indeed the Teatro de los Insurgentes, that it is specifically the facade, and that the mural on the facade is in fact a visual history. In this case the additions do seem to be factually accurate, but any use of AI essentially poisons the whole well of the edits. The AI-generated template does have a parameter to restrict it to specific sections, but with images that's not all that simple to do -- IMO it's probably more disruptive to have a bunch of section tags than one article tag. Gnomingstuff (talk) 22:36, 13 August 2025 (UTC)
I agree. Any article that has been maliciously modified to become filled with unverified or dubious claims can reasonably get a banner, even if it's embarrassing for Wikipedia. How I even became aware of articles like Blues being contaminated was through your maintenance tag additions, as I ritualistically flip through the associated category.
If Gnoming's work is demonstrative of anything, it's of a need for a specific maintenance template for generated captions. Altoids0 (talk) 07:16, 16 August 2025 (UTC)
I find discussions like these very frustrating because they're often shooting the messenger. I (or anyone else adding templates) didn't suddenly put hundreds of new instances of LLM content into articles. They were there before. Now, they might actually get fixed instead of sitting around undetected for 5 or 10 or 19 years, getting cited in books, etc. It's especially frustrating when the edits were made by someone who already got blocked, for LLM use, yet no one took the time to go back and even tag (let alone fix) the contributions that they already decided were worth blocking over.
Any embarrassment to Wikipedia is a feature, not a bug. The AI slop will remain whether we spot it or not, so readers might as well know about it. Gnomingstuff (talk) 17:36, 17 August 2025 (UTC)
– Editor made aware of problem and everything reverted
If my suspicions are correct, the edits made by user @Ivanisrael06 (Special:Contributions/Ivanisrael06) seem to be, entirely or in large part, generated by AI. This is, to me, somewhat apparent in the wording and tone of their contributions. What is more concerning than that is the very odd page formatting and citation style employed in almost all of their submissions here. While these style issues in and of themselves merit page revisions in most cases, I would definitely appreciate second opinions on the potential use of an LLM. ElooB (talk) 18:05, 24 August 2025 (UTC)
Unusual citation style, plus the near-total absence of wikilinks in the text they add, to me are strongly suggestive of genAI. Worse, these weird references are to Wikipedia itself or to social media, see e.g. this diff. And some of the references are simply made-up, such as "Dhaka Tribune, 2025" in this diff, despite the absence of any Dhaka Tribune articles in the reference list. I think reversion of all these edits is in order. WeirdNAnnoyed (talk) 19:07, 24 August 2025 (UTC)
The offending edits are all rolled back by now. I guess that should settle that. Ivan, if you read this, please refer to Moxy's message on your talk page for your future contributions on Wikipedia. ElooB (talk) 19:28, 24 August 2025 (UTC)
This fall 2024 course seems to have outright encouraged students to use AI for their edits -- the students' userpages seem to have a lot of essays like this suggesting it's an actual assignment (and the page edits display the usual signs). I know AI edits have been an issue with student edits in general but this seems to be a much more centralized thing, so wanted to post it here in case other classes had something similar around this time.
Publifye AS (https://publifye.com/) is a self-publishing platform which makes extensive use of AI. This is explained in a disclaimer in the books I've viewed, though it is typically not part of the preview and is only noticeable if you search for "AI" within the book. Some of the authors are even listed as AI, eg: Corbin Shepherd, AI.
Search results for the authors are littered with links to vendors which don't work (eg), so I assume the likes of Amazon, Barnes & Noble, and Everand realised the works were AI generated and removed them.
I've removed what was left. My idea would be to create an edit filter for this; when it matches, it would display a warning to the user and tag the edit, so we can look at edits that have ignored the warning (similar to Special:AbuseFilter/869). Kovcszaln6 (talk) 13:48, 24 August 2025 (UTC)
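For what it's worth, the matching side of such a filter could be quite simple. Below is a rough sketch of the logic in Python, illustrative only: the real thing would be written in the AbuseFilter condition language on the filter itself (where the warn and tag actions are configured, as with 869), and the patterns here are just examples based on the names mentioned above.

import re

# Example patterns only: the publisher's name and one of its listed "AI" authors.
PUBLIFYE = re.compile(r"publifye|corbin shepherd,?\s*ai", re.I)

def added_text_cites_publifye(added_lines: str) -> bool:
    # True if the added wikitext mentions Publifye or one of its AI authors.
    return bool(PUBLIFYE.search(added_lines))

print(added_text_cites_publifye("<ref>''Some Title''. Publifye AS, 2024.</ref>"))  # True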
Thread has been archived without action. Most of their AI additions have been reverted, but it would be good if their additions could be combed over to make sure that none are left: [9]. Hemiauchenia (talk) 21:16, 27 August 2025 (UTC)
Come one, come all to a discussion about whether or not G15 should be expanded to include pages with emojis and, if so, how. Anecdotal experiences and opinions are welcome in the discussion section. GreenLipstickLesbian 💌🦋 13:37, 28 August 2025 (UTC)