
User talk:ClueBot Commons/Archives/2025/May


Unexpected behaviour in my talk page archives

I use Cluebot III to archive my talk page. I recently changed the archive size: I set it to 75k when I created my account and never really thought about it, but looking at it now it seemed a bit small, so I changed it to 150k. What I didn't expect was that Cluebot would now archive discussions to the first archive[1], even though there are later archives. I assume that, if left alone, it will continue this behaviour until all the preexisting archives reach the new size. Is this a known issue? -- LCU ActivelyDisinterested «@» °∆t° 12:37, 1 May 2025 (UTC)

This is a consequence of how the bot works. I'm not sure just how "intentional" it is (it's just kind of doing as it's told), but you can manually solve the issue for your talk page by specifying a numberstart so that ClueBot will only start archiving from the archive you choose instead of appending to old ones. Aidan9382 (talk) 12:53, 1 May 2025 (UTC)
Thanks Aidan9382, I've added numberstart. -- LCU ActivelyDisinterested «@» °∆t° 14:41, 1 May 2025 (UTC)
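
The behaviour discussed in this thread can be illustrated with a simplified model. The sketch below is not ClueBot III's actual code: it assumes the bot counts upward from numberstart and appends to the first archive that is still under the size limit. The function name `pick_archive` and the archive sizes are hypothetical.

```python
def pick_archive(sizes_kb, max_size_kb, numberstart=1):
    """Return the first archive number, counting up from numberstart,
    that is still below max_size_kb (assumed selection rule, for illustration).

    sizes_kb maps archive number -> current size in KB; numbers that are
    missing are treated as empty, not-yet-created archives.
    """
    n = numberstart
    while sizes_kb.get(n, 0) >= max_size_kb:
        n += 1
    return n


# Three existing archives filled under the old 75k limit (hypothetical sizes).
sizes = {1: 75, 2: 75, 3: 40}

old_limit = pick_archive(sizes, max_size_kb=75)    # archives 1 and 2 are full
new_limit = pick_archive(sizes, max_size_kb=150)   # archive 1 has room again
with_start = pick_archive(sizes, max_size_kb=150, numberstart=3)

print(old_limit, new_limit, with_start)  # 3 1 3
```

Under this assumed rule, raising the limit makes every older archive eligible again, and setting numberstart skips them.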

ClueBot added to archive #1, instead of new archive #15

I had 14 (fourteen) archive files and enabled ClueBot. When it did its first archive action, it appended to the first (oldest) archive rather than to #14. That is very wrong. I see user ActivelyDisinterested had the same problem (above). Two suggestions:

  1. The issue & the workaround should be more prominently mentioned (as a whole bullet item) in the FAQ (?? which is dedicated to vandalism, not archiving?)
  2. The ClueBot software should be fixed so the tool never appends into an "old" archive: it should always append into the archive with the largest number.

Noleander (talk) 14:03, 7 May 2025 (UTC)

My reasons for the edits

Hi! My reason for the edits to List of Looney Tunes video games was to cut down on repeated links to the same articles, as I feel a given article should be linked only once per section of the article and not throughout the entire section. 2603:8001:8403:7188:3D0A:B5A6:F577:3EF6 (talk) 22:08, 6 May 2025 (UTC)

@2603:8001:8403:7188:3D0A:B5A6:F577:3EF6: It really helps to use the Edit Summary if you don't want the bot thinking you are vandalizing - RichT|C|E-Mail 04:37, 7 May 2025 (UTC)
@Rich Smith, I just wanted to let you know that you can't ping IP addresses, see WP:PINGIP. You should leave WP:Talkback messages on IP talk pages instead. Justjourney (talk | contribs) 05:38, 11 May 2025 (UTC)

Should I use Edit Summary to prevent the false positives?

My recent edits were reverted by ClueBot NG, although they added information rather than vandalizing. Is it better for me to use the Edit Summary feature to prevent this? Upset New Bird (talk) 02:15, 10 May 2025 (UTC)

It's always good to leave an edit summary. See this Help page. -- R. S. Shaw (talk) 03:51, 10 May 2025 (UTC)
@Upset New Bird You can also report the false positive, if you haven't already. Justjourney (talk | contribs) 05:36, 11 May 2025 (UTC)

Is ClueBot even needed?

99% of the bad stuff is caught with the help of the "problematic" scoring given to recent changes patrollers. There is usually more than one person patrolling, and the harder stuff they miss, ClueBot (and other bots) won't be able to help with. Plus, I've seen it get things wrong many times too. JamesEMonroe (talk) 11:58, 11 May 2025 (UTC)

When it comes to countervandalism, all the help that is available is welcome. Defining ClueBot NG's necessity through what it does and doesn't catch, or what it mistakes for vandalism, overshadows the fact that any help with reverting vandalism is beneficial to Wikipedia. Likewise, no editor is indispensable on their own, but the wiki certainly benefits from their positive contributions. As such, asking if an editor or a bot is "necessary" or "needed" is sort of missing the point: Wikipedia is not a project of bean-counters seeking to find efficiencies in processes by trimming the fat; instead, every positive contribution, while insignificant on its own, is a piece of the puzzle that helps to build the project as a whole. -- k6ka 🍁 (Talk · Contributions) 19:20, 11 May 2025 (UTC)

Cluebot III: ideas to improve the new archivist experience

Thank you for Cluebot III, which I just successfully used to set up archiving for Talk:Kat Abughazaleh! As I was learning how to use it, I had a few thoughts on what would have made it easier to start using it.

I really appreciate the documentation and easy "getting started" templates I could copy and paste. But, once I added the templates to the talk page (including configuration to "archivenow" a few "resolved" topics), I wasn't sure how long I should expect to wait before archiving would begin.

So I have a few questions/ideas:

  • It would be great to have some kind of an easy-to-use web-based tool to test: "will this configuration successfully do what I want?" Like: "given this set of arguments to the template, and this talk page, what will happen when Cluebot III runs?" I'm imagining something like RegExr or shellcheck or connected to Template sandbox and test cases.
  • I checked the bot's source code, and recent comments by maintainers, to update the documentation with expectations on how long to wait: "Cluebot III runs every 6 hours, on the Wikimedia Toolforge infrastructure. After you initially set up a page with the appropriate templates to invoke Cluebot III, it may take 24 or 48 hours for Cluebot III to execute archiving on that page for the first time." Is that accurate?
  • This is a more minor idea, but: when figuring out "is the bot in the middle of a run?", a user might try to work out what sequence ClueBot III uses to process pages. But, if I read https://www.mediawiki.org/wiki/API:Query correctly, the sequence is not really predictable, because Cluebot gets the list of pages from an API call that doesn't guarantee any particular ordering. If that's right and you can confirm it, I'll update the relevant documentation.

I noticed @NaomiAmethyst has recently taken a fresh look at the code so I hope these ideas are helpful now! Sumana Harihareswara 15:47, 13 May 2025 (UTC)

Improving ClueBot NG's algorithm

I'm looking into User:Cluebot NG#Vandalism Detection Algorithm, and this is my understanding of the algorithm:

1. For each word and pair of adjacent words that was added in the edit, add its score (which is determined from training data) to a counter.

2. Compute a few other statistics, such as length of text added, etc. and normalize them to prepare as inputs to the neural network.

3. Run neural network and get the score.
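
The three steps above can be sketched roughly as follows. This is only an illustration of the description in this comment, not ClueBot NG's code: the score tables, the scaling caps, the weights, and the single logistic unit standing in for the neural network are all made-up assumptions.

```python
import math

# Hypothetical per-word and per-pair scores; per the description above,
# real values would be learned from training data.
WORD_SCORES = {"stupid": 0.9, "the": 0.0, "is": 0.0}
PAIR_SCORES = {("is", "stupid"): 0.8}


def phrase_score(added_words):
    """Step 1: sum the scores of each added word and each adjacent pair."""
    total = sum(WORD_SCORES.get(w, 0.0) for w in added_words)
    pairs = zip(added_words, added_words[1:])
    total += sum(PAIR_SCORES.get(p, 0.0) for p in pairs)
    return total


def features(added_text):
    """Step 2: scale each statistic into [0, 1] (caps are assumptions)."""
    words = added_text.lower().split()
    return [
        min(phrase_score(words) / 5.0, 1.0),   # word and pair score
        min(len(added_text) / 1000.0, 1.0),    # length of text added
    ]


def network(inputs, weights=(4.0, -1.0), bias=-1.0):
    """Step 3 stand-in: one logistic unit instead of a trained network."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-z))


vandal = network(features("the article is stupid"))
benign = network(features("the article is about a river"))
print(vandal > benign)  # True
```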

Clearly, the algorithm works just fine (just look at ClueBot NG's contributions page). However, there are some areas that could still be improved. For example, the window size for the Bayesian classifiers is just 2, meaning a vandalism edit with a phrase of 3 or more words (or with extra words interspersed) might get ignored. In fact, it may be better to use something like a Transformer (deep learning architecture) to capture the meaning of the edit more accurately.
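
The window-size concern can be seen in a small sketch, assuming features are built only from adjacent word pairs as described above. The helper name `adjacent_pairs` and the example phrases are hypothetical.

```python
def adjacent_pairs(words):
    """Return the set of adjacent word pairs (a window of size 2)."""
    return set(zip(words, words[1:]))


# Pairs a classifier might have learned from a known bad phrase.
learned = adjacent_pairs("you are dumb".split())

# The same insult with filler words interspersed.
padded = adjacent_pairs("you are so very dumb".split())

# Only ('you', 'are') survives; the telling pair ('are', 'dumb') is gone.
print(learned & padded)
```

A wider window, or a model that reads the whole added text, would not lose the connection between the words this way.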

As far as I know, the principal maintainer of the bot (User:Crispy1989) has been inactive since 2011. Also pinging User:DamianZaremba since he seems to be active on the GitHub repo. If I could, I would be excited to help improve the bot. Sungodtemple (talk contribs) 01:04, 14 May 2025 (UTC)

Also pinging NaomiAmethyst since she seems to be active. Sungodtemple (talk contribs) 01:55, 21 May 2025 (UTC)