Wikipedia:Bots/Requests for approval/Galobot

Galobot

[[User:Galobot|Galobot]]

{{Newbot|Galobot|}}

Operator: {{botop|Galobtter}}

Time filed: 07:12, Friday, August 3, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python (Pywikibot)

Source code available: [https://github.com/galobtter/galobot/blob/master/lint%20error%20fixer.py Here]

Function overview: Fix multiple unclosed formatting tags lint errors

Links to relevant discussions (where appropriate): Wikipedia:Village pump (technical)#Remex: Pages that used to look fine are now broken

Wikipedia:Bot requests#HTML errors on discussion pages

Edit period(s): One time run

Estimated number of pages affected: ~10000 based on [https://tools.wmflabs.org/fireflytools/linter/enwiki 34000 errors]

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Basically, replaces things like {{tag|tt|o}}…{{tag|tt|o}} with {{tag|tt|o}}…{{tag|tt|c}} as {{u|Jc86035}} suggested here. Tags fixed are {{tag|tt|o}}, {{tag|s|o}}, {{tag|u|o}}, {{tag|b|o}}, {{tag|i|o}}, {{tag|code|o}}, and {{tag|strike|o}}. Specifically:

  1. for each page that has multiple unclosed formatting tags, finds every multiple unclosed formatting tags error for that page
  2. uses the "location" output of Linter to narrow down where to fix the error in the page text
  3. searches for two instances of start tags of the erroneous tag
  4. if there are no closing tags or templates in between, it replaces the latter instance with a closing tag
  5. Update: the two deprecated tags are now handled a bit differently; {{tag|strike|o}} tags are replaced with {{tag|s|o}} tags, and {{tag|tt}} is replaced with {{t|mono}} only when the fix is for the "reviewer" text that accounts for 99%+ of the {{tag|tt|o}} fixes
  6. only makes an edit if it has fixed all multiple unclosed formatting tags errors.
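The steps above can be sketched roughly as follows. This is a minimal illustration and not the bot's actual source: the function names <code>fix_span</code> and <code>fix_page</code> are hypothetical, the Linter "location" offsets are assumed to be usable as character indices into the page text, and the check for "closing tags or templates in between" is reduced to a simple substring test.

```python
import re


def fix_span(text, start, end, tag):
    """Try to fix one reported error inside text[start:end].

    Returns the new page text, or None if no safe fix was found.
    (Hypothetical helper; offsets are assumed to be character indices.)
    """
    span = text[start:end]
    opener = re.compile(r"<%s\s*>" % re.escape(tag), re.IGNORECASE)
    matches = list(opener.finditer(span))
    # Step 3: two start tags of the erroneous tag are needed.
    if len(matches) < 2:
        return None
    first, second = matches[0], matches[1]
    between = span[first.end():second.start()]
    # Step 4: bail out if a closing tag or a template sits in between.
    if "</" in between or "{{" in between or "}}" in between:
        return None
    fixed = span[:second.start()] + "</%s>" % tag + span[second.end():]
    return text[:start] + fixed + text[end:]


def fix_page(text, errors):
    """errors is a list of (start, end, tag) tuples.

    All-or-nothing, as in step 6: returns None unless every error is fixed.
    """
    # Work from the end of the page so that earlier offsets stay valid
    # (this assumes the reported spans do not overlap).
    for start, end, tag in sorted(errors, reverse=True):
        new_text = fix_span(text, start, end, tag)
        if new_text is None:
            return None
        text = new_text
    return text
```

For example, <code>fix_page("a <tt>x<tt> b", [(2, 11, "tt")])</code> would return the text with the second start tag turned into a closing tag, while a page where any one error cannot be fixed yields <code>None</code> and would be left unedited.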

I know that {{u|Ahecht}}'s {{u|Ahechtbot}} is having a BRFA partly for doing the same for just {{tag|s|o}}; however, this fixes all non-nesting tags with such errors, and as it only edits when all errors are fixed there shouldn't be any double-watchlist hits from both bots or anything like that. Also, my bot account was blocked by {{u|Oshwah}} for having "bot" in its name; probably should be unblocked, at least now :)

=Discussion=

:Unblocked as there is a BRFA open on this and it is not editing outside the bot policy. — xaosflux Talk 13:03, 3 August 2018 (UTC)

  • Regarding "only makes an edit if it has fixed all multiple unclosed formatting tags errors" - is this for the entire page, how will you determine this? — xaosflux Talk 13:08, 3 August 2018 (UTC)
  • :Yes, this is for the entire page. All the multiple unclosed formatting tags errors of the page are gotten through an API call; if for any of the errors on a page it cannot make a fix, it doesn't edit the page (there is no need to make more than one fix per error given by Linter). This filtering decreases the number of pages edited by ~5%. Galobtter (pingó mió) 13:24, 3 August 2018 (UTC)
  • From an "error-handling" perspective, how likely is it that there will be nested instances of these calls? I know it's unlikely that there will be something like This is italics when we do this (which shows up as This is italics when we do this), but it's very possible you could have someone saying "To highlight code, use ", which has two {{tag|code|open}} calls in it. Primefac (talk) 16:12, 3 August 2018 (UTC)
  • :I mean, I know that the second example I've given doesn't actually throw any errors, but if there was another error on the page, would it correct it? Primefac (talk) 16:13, 3 August 2018 (UTC)
  • ::Interesting edge case, but no :). Linter gives the location of the error (from the start tag to the incorrect end tag (addendum: or sometimes till the end of the line)). The script only tries fixing the specific tag that Linter says is problematic within that particular location. (see [https://en.wikipedia.org/w/api.php?action=query&list=linterrors&lntcategories=multiple-unclosed-formatting-tags&lntpageid=50283891 here] for example of API output) So another error elsewhere would not cause "fixing" of that.

:::(Additional thoughts that may not make sense and are of minor import: The only way that would even be close to happening is if the page had two unclosed formatting lint errors as in [https://en.wikipedia.org/w/index.php?title=User:Galobtter/sandbox2&oldid=853279399 here]. Linter sometimes gives the location as from the first erroneous tag to the very last one, instead of stopping at the second paired erroneous tag (but [https://en.wikipedia.org/w/index.php?title=User:Galobtter/sandbox2&action=info not in the case I've made though]), and thus the whole of the text would be in the location of one of the reported errors, and thus the text to be fixed would include the example you've given, and the program would be looking to fix an error within that. But the program would only fix the first error there and not "fix" the next line; and the location of the second error would only contain Bar) Galobtter (pingó mió) 16:47, 3 August 2018 (UTC)

::::Cool. I've been a little more out-of-the-loop on the Linter stuff recently, so I wasn't sure how the errors were being handled these days. Primefac (talk) 16:29, 4 August 2018 (UTC)
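:::::For readers unfamiliar with the Linter output discussed above, here is a small sketch of how the per-error location and tag name might be pulled out of a <code>list=linterrors</code> response. The field names (<code>location</code>, <code>params</code>, <code>name</code>, <code>category</code>) follow my understanding of that API and should be checked against a live response; the sample data below is made up, and <code>extract_errors</code> is a hypothetical helper rather than anything from the bot's source.

```python
CATEGORY = "multiple-unclosed-formatting-tags"


def extract_errors(response):
    """Return (start, end, tag) tuples from a parsed linterrors response.

    Assumes each entry carries a two-element "location" list and the
    offending tag under params["name"]; entries lacking either are skipped.
    """
    errors = []
    for entry in response.get("query", {}).get("linterrors", []):
        if entry.get("category") != CATEGORY:
            continue
        location = entry.get("location")
        tag = entry.get("params", {}).get("name")
        if not location or len(location) != 2 or not tag:
            continue
        errors.append((location[0], location[1], tag))
    return errors


# Made-up sample shaped like the assumed API output (not real data).
sample = {
    "query": {
        "linterrors": [
            {
                "pageid": 1,
                "title": "User talk:Example",
                "category": CATEGORY,
                "location": [10, 30],
                "params": {"name": "tt"},
            },
            {
                "pageid": 1,
                "title": "User talk:Example",
                "category": "obsolete-tag",
                "location": [40, 50],
                "params": {"name": "tt"},
            },
        ]
    }
}

errors = extract_errors(sample)
```

Only the first sample entry is kept, since the second belongs to a different lint category; a fixer would then look for the named tag only inside each returned span.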

  • I've added the source code, nothing too much to it Galobtter (pingó mió) 21:46, 7 August 2018 (UTC)
  • {{t1|BAG assistance needed}} Galobtter (pingó mió) 15:15, 13 August 2018 (UTC)
  • {{ping|Galobtter}} Wouldn't it be better to change tt tags to {{tl|tt}} or to {{tag|kbd}} (and similarly for strike tags), since they're now deprecated? Jc86035 (talk) 07:53, 14 August 2018 (UTC)

::{{done}}, thanks. {{tag|tt}} are instead replaced with {{t|mono|..}} if there are no pipes in between to muck things up. {{tag|strike|open}} are replaced with {{tag|s|open}} Galobtter (pingó mió) 09:44, 14 August 2018 (UTC)

:::{{ping|Galobtter}} You should also check for curly brackets, just in case someone was trying to type <tt>}}</tt> and accidentally did <tt>}}<tt> instead. I also created a pull request to explicitly call out the first parameter as "1=" and to add "|needs_review=yes". --Ahecht (TALK PAGE) 18:18, 17 August 2018 (UTC)

::::99%+ of {{tag|tt|open}} fixes are of reviewer from an old version of {{t|Pending changes reviewer granted}} so I'm thinking of maybe only changing to {{t|mono}} then (to avoid any errors); or at least in that case the fixes won't need review ({{t|mono}} is what is used now for those notices) Galobtter (pingó mió) 18:28, 17 August 2018 (UTC)

::::Yeah, code updated to only replace with mono if it is reviewer Galobtter (pingó mió) 10:36, 18 August 2018 (UTC)
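:::::The deprecated-tag handling described in this thread could look something like the sketch below. It is an illustration under assumptions, not the bot's code: <code>build_replacement</code> is a hypothetical name, the "reviewer" case is assumed to be an exact match on the enclosed text, and the <code>mono</code> call is written with an unnamed first parameter.

```python
def build_replacement(tag, inner):
    """Return the corrected wikitext for <tag>inner<tag>.

    - strike is deprecated, so it becomes a properly closed <s> pair.
    - tt becomes {{mono|...}} only for the common "reviewer" case
      (assumed here to be an exact match); otherwise it is just closed.
    - every other tag is simply closed.
    """
    tag = tag.lower()
    if tag == "strike":
        return "<s>" + inner + "</s>"
    if tag == "tt" and inner.strip() == "reviewer":
        return "{{mono|" + inner.strip() + "}}"
    return "<" + tag + ">" + inner + "</" + tag + ">"
```

Restricting the template swap to one known string sidesteps the pipe and curly-bracket problems raised above, since arbitrary text is never placed inside a template call.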

  • {{ping|Xaosflux}} tis been nearly a month, I have dealt with any issues brought up; would appreciate if this could move forward. Thanks. Galobtter (pingó mió) 13:43, 1 September 2018 (UTC)

:*{{BotTrial|edits=150}} SQLQuery me! 23:56, 7 September 2018 (UTC)

::*{{u|SQL}}, thanks, {{BotTrialComplete}} [https://en.wikipedia.org/w/index.php?title=Special:Contributions/Galobot&offset=20180908131228&target=Galobot&limit=200 Edits]. I made the bot skip user talk base pages for the trial per comments by Xaosflux on the Ahechtbot BRFA regarding creating new messages alerts. A large portion of the edits were of changing the {{tag|tt|open}} tag; here are sample edits of each of the tags: [https://en.wikipedia.org/w/index.php?title=User_talk:Eugen_Simion_14/Archive_1&diff=prev&oldid=858606480 tt] ([https://en.wikipedia.org/w/index.php?title=User_talk:Slakr/Archive_2&diff=prev&oldid=858611429 tt] fix that isn't of reviewer), [https://en.wikipedia.org/w/index.php?title=User_talk:SarahStierch/Archive_34&diff=prev&oldid=858606504 s], [https://en.wikipedia.org/w/index.php?title=User_talk:Moonriddengirl/Archive_54&diff=prev&oldid=858606518 code], [https://en.wikipedia.org/w/index.php?title=User:DexDroid29&diff=prev&oldid=858606443 b], [https://en.wikipedia.org/w/index.php?title=Talk:Malaysia_Airlines_Flight_17/Archive_9&diff=prev&oldid=858606392 i], [https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Requests_for_comment/Archive_14&diff=prev&oldid=858607731 u], [https://en.wikipedia.org/w/index.php?title=Wikipedia:Categories_for_discussion/Log/2012_November_22&diff=prev&oldid=858607848 strike]. There was one [https://en.wikipedia.org/w/index.php?title=User_talk:Doc_James/Archive_91&diff=858608613&oldid=785108170 error], but I spotted it and tweaked the bot code (turns out the location given by linter is sometimes 1 off, and code was tweaked to account for that; it now skips that page). Other than that I was checking every edit and there were no issues I could spot. Galobtter (pingó mió) 12:49, 8 September 2018 (UTC)

::*{{ping|SQL}} Galobtter (pingó mió) 14:03, 7 October 2018 (UTC)

::{{t1|BAGAssistanceNeeded}} It's been over a month since the trial was completed, and a week since Galobtter nudged {{u|SQL}}. --Ahecht (TALK PAGE) 02:54, 14 October 2018 (UTC)

:::I apologize for the delay. I've been very busy off-wiki. I don't see any issues with the trial edits, and there has been more than ample time for anyone else to comment on this task. {{BotApproved}} SQLQuery me! 03:40, 14 October 2018 (UTC)

:The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.