Wikipedia:Bots/Requests for approval/Lightbot

Lightbot 1

[[User:Lightbot|Lightbot]]

{{Newbot|Lightbot}}

Operator: Lightmouse (talk)

Automatic or Manually Assisted: Manually assisted

Programming Language(s): Monobook or AWB

Function Summary: Janitorial edits mainly to units and dates.

Edit period(s) (e.g. Continuous, daily, one time run): Continuous

Already has a bot flag (Y/N): No.

Function Details: Janitorial edits mainly to units and dates. Examples include:

  • Changing '|sqm|', '|cum|' and '|knot|' to '|m2|', '|m3|' and '|kn|' when in the convert template (no visible effect to the reader but rationalises the template). Very low false positive rate.
  • Fixing malformed date links that break autoformatting, e.g. a linked 'November 5th' (should be a linked 'November 5'). Very low false positive rate.
  • Fixing dates that are damaged by autoformatting, such as date ranges, e.g. a range with linked parts like 1 to 4 May should be simply the unlinked '1 to 4 May' (to stop autoformatting converting it to "1 to May 4"). Low false positive rate.
  • Unlinking date fragments such as links to solitary months (February), solitary days of the week (Tuesday), digits (16). Some false positives possible but I know some of the common ones and will check by eye when doing these.
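
The first two bullets can be illustrated with a minimal Python sketch. It is not the script actually used by Lightmouse (the request names Monobook or AWB); the function names, the restriction to convert calls without nested templates, and the assumed link form `[[November 5th]]` are all illustrative assumptions.

```python
import re

# Unit codes taken from the first bullet: old code -> preferred code.
UNIT_CODES = {"sqm": "m2", "cum": "m3", "knot": "kn"}

# Assumption: only simple {{convert|...}} calls without nested templates are handled.
CONVERT_RE = re.compile(r"\{\{\s*convert\s*\|[^{}]*\}\}", re.IGNORECASE)

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Assumption: the damaged link looks like [[November 5th]].
ORDINAL_LINK_RE = re.compile(
    r"\[\[(" + MONTHS + r") (\d{1,2})(?:st|nd|rd|th)\]\]"
)


def fix_convert_units(wikitext: str) -> str:
    """Replace old unit codes, but only inside convert template calls."""
    def fix_template(match: re.Match) -> str:
        # Strip the surrounding braces and examine each parameter separately.
        parts = match.group(0)[2:-2].split("|")
        parts = [UNIT_CODES.get(p.strip(), p) for p in parts]
        return "{{" + "|".join(parts) + "}}"

    return CONVERT_RE.sub(fix_template, wikitext)


def fix_ordinal_date_links(wikitext: str) -> str:
    """Drop the ordinal suffix inside a month-day link."""
    return ORDINAL_LINK_RE.sub(r"[[\1 \2]]", wikitext)
```

Because the unit replacement only touches whole parameters inside a convert call, a bare "sqm" elsewhere in the article text is left alone, which is consistent with the very low false positive rate claimed for that task.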

I have done thousands of script assisted edits of this kind as Lightmouse. Low error rate tasks will be transferred to Lightbot. See contributions of Lightbot.

= Discussion =

Seems fine, assuming you check manually for false positives on the last bullet point. --Apoc2400 (talk) 22:00, 25 May 2008 (UTC)

  • You say "Very low false positive rate." a few times. Can you give examples of these false positives? dihydrogen monoxide (H2O) 09:35, 26 May 2008 (UTC)

:Actually, I say 'Very low' for the first two bullets. I say plain 'low' for the third bullet. The fourth bullet has different wording. I base my estimates on thousands of edits as Lightmouse. I will use low error rate parts of the same script.

:*"Changing '|sqm|'": I have yet to detect or imagine a false positive scenario with the proposed regex. It would be naive to predict zero. So I hypothesised 'very low'.

:* "Date links that damage autoformatting": I have yet to detect or imagine a false positive scenario with the proposed regex. It would be naive to predict zero. So I hypothesised 'very low'.

:* "Fixing dates that are damaged by autoformatting": This is a more difficult regex problem. With range examples, it is easy to address the first half of a range (or just bad formatting), such as a linked '1'. It is more difficult to correctly address the second half of a range, such as a linked '4 May'. That is where the theoretical possibility of false positives exists, i.e. I want to delink '4 May' in a date range but not otherwise.

:* Unlinking date fragments. For example, false positives for delinking day names occur in references to calendars and gods, i.e. I want to delink 'Thursday' when it is just the day that a TV show airs but not when referring to the god 'Thor'. Of the four bullet points here, this one needs the most care. I have done thousands of these and I know what to check for. I would be happy to do test runs. I hope that helps. Lightmouse (talk) 11:13, 26 May 2008 (UTC)
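
The two harder cases described above can be sketched in Python as follows. This is an illustrative sketch only, not the proposed regex: it assumes the range is written with the word "to" and the wikilink forms `[[1]]` and `[[4 May]]`, and the caution-word list and context window are hypothetical stand-ins for the check by eye.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Delink '4 May' only when it is the second half of a range such as
# "[[1]] to [[4 May]]" or "1 to [[4 May]]"; a standalone [[4 May]] is kept.
RANGE_RE = re.compile(
    r"(?:\[\[(\d{1,2})\]\]|\b(\d{1,2}))(\s+to\s+)"
    r"\[\[(\d{1,2} (?:" + MONTHS + r"))\]\]"
)

DAY_LINK_RE = re.compile(
    r"\[\[(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\]\]"
)

# Hypothetical words suggesting the link is about the day itself.
CAUTION_WORDS = ("god", "named after", "calendar", "etymology")


def delink_date_range(text: str) -> str:
    """Remove links from both halves of a day-to-day-month range."""
    def repl(m: re.Match) -> str:
        first = m.group(1) or m.group(2)
        return first + m.group(3) + m.group(4)

    return RANGE_RE.sub(repl, text)


def delink_day_names(text: str, window: int = 60):
    """Delink day names; keep and report links whose context looks risky."""
    flagged = []

    def repl(m: re.Match) -> str:
        start = max(0, m.start() - window)
        context = text[start:m.end() + window].lower()
        if any(word in context for word in CAUTION_WORDS):
            flagged.append(m.group(0))  # leave for manual review
            return m.group(0)
        return m.group(1)

    return DAY_LINK_RE.sub(repl, text), flagged
```

Returning the flagged links rather than editing them mirrors the manually assisted approach described in the request: doubtful cases are left for a human to decide.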

While you're editing units, I wonder whether you would be able to implement a procedure that places a non-breaking space between any digit and an ensuing unit, as per the manual of style.

I think that a regexp as simple as \d\s\w would suffice; even if false positives did occur, the change in space style would cause no inconvenience to readers. The scope could be extended with a regexp such as (\s\d+)(\w*) (replaced by $1 $2), but I wonder whether that would produce more false positives.

Cheers, Smith609 Talk 13:53, 26 May 2008 (UTC)

:I am not convinced that non-breaking spaces are a net benefit once the upsides and downsides are weighed. So I do not choose to write, check, debug and maintain regex for them. I know that mine is a minority view. However, please note that I use the convert template. That template includes non-breaking spaces as per the MOS. The net effect of any edit that adds the convert template is to give you what you want. Lightmouse (talk) 17:32, 26 May 2008 (UTC)

I think there would be many false positives when a digit is used inside a name or in codes of various kinds. --Apoc2400 (talk) 19:42, 26 May 2008 (UTC)
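
For illustration, one reading of the simple digit-space-letter suggestion, and the kind of false positive raised above, can be sketched in Python. The pattern is an assumption that narrows \d\s\w to a literal space followed by a letter; the function name is hypothetical and this is not part of the bot request.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Assumption: a digit, one ordinary space, then a letter marks "number + unit".
NUMBER_UNIT_RE = re.compile(r"(\d) (?=[A-Za-z])")


def add_nbsp(text: str) -> str:
    """Replace the space between a digit and a following word with a non-breaking space."""
    return NUMBER_UNIT_RE.sub(r"\1" + NBSP, text)


# Intended case: the space between number and unit becomes non-breaking.
print(repr(add_nbsp("a 5 km walk")))

# Name containing a number: the pattern fires here too, although
# "users" is not a unit. The text still reads the same to the reader.
print(repr(add_nbsp("Windows 95 users")))
```

The second call shows that the pattern cannot tell a unit from any other word that follows a number, which is the concern about digits in names and codes.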

:Please set Smith609's non-breaking space question and his/her proposed code to one side. It is not part of my request for bot approval. Lightmouse (talk) 19:54, 26 May 2008 (UTC)

Okay, thanks for considering it, and sorry for sidetracking discussion! Smith609 Talk 08:18, 27 May 2008 (UTC)

:Sounds good to me. A bot to clean up {{tlf|convert}} transclusions would be useful. JIMp talk·cont 20:27, 29 May 2008 (UTC)

Any news on this? Lightmouse (talk) 21:06, 4 June 2008 (UTC)

:Let's try a {{BotTrial|edits=100}} to see how it works. MBisanz talk 21:18, 4 June 2008 (UTC)

Trial edits complete. Lightmouse (talk) 09:55, 5 June 2008 (UTC)

:{{tl|BAGAssistanceNeeded}} I've taken a look at 1/4 of the edits and see no mistakes; I propose approval. BJTalk 10:03, 5 June 2008 (UTC)

::{{BotApproved}} per Bjweeks, who should be in the BAG :P -- Cobi(t|c|b) 10:04, 5 June 2008 (UTC)

:The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.