Wikipedia:Bots/Requests for approval/BHGbot 7

BHGbot 7

[[User:BHGbot|BHGbot 7]]

{{BRFA help}}

{{Newbot|BHGbot|7}}

Operator: {{botop|BrownHairedGirl}}

Time filed: 15:10, Tuesday, July 28, 2020 (UTC)

Function overview: Mass create {{tl|Category redirect}}s to resolve the WP:ENGVAR variations in category names using the word "organisation(s)" or "organization(s)".
e.g. if we have a Category:Anti-Foobar organizations, then the page Category:Anti-Foobar organisations would be created with the content {{nowrap|{{Category redirect|Anti-Foobar organizations|bot=BHGbot}}}}

Automatic, Supervised, or Manual: Automatic

Programming language(s): Bash and AutoWikiBrowser

Source code available: Yes. There are two components:

:# Wikipedia:Bots/Requests for approval/BHGbot 7/Make-BHGbot7-edit-list.sh

:#Wikipedia:Bots/Requests for approval/BHGbot 7/BHGbot-7-AWB-module

Links to relevant discussions (where appropriate): WT:WikiProject Categories#Organi%5BSZ%5Dations_category_redirects ([https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:WikiProject_Categories&oldid=969970475#Organi[SZ]ations_category_redirects permalink], though discussion is ongoing). This discussion was notified to WP:VPP[https://en.wikipedia.org/w/index.php?title=Wikipedia%3AVillage_pump_%28policy%29&type=revision&diff=969454151&oldid=969329313] and WP:VPR[https://en.wikipedia.org/w/index.php?title=Wikipedia%3AVillage_pump_%28proposals%29&type=revision&diff=969598919&oldid=969595926].
Previous related discussion: WP:Bots/Requests for approval/BHGbot 3 (a similar proposal in 2017, which ran into the sands due to lack of prior consensus. My bad)

Edit period(s): Initial run to handle the backlog. Then a followup every few months.

Estimated number of pages affected: ~12,500 in the initial run.

Namespace(s): Category

Exclusion compliant (Yes/No): Yes

Function details: This task supports MOS:COMMONALITY by resolving the s/z WP:ENGVAR variation in the spelling of "organisation"/"organization", by creating a soft {{tl|category redirect}} to the title which is in use. This corresponds with the MOS:COMMONALITY guideline to create such redirects in article space.

:The word "organisation"/"organization" is one of the most common ENGVAR variants in category titles, and the current lack of redirects is a long-standing nuisance for both readers and editors.

:The bot works in three stages:

:# A set of quarry queries to generate lists of pages

:# A bash script to process these lists and generate a list of category redirect titles to be created

:# An AWB run to create the category redirect pages

:; 1. Get lists

::The first part of the bot is three quarry queries:

::* quarry:query/46899: gets a list of non-redirect category pages whose title matches \b[Oo]rgani[sz]ations?\b and which do not transclude {{tl|Category redirect}} or {{tl|Category disambiguation}}

::* quarry:query/46999: gets a list of all pages in the category namespace

::* quarry:query/47001: gets a list of all pages in the main (article) namespace
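::The title pattern used by the first query can be illustrated with a minimal Python sketch. This is not the quarry SQL itself; the function name and sample titles are hypothetical, and titles are assumed to be in display form (with spaces rather than underscores), because Python's <code>\b</code> treats an underscore as a word character.

```python
import re

# Pattern quoted in the description of quarry:query/46899.
TITLE_PATTERN = re.compile(r"\b[Oo]rgani[sz]ations?\b")


def matches_organisation(title: str) -> bool:
    """Return True if the title contains organisation(s)/organization(s) as a whole word."""
    return TITLE_PATTERN.search(title) is not None


# Hypothetical sample titles, written with spaces (display form).
sample_titles = [
    "Organisations based in Oceania",
    "Anti-Foobar organizations",
    "Reorganizations of local government",  # no word boundary before "organi"
    "Films about organizing",               # not followed by "ation(s)"
]

matched = [t for t in sample_titles if matches_organisation(t)]
```

::Only the first two sample titles match; the word-boundary anchors keep words such as "Reorganizations" out of the list.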

:; 2. Process the lists

:The bash script Make-BHGbot7-edit-list.sh:

::*inverts the S/Z spelling in the list of organisation categories

::* removes from that list titles which are in the list of all pages in the category namespace

::* removes from that list titles which are in the list of all pages in the main (article) namespace

::* wikilinks the resulting edit list
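:The four steps above can be sketched in Python as follows. This is an illustration only, not the linked Bash script: the function names are hypothetical, titles are assumed to be bare (no "Category:" prefix), and the <code><nowiki>[[:Category:...]]</nowiki></code> link format is an assumption about what the edit list looks like.

```python
import re

# Same shape as the title regex quoted for the AWB module, with the s/z captured.
SWAP = re.compile(r"^(.*?\b[oO]rgani)([sz])(ations?\b.*)$")


def invert_spelling(title):
    """Swap s<->z in the first organisation(s)/organization(s); None if no match."""
    m = SWAP.match(title)
    if m is None:
        return None
    other = "z" if m.group(2) == "s" else "s"
    return m.group(1) + other + m.group(3)


def build_edit_list(org_categories, existing_categories, existing_articles):
    """Return wikilinked redirect titles that do not yet exist in either list."""
    taken = set(existing_categories) | set(existing_articles)
    edit_list = []
    for title in org_categories:
        candidate = invert_spelling(title)
        if candidate is None or candidate in taken:
            continue  # nothing to invert, or the title is already in use
        # Assumed link format for the edit list.
        edit_list.append(f"[[:Category:{candidate}]]")
    return edit_list


# Hypothetical inputs standing in for the three quarry result lists.
edit_list = build_edit_list(
    ["Anti-Foobar organizations", "Housing rights organisations", "Foo organisations"],
    existing_categories={"Housing rights organizations"},
    existing_articles={"Foo organizations"},
)
```

:In this sketch only "Anti-Foobar organisations" survives, because the other two inverted titles already exist in one of the exclusion lists.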

:; 3. Create the redirects

:Using the edit list created in step 2, AWB

:* skips any existing pages (there should be none, but some may have been created since the list was made)

:* applies the AWB custom module BHGbot-7-AWB-module to create the redirect with an explanatory edit summary as in this test edit[https://en.wikipedia.org/w/index.php?title=Category:Women%27s_religious_organisations&oldid=969976972]

:**If the page title to be created is "Foo organisations" (with an S), a {{tl|category redirect}} is created to "Foo organizations" (with a Z). And vice versa.

:** Per a request by User:Hellknowz at the 2017 BRFA, the redirect template includes the parameter |bot=BHGbot

:The module includes sanity checks to:

:* skip any pages whose title does not match the regex /^(.*?\b[oO]rgani)[sz](ations?\b.*)$/

:* skip any case where it is about to create a self-redirect
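:The module's behaviour, including both sanity checks, can be sketched in Python. This is not the AWB module (which is linked above); the function name is hypothetical, the page title is assumed to be passed without the "Category:" prefix, and only the regex and the template text are taken from this page.

```python
import re

# Regex quoted in the sanity-check description above.
TITLE_RE = re.compile(r"^(.*?\b[oO]rgani)[sz](ations?\b.*)$")


def redirect_text(page_title):
    """Return the redirect wikitext for page_title, or None if the page should be skipped."""
    m = TITLE_RE.match(page_title)
    if m is None:
        return None  # sanity check 1: title does not match the regex
    s_form = m.group(1) + "s" + m.group(2)
    z_form = m.group(1) + "z" + m.group(2)
    # An S title redirects to the Z spelling, and vice versa.
    target = z_form if page_title == s_form else s_form
    if target == page_title:
        return None  # sanity check 2: never create a self-redirect (defensive)
    return "{{Category redirect|" + target + "|bot=BHGbot}}"


text_s = redirect_text("Foo organisations")
text_z = redirect_text("Foo organizations")
skipped = redirect_text("Foo organizing")
```

:Here <code>text_s</code> points at the Z spelling, <code>text_z</code> points at the S spelling, and the non-matching title is skipped.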

:I have done a dry run (AWB in pre-parse mode) on a deliberately-polluted list of test pages, and it correctly skipped them all. I did another test of the full list of ~12,500 pages, where no pages were skipped, which indicates the accuracy of the list-making.

;Differences from BHGbot 3

:This proposal tackles the same problem as the 2017 proposal BHGbot 3, but it uses a different approach. The 2017 proposal drew its list from recursing the category tree. This proposal uses quarry to collect a list of category titles. Using quarry gives a complete list, whereas category recursion is usually woefully incomplete. The quarry-generated lists allow rigorous checks against error.

=Discussion=

{{BotTrial|edits=50}} Primefac (talk) 22:03, 2 August 2020 (UTC)

{{BotTrialComplete}}. Thanks, @Primefac.

:I used the Linux shuf command to randomly select 50 pages from a list of 12,461 categories which I had built last week while testing the list-making:

{{collapse top|50 randomly-selected redirects to create in trial run}}

  1. :Category:Religious organisations established in 1928
  2. :Category:Organisations established in 1718
  3. :Category:Films about organisations
  4. :Category:Wikipedia categories named after organizations based in Iran
  5. :Category:Environmental organisations based in Europe
  6. :Category:Organisations based in San Diego
  7. :Category:Organisations based in Mayotte by subject
  8. :Category:Transport organisations based in Lithuania
  9. :Category:Organisations based in Oceania by country and subject
  10. :Category:Student organisations established in 1917
  11. :Category:Scientific organisations established in 1857
  12. :Category:Organisations based in American Samoa by subject
  13. :Category:British Cadet organizations
  14. :Category:State history organisations of the United States
  15. :Category:Missing people organisations
  16. :Category:Arts organisations established in 1887
  17. :Category:Environmental organizations based in the Bahamas
  18. :Category:Horticultural organizations based in India
  19. :Category:Organisations disestablished in 1950
  20. :Category:Transport organizations based in Gibraltar
  21. :Category:Defunct organizations based in Zambia
  22. :Category:Humanitarian aid organisations of World War I
  23. :Category:Defunct organizations based in the Cook Islands
  24. :Category:Wikipedia categories named after organisations based in Romania
  25. :Category:Religious organizations based in Chile
  26. :Category:Cultural organisations based in Moldova
  27. :Category:Cultural organizations based in Portugal
  28. :Category:Ethnic organisations based in the Czech Republic
  29. :Category:Religious organisations based in the Marshall Islands
  30. :Category:Animal welfare organizations based in Peru
  31. :Category:Women's organizations based in Pakistan
  32. :Category:Islamic organizations based in Mali
  33. :Category:Arts organisations established in 1988
  34. :Category:Housing rights organisations
  35. :Category:Sports organizations of South Ossetia
  36. :Category:Religious organisations disestablished in 2010
  37. :Category:National Taiwan University organisations
  38. :Category:Sports organisations disestablished in 1954
  39. :Category:Paramilitary organisations based in South America by country
  40. :Category:Business and industry organisations based in Chicago
  41. :Category:Music organisations based in the State of Palestine
  42. :Category:Organizations based in Bhopal
  43. :Category:Private and independent school organisations in the United States
  44. :Category:Film organizations in Belgium
  45. :Category:Organisations based in Orange County, California
  46. :Category:Members of the Parliamentary Assembly of the Collective Security Treaty Organisation
  47. :Category:Religious organizations based in Gibraltar
  48. :Category:Business organisations based in Turkmenistan
  49. :Category:Research organisations by country
  50. :Category:Migration-related organisations based in the United States

{{collapse bottom}}

:Here are the [https://en.wikipedia.org/w/index.php?title=Special:Contributions&offset=202008040953&dir=next&target=BrownHairedGirl&namespace=14 50 trial edits].

:No pages were skipped, and I have reviewed each of the 50 edits. The redirects are all as intended. --BrownHairedGirl (talk) • (contribs) 10:17, 4 August 2020 (UTC)

::{{BotApproved}} Primefac (talk) 00:37, 6 August 2020 (UTC)

:The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.