Wikipedia:Bot requests/Archive 72#Change interface language to English in Google links
{{Aan}}
Primary School articles
Following [https://en.wikipedia.org/wiki/Wikipedia:Categories_for_discussion/Log/2016_July_6#Category%3AArticles_in_Wikipedia_Primary_School_Project_SSAJRP this discussion], could anyone help set up a bot task that would
- Look for any article talk page tagged with :Category:Articles in Wikipedia Primary School Project SSAJRP and rename the category to :Category:Wikipedia Primary School articles
- Look for any article in the main namespace tagged with :Category:Articles in Wikipedia Primary School Project SSAJRP
- Remove that category from the main-namespace article
- Add the category to the article's talk page under the new name, :Category:Wikipedia Primary School articles
Thank you for your help
Anthere (talk) 07:51, 13 July 2016 (UTC)
Well, thank you anyway if you read me at least. I take it I will have to do it by hand. Oh well. Anthere (talk) 17:54, 17 July 2016 (UTC)
- This doesn't have to be done by hand. This can be done by WP:AWB fairly quickly as well. There's about 280 or so pages either way. -- Ricky81682 (talk) 20:28, 20 July 2016 (UTC)
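The four steps requested above can be sketched as two text transformations, one for the article and one for its talk page. This is only a rough illustration: the function names are hypothetical, the regular expression is an assumption about how the category link is written, and fetching or saving pages (with AWB or a bot framework) is left out entirely.

```python
import re

OLD_CAT = "Articles in Wikipedia Primary School Project SSAJRP"
NEW_CAT = "Wikipedia Primary School articles"


def _cat_regex(name: str) -> str:
    # Matches [[Category:Name]] and [[Category:Name|sortkey]] (assumed link forms).
    return r"\[\[\s*[Cc]ategory\s*:\s*" + re.escape(name) + r"\s*(?:\|[^\]]*)?\]\]"


OLD_RE = re.compile(_cat_regex(OLD_CAT))
OLD_LINE_RE = re.compile(_cat_regex(OLD_CAT) + r"\n?")
NEW_RE = re.compile(_cat_regex(NEW_CAT))


def strip_old_category(article_text: str) -> str:
    """Remove the old category (and its trailing newline) from an article."""
    return OLD_LINE_RE.sub("", article_text)


def retag_talk_page(talk_text: str) -> str:
    """Rename the old category on a talk page, or append the new one if missing."""
    new_link = f"[[Category:{NEW_CAT}]]"
    if OLD_RE.search(talk_text):
        return OLD_RE.sub(new_link, talk_text)
    if NEW_RE.search(talk_text):
        return talk_text  # already tagged, nothing to do
    return talk_text.rstrip("\n") + "\n" + new_link + "\n"
```

Keeping the two transformations separate means a run can be previewed page by page before anything is saved.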
Commonscat
How about a bot that looks for missing Commons category link in articles where such a Commons category exists with lots of images? Anna Frodesiak (talk) 04:44, 4 May 2016 (UTC)
:@Anna Frodesiak If I'm not mistaken, a basic form of this is implemented via Wikidata; did you have something more specific in mind? -FASTILY 06:34, 23 May 2016 (UTC)
::Fastily, I think Anna means that plenty of articles don't have the Commons template for whatever reason, and that a bot that locates such articles and adds the Commons template to them would be beneficial.--BabbaQ (talk) 18:03, 29 May 2016 (UTC)
:I suppose the bot would find Commons categories by checking if there's a Commons link under the sitelinks listed in the Wikidata item for a given article? Enterprisey (talk!) (formerly APerson) 00:59, 25 June 2016 (UTC)
:{{BOTREQ|doing}} I'm working on this. KSFTC 04:46, 2 July 2016 (UTC)
:{{BOTREQ|notdone}} but {{BOTREQ|possible}} – I have run into problems that I don't know how to fix. Maybe someone more experienced can do this. KSFTC 20:15, 23 July 2016 (UTC)
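The Wikidata lookup suggested above could look roughly like the sketch below. It assumes the entity has already been fetched as a JSON dictionary, checks the `commonswiki` sitelink first, and falls back to the Commons category property (P373). The function names and the template-matching regex are illustrative assumptions, and the "lots of images" threshold from the original request is not handled.

```python
import re
from typing import Optional

# Assumed spellings of the template: {{Commons category}}, {{Commons cat}}, {{Commonscat}}.
COMMONSCAT_RE = re.compile(r"\{\{\s*[Cc]ommons[ _]?cat")


def commons_category(entity: dict) -> Optional[str]:
    """Return the Commons category name recorded on a Wikidata entity, if any."""
    title = entity.get("sitelinks", {}).get("commonswiki", {}).get("title", "")
    if title.startswith("Category:"):
        return title[len("Category:"):]
    for claim in entity.get("claims", {}).get("P373", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, str) and value:
            return value
    return None


def missing_commonscat(article_text: str, entity: dict) -> Optional[str]:
    """Return the category to link if the article lacks a Commons category template."""
    if COMMONSCAT_RE.search(article_text):
        return None
    return commons_category(entity)
```

A bot would still need to confirm that the category exists on Commons and holds enough files before adding the template.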
[[Wikipedia:WikiProject Stub sorting/Uncatted stubs]]
Could someone please update Wikipedia:WikiProject Stub sorting/Uncatted stubs? This should be done once in a while, and it hasn't been done since March 2015. עוד מישהו Od Mishehu 04:10, 15 July 2016 (UTC)
:{{u|Od Mishehu}}, how often do you want it updated? Enterprisey (talk!) (formerly APerson) 03:41, 24 July 2016 (UTC)
::Monthly would probably be best. עוד מישהו Od Mishehu 03:43, 24 July 2016 (UTC)
:::Great, that's what I was thinking. I'm setting up the task now; it'll probably end up in APersonBot's userspace (so a BRFA isn't required), but I can transclude it in the stub sorting project's space. Enterprisey (talk!) (formerly APerson) 03:44, 24 July 2016 (UTC)
::::So apparently running write queries on labs is hard without proper configuration; I'll continue working on this, but any other bot operator is free to take this task and run with it. Enterprisey (talk!) (formerly APerson) 05:30, 24 July 2016 (UTC)
:::::Progress report: The bot currently does a good job of printing uncatted stub titles, but it isn't good at counting transclusions. Fix coming soon. Enterprisey (talk!) (formerly APerson) 04:03, 25 July 2016 (UTC)
migrate Library of Congress thomas links to congress.gov
The Library of Congress has refreshed its website, but archiving is a problem. Could we have a bot update all the references and links that point to thomas so that they use the congress.gov domain? Beatley (talk)
Here is a target list: https://en.wikipedia.org/w/index.php?target=http%3A%2F%2F*.thomas.loc.gov&title=Special%3ALinkSearch
:{{re|Beatley}} This doesn't seem to be really all that necessary. All the links retarget to the new page, so why do we need to update them? Omni Flames (talk) 00:42, 3 August 2016 (UTC)
::I understand "if it ain't broke", but isn't it easier to fix before the links rot? And the nice librarians at the LOC did ask. Beatley (talk) 20:37, 3 August 2016 (UTC)
{{BOTREQ|coding}} trying my hand. ProgrammingGeek (Page! • Talk! • Contribs!) 16:44, 4 August 2016 (UTC)
:Since {{u|ProgrammingGeek}} seems to be away from Wikipedia for a bit, {{u|Beatley}}, {{BOTREQ|badidea}} since the URL after the domain seems to commonly turn to a 404. IABot will get to this when it's time. Dat GuyTalkContribs 18:59, 16 September 2016 (UTC)
::Ditto. Some links such as [http://thomas.loc.gov/cgi-bin/query/B?r112:@FIELD%28FLD003+s%29+@FIELD%28DDATE+20110517%29 this] redirect to a 404. If you replaced thomas.loc.gov with congress.gov, the archiving bot or human who comes by to rescue it won't be able to fix it, since there is no archived version at congress.gov. That being said you could still replace all links to thomas.loc.gov (without a path following it) with congress.gov, but that's not particularly helpful and is a redirect that's likely to stay in place. In many cases a visible link to thomas.loc.gov might be desirable, even if it does redirect — MusikAnimal talk 19:02, 16 September 2016 (UTC)
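The distinction drawn above, between bare thomas.loc.gov links and links with a path or query after the domain, can be sketched as follows. The function name is hypothetical and the replacement target `https://www.congress.gov/` is an assumption; deep links are returned unchanged so that an archiving bot or a human can still look up an archived copy of the original URL.

```python
from urllib.parse import urlsplit

# Assumed replacement target for bare links.
NEW_HOME = "https://www.congress.gov/"


def retarget_thomas(url: str) -> str:
    """Rewrite only bare thomas.loc.gov links; leave deep links untouched."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != "thomas.loc.gov" and not host.endswith(".thomas.loc.gov"):
        return url  # not a THOMAS link at all
    if parts.path in ("", "/") and not parts.query:
        return NEW_HOME  # no path or query, so nothing is lost by retargeting
    return url  # deep link: rewriting the domain would likely produce a 404
```

As noted in the thread, even the bare-link replacement is of limited value while the redirect stays in place.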
Convert dead Google Cache links to Wayback Machine
{{tracked|T142178}}
:Something related was discussed by Sfan00 IMG and Redrose64 in October 2012, but not further pursued.
We have [https://en.wikipedia.org/w/index.php?target=http%3A%2F%2Fwebcache.googleusercontent.com&title=Special%3ALinkSearch a whole lot] of Google Cache links, unfortunately most of them dead (it seems, unlike the Internet Archive, Google Cache is only a temporary thing). It would be nice to have a bot convert these links to Wayback Machine links, like I manually did [https://en.wikipedia.org/w/index.php?title=Richard_S._Arnold&type=revision&diff=732968271&oldid=732641371 here]. The Google cache links contain the URL information:
:
→ [http://webcache.googleusercontent.com/search?q=cache:umS520jojVYJ:www.aals.org/profdev/women/clark.pdf+%22Joan+Krauskopf%22+%22Eighth+Circuit%22&hl=en&ct=clnk&cd=5&gl=us ]{{dead link}}
to a {{tl|Wayback}} link like
:
→ {{webarchive |url=https://web.archive.org/web/*/http://www.aals.org/profdev/women/clark.pdf }}
I don't know how hard it would be for a bot to also fill the |date= and |title= parameters of {{tl|Wayback}}, but that would be optional anyways. Maybe it could if the raw Google Cache link above had [http... some title] to it. Anyhow, just fixing the dead Google Cache links would be a valuable service in itself.
Of course, the above mentioned usage of {{tl|Wayback}} goes for raw links like the one I fixed. If the Google Cache link was in a citation template's |archiveurl= parameter, then the fix should be
:|archiveurl=
to
:|archiveurl=
--bender235 (talk) 14:01, 4 August 2016 (UTC)
::This can be added to InternetArchiveBot's functionality. It's easy to add acknowledgement of archiving services and whether to flag them invalid or not. If you're patient I can try and add it in the upcoming 1.2 release.—cyberpowerChat:Online 16:22, 4 August 2016 (UTC)
:::Sure, we're not in a rush. Thank you. --bender235 (talk) 17:33, 4 August 2016 (UTC)
::::I should point out that some snapshots don't exist in the Wayback Machine. Google Cache is obviously only temporary, and when a site dies, its cache will too. That being said, it's probably better to simply remove the cache link and tag the URL as dead.—cyberpowerChat:Online 09:46, 5 August 2016 (UTC)
:::::If there's no snapshot on WBM either, then yes. But if Google Cache (and the original URL) are dead, repair it as a WBM link. --bender235 (talk) 13:32, 5 August 2016 (UTC)
- {{yo|Bender235}} [https://en.wikipedia.org/wiki/List%20of%20Puerto%20Ricans?diff=prev&oldid=739310660]—cyberpowerChat:Offline 00:10, 14 September 2016 (UTC)
::Oh, InternetArchiveBot does that now? Is that a new feature, or has it always been converting Google Cache to Internet Archive? --bender235 (talk) 00:16, 14 September 2016 (UTC)
:::I added it because you requested it.—cyberpowerChat:Offline 00:22, 14 September 2016 (UTC)
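For illustration, the URL extraction described in the original request can be sketched as below. This is not InternetArchiveBot's implementation; the function names are hypothetical, and the code assumes the `q` parameter has the form `cache:<12-character ID>:<original URL> <search terms>`, as in the example link from the request.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

CACHE_HOST = "webcache.googleusercontent.com"
# Assumption: the optional cache ID is 12 URL-safe characters followed by a colon.
CACHE_ID = re.compile(r"[A-Za-z0-9_-]{12}:")


def original_url(cache_url: str) -> Optional[str]:
    """Recover the cached page's own URL from a Google Cache link."""
    parts = urlsplit(cache_url)
    if parts.hostname != CACHE_HOST:
        return None
    query = parse_qs(parts.query).get("q", [""])[0]
    if not query.startswith("cache:"):
        return None
    # Search terms follow the URL, separated by "+" (decoded to spaces).
    target = query[len("cache:"):].split(" ")[0]
    if CACHE_ID.match(target):
        target = target[13:]  # drop the 12-character ID and its colon
    if not target:
        return None
    if not re.match(r"https?://", target):
        target = "http://" + target
    return target


def wayback_url(cache_url: str) -> Optional[str]:
    """Build a Wayback Machine calendar link for the cached page."""
    url = original_url(cache_url)
    return None if url is None else "https://web.archive.org/web/*/" + url
```

The `*` timestamp only points at the list of snapshots; as discussed above, a bot still has to check that a snapshot actually exists before replacing the link.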
Non-nested citation templates
{{Resolved}}
Please can someone draw up a list of templates whose name begins Template:Cite
, but which do not themselves either wrap a template with such a name, or invoke a citation module?
For example:
- {{Tl|Cite web}} invokes citation/CS1
- {{Tl|Community trademark}} wraps {{Tl|Cite web}}
- {{Tl|Cite CanLII}} does not wrap another 'cite' template and does not invoke a citation module
I would therefore only expect to see the latter in the results.
The list could either be dumped to a user page, or preferably, a hidden tracking category, say :Category:Citation templates without standard citation metadata, could be added to their documentation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:16, 14 August 2016 (UTC)
:I don't have access to database dumps, but petscan might be able to help. [https://petscan.wmflabs.org/?language=en&project=wikipedia&depth=6&categories=Citation%20templates&negcats=Template%20test%20cases%0D%0ATemplate%20sandboxes&ns%5B10%5D=1&templates_no=Cite%20arXiv%0D%0ACite%20AV%20media%0D%0ACite%20AV%20media%20notes%0D%0ACite%20book%0D%0ACite%20conference%0D%0ACite%20DVD%20notes%0D%0ACite%20encyclopedia%0D%0ACite%20episode%0D%0ACite%20interview%0D%0ACite%20journal%0D%0ACite%20magazine%0D%0ACite%20mailing%20list%0D%0ACite%20map%0D%0ACite%20news%0D%0ACite%20newsgroup%0D%0ACite%20podcast%0D%0ACite%20press%20release%0D%0ACite%20report%0D%0ACite%20serial%0D%0ACite%20sign%0D%0ACite%20speech%0D%0ACite%20techreport%0D%0ACite%20thesis%0D%0ACite%20web&sortby=title&interface_language=en&active_tab=tab_templates_n_links&doit= Here's a first stab at a query]. – Jonesey95 (talk) 18:07, 14 August 2016 (UTC)
::{{Ping|Jonesey95}} Thank you. Alas, my recent experience suggests that the relevant categories are not applied in many cases. One of the purposes of my request is to enable doing so. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:00, 14 August 2016 (UTC)
:[https://en.wikipedia.org/w/index.php?title=Special:Search&profile=advanced&profile=advanced&fulltext=Search&search=+-insource%3A%22citation%22+-insource%3A%22cite%22+prefix%3Atemplate%3ACite&ns10=1&searchToken=aa23bgtu8n84udty7eouhr263 Does this link work]? It seems pretty basic. It might miss a few that for some reason include the text "cite" or "citation" in the template directly, but I don't think that will be many if any such templates. --Izno (talk) 16:08, 23 August 2016 (UTC)
::{{Ping|Izno}} It's useful thank you, though it does bring up a number of /doc, /sandbox and /testcase pages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:28, 7 September 2016 (UTC)
:::{{ping|Pigsonthewing}} [https://en.wikipedia.org/w/index.php?title=Special:Search&profile=advanced&profile=advanced&fulltext=Search&search=+-intitle%3A%22%2Fdoc%22+-intitle%3A%22%2Fsandbox%22+-intitle%3A%22%2Ftestcase%22+-insource%3A%22citation%22+-insource%3A%22cite%22+prefix%3Atemplate%3ACite&ns10=1&searchToken=3m6r9zgrgrnfencw6gvb18jdr] --Izno (talk) 11:26, 7 September 2016 (UTC)
::::{{Ping|Izno}} That's great; thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:37, 8 September 2016 (UTC)
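The criterion in the original request can also be checked directly against each template's wikitext, which avoids relying on categories or on search matching the word "cite" anywhere in the page. The sketch below uses hypothetical names and deliberately simple regexes; it will miss templates that reach a citation module through some other wrapper.

```python
import re

# A transclusion of a template whose name begins with "Cite ", excluding
# triple-brace parameters such as {{{cite_x}}}.
WRAPS_CITE = re.compile(r"(?<!\{)\{\{(?!\{)\s*[Cc]ite[ _]")
# A direct invocation of a citation module, e.g. {{#invoke:citation/CS1|...}}.
INVOKES_MODULE = re.compile(r"\{\{\s*#invoke\s*:\s*[Cc]itation")


def is_nonstandard_cite_template(wikitext: str) -> bool:
    """True if the template neither wraps a 'Cite ...' template nor invokes a citation module."""
    return not (WRAPS_CITE.search(wikitext) or INVOKES_MODULE.search(wikitext))
```

Run over the source of every page whose title begins with Template:Cite (skipping /doc, /sandbox and /testcases subpages), this would list only templates like {{Tl|Cite CanLII}}.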
Gawker
Gawker (Gawker Media) has been driven into bankruptcy, and then bought out by Univision, which will be shutting it down next week. We've got a lot of articles that cite Gawker pages. Can someone send a bot through the database as a whole, looking for everything cited to Gawker, and then making sure that it's archived (archive.org, WebCite, etc)? DS (talk) 19:23, 18 August 2016 (UTC)
:Perhaps {{U|Green Cardamom}} or {{U|Cyberpower678}} could handle this? They've both run bots related to archives in the past. This is an extremely high-priority task. ~ Rob13Talk 19:36, 18 August 2016 (UTC)
::InternetArchiveBot maintains a massive database of URLs it encountered on Wikipedia and specific information about them, including their live states. I can set the states of all the URLs of this domain to dead and the bot will act on it.—cyberpowerChat:Limited Access 20:16, 18 August 2016 (UTC)
:::{{re|Cyberpower678}} I noticed InternetArchiveBot has been disabled for a bit. What's the reason for that? ~ Rob13Talk 20:46, 18 August 2016 (UTC)
::::A lot of bugs have been reported. They're all fixed, but given the massive scope of this bot, it's being extensively tested before being re-enabled.—cyberpowerChat:Limited Access 21:13, 18 August 2016 (UTC)
When archived at the Internet Archive, Univision has the option to block viewing at any time for the whole domain with a single line in robots.txt. I don't know if Univision would do that, but once Peter Thiel learns the articles are still available online it seems likely he would put pressure on Univision. WebCite only checks robots.txt at the time of archival. Archive.is doesn't care about robots and is outside US law. Maybe I can generate a list of the Gawker URLs and trigger a save for WebCite and archive.is, but I haven't done that in an automated fashion before so don't know if it will work. -- GreenC 21:14, 18 August 2016 (UTC)
:I believe WebCite supports archiving, even in batches, of URLs over their API, which is XML based. It will then return the respective archive URL. If you wrote a script to handle that, and generate an SQL script, I can run it on my DB and InternetArchiveBot can then make the alterations. I can very quickly populate the respective URLs we are dealing with.—cyberpowerChat:Limited Access 21:23, 18 August 2016 (UTC)
:::It looks like archive.is has already done the work. [https://en.wikipedia.org/w/index.php?title=Special:LinkSearch&limit=500&offset=0&target=http%3A%2F%2Fgawker.com Every link] I spot checked exists at archive.is .. also I checked webcitation.org and can't find docs on batch archiving. And I read they will take down pages on request by copyright owners so same situation as archive.org with robots. Maybe the thing to do is save by default to Wayback and if robots.txt blocks access deal with that later the normal way (not established yet). At least there is backup at archive.is and probably less than 1500 original URLs total. -- GreenC 23:44, 18 August 2016 (UTC)
:Just a note that archive.is seems to be better at archiving Twitter posts, which Gawker articles refer to frequently. I think that might be the better choice for completeness of the archive. The WordsmithTalk to me 21:49, 18 August 2016 (UTC)
::Does it help that all of Gawker's original content is CC-attribution-noncommercial? DS (talk) 22:05, 18 August 2016 (UTC)
:::The [http://legal.kinja.com/using-gawker-media-content-511255350 ToU] says "Gawker Media's original content", but some of it may be by guest writers, plus there are user comments, so I would be wary of a bot sweeping up everything as CC. -- GreenC 23:44, 18 August 2016 (UTC)
- InternetArchiveBot now considers the entire gawker domain as dead.—cyberpowerChat:Absent 13:39, 22 August 2016 (UTC)
Archivebot
I know that the new archivebot has started working. But the backlog of dead links that need to be archived is enormous. At its current pace it would never get close to catching up. I would at least suggest having two archive bots working at the same time. For example, the bot has made four edits today. To have any chance of catching up it would need to be active 24/7. BabbaQ (talk) 23:01, 10 September 2016 (UTC)
:IABot is actually designed to be extremely fast and will have no trouble, I've seen it check 5 million articles in 12 hours or so. Right now it's testing new features. A few months ago it rescued over 140k links increasing the number of Wayback links in the English Wikipedia by about 50%. -- GreenC 00:03, 11 September 2016 (UTC)
::Well, hopefully it will start working fast soon. A backlog of 90,000 articles with dead links is currently waiting. Regards,--BabbaQ (talk) 19:36, 11 September 2016 (UTC)
:::I should note that at least 50,000 of them will need manual intervention since IABot was unable to rescue them, meaning no viable archive was found for at least one tagged link on the page.—cyberpowerChat:Offline 22:52, 11 September 2016 (UTC)
A bot that can automatically edit the wikilinks
If an article is renamed, should the wikilinks to that article in other articles be changed automatically by a bot to reflect the new article name? It might save users from editing the wikilinks themselves, because articles get renamed every day. TheAmazingPeanuts (talk) 09:26, 18 September 2016 (UTC)
:I would say that is a case of WP:NOTBROKEN. --Edgars2007 (talk/contribs) 15:16, 18 September 2016 (UTC)
:Agree with {{u|Edgars2007}}, why would you want to replace a redirect with the page the redirect redirects to? That's kinda the point of redirects. Therefore, {{BOTREQ|notdone}} Dat GuyTalkContribs 15:55, 18 September 2016 (UTC)
::{{ping|Edgars2007}} {{ping|DatGuy}} So you guys are saying it is unnecessary for a bot to automatically change wikilinks such as [https://en.wikipedia.org/w/index.php?title=Good_Kid%2C_M.A.A.D_City&type=revision&diff=739960216&oldid=739314027 this], right? TheAmazingPeanuts (talk) 12:56, 18 September 2016 (UTC)
::: Correct. As long as T-Minus (producer) continues to redirect to T-Minus (record producer), there is no need to change any links there. bd2412 T 13:25, 23 September 2016 (UTC)
::::{{ping|BD2412}} Well okay, I got my answer here. Thanks. TheAmazingPeanuts (talk) 16:47, 23 September 2016 (UTC)
Pages without infoboxes
I am trying to take on a new task of adding infoboxes to any page that doesn't have one. It would be great to have a bot that helps categorize these. At the moment I am working off of pages that link to {{tl|Infobox requested}}. The problem is that when people add an infobox to a page, they rarely remove that template from the talk page. So I see two things that this bot could do...
- If a page has an infobox, remove {{tl|Infobox requested}} from the talk page
- If a page does NOT have an infobox, instead of worrying about adding a template to the talk page, add the page to a hidden maintenance category {{cl|Pages in need of an infobox}}
Just my thinking. --Zackmann08 (Talk to me/What I been doing) 17:56, 29 July 2016 (UTC)
- The first task seems a great idea to me; there are far too many irrelevant and outdated templates on the site and this would cull some unneeded ones. The second idea assumes that all pages need an infobox. Welcome to the infobox wars. I suggest you drop that idea until such a time as we have consensus to create infoboxes on all pages. ϢereSpielChequers 18:11, 29 July 2016 (UTC)
- Let me ping all editors from the Zeta-Jones RfC so we can discuss this further. (Kidding, of course.) The first idea is interesting, and I might take it up. Let me see if it's feasible with AWB. It should be. ~ Rob13Talk 18:17, 29 July 2016 (UTC)
- I can do the first task easily, either manually with AWB or using a bot running on AWB. As for the second task, I think you should probably seek consensus for it first. Omni Flames (talk) 06:06, 30 July 2016 (UTC)
- {{BOTREQ|BRFA|OmniBot 6}} Omni Flames (talk) 07:55, 30 July 2016 (UTC)
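The first task (removing {{tl|Infobox requested}} once the article has an infobox) can be sketched as below. This is not OmniBot's code. The function name is hypothetical, and the detection rule assumes infobox templates are named "Infobox ...", which misses infoboxes with other names and requests made through WikiProject banner parameters.

```python
import re

# Assumption: an infobox is any template whose name starts with "Infobox ".
HAS_INFOBOX = re.compile(r"\{\{\s*[Ii]nfobox[ _]")
# The request tag, with optional simple parameters and a trailing newline.
REQUEST_TAG = re.compile(r"\{\{\s*[Ii]nfobox[ _]requested\s*(?:\|[^{}]*)?\}\}\n?")


def clean_talk_page(article_text: str, talk_text: str) -> str:
    """Drop {{Infobox requested}} from the talk page if the article has an infobox."""
    if HAS_INFOBOX.search(article_text):
        return REQUEST_TAG.sub("", talk_text)
    return talk_text
```

The second task is left out on purpose, since the replies above ask for consensus first.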
"a" before initial vowel in double square brackets
Anyone interested in fixing this? See {{phab|T100721}} for more details if necessary. -- Magioladitis (talk) 07:47, 1 August 2016 (UTC)
:{{re|Magioladitis}} I don't think this is suitable as a bot request. There are always cases where the "a" isn't the English indefinite article, or the initial vowel isn't pronounced as a vowel sound. There are lots of examples at User:Sun Creator/A to An. -- John of Reading (talk) 08:11, 1 August 2016 (UTC)
John of Reading, OK, this means this task is not suitable for AWB either. We need someone to create a list of pages then and see how many occurrences there are. -- Magioladitis (talk) 08:17, 1 August 2016 (UTC)
I mean as a general fix. Any other suggestions are welcome but they should be tested before. -- Magioladitis (talk) 10:34, 1 August 2016 (UTC)
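Building the list of occurrences mentioned above could start from a scan like the following. It only flags candidates for human review; the regex and function name are assumptions, and, as pointed out in the thread, a vowel letter does not guarantee a vowel sound.

```python
import re
from typing import List

# "a" or "A" as a whole word, a space, then a wikilink whose displayed text
# (the part after the pipe, if any) starts with a vowel letter.
A_BEFORE_VOWEL = re.compile(r"\b[Aa] \[\[(?:[^\]|]*\|)?([AEIOUaeiou][^\]|]*)\]\]")


def find_candidates(wikitext: str) -> List[str]:
    """Return the displayed link texts that follow a bare 'a' and start with a vowel."""
    return [m.group(1) for m in A_BEFORE_VOWEL.finditer(wikitext)]


sample = "He ate a [[apple]] near a [[University of Oxford|university]] and a [[Malus|tree]]."
print(find_candidates(sample))  # ['apple', 'university']
```

The second hit in the sample is exactly the kind of false positive John of Reading describes ("a university" is correct), which is why the output is a review list rather than an automatic fix.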