Module talk:Excerpt/Archive 4

{{Aan}}

Disambiguating identical section names

I tried to transclude this section, but it produced an error:

{{excerpt|Climate change in India|Agriculture_2}}

Is this not possible because the transcluded section has the same name as another section in the same article? The other section appeared to be transcluded without errors:

{{excerpt|Climate change in India|Agriculture}}

Jarble (talk) 01:29, 9 November 2023 (UTC)

: {{re|Jarble}}, this isn't an answer to your question, and it definitely deserves one, but here are a couple of ideas, more like workarounds, that might help. First thing I thought, was, "Why does it have two sections with the same name?" Sometimes this is justifiable due to MOS:NOBACKREF, and perhaps that's the case here, but sometimes you can just alter one of the section names so they are not the same anymore. Pay attention to any incoming section links from other articles, that you would have to update, if you choose this path; you can examine the results of Special:WhatLinksHere/Climate_change_in_India and search the results for "section"; there are only four sections with in-links, and none of them point to either of the "Agriculture" sections, so you are in luck; if you want to change one of them, you're free to, as long as you can find a reasonable name that others won't object to, and revert. The first one could maybe be "#Agricultural emissions" or "#Agricultural byproducts" or similar.

: The second section is poorly named, as it isn't about agricultural byproducts, it's about the negative economic impact on people, especially poor people, due to reduced crop yields caused by climate change. I think "Reduced crop yields" would be a fine name for that section. If you decide to change it, be sure and leave a detailed edit summary stating your justification for it, and imho it would also be worth mentioning that you checked "WhatLinksHere" and there are no incoming links needing adjustment.

: Back to your original question: I'm subscribed, so I'll be watching for a response, too. But in the meantime, I hope these ideas might help at least in this one case. Cheers, Mathglot (talk) 05:16, 9 November 2023 (UTC)

::While MediaWiki itself allows specifying a number at the end of a section name to mean "the Nth occurrence of that heading", the way the underlying module of excerpt (Module:Transcluder) works is that it attempts to find a heading with the exact wikitext you provide (see here), and has no concept of the repeated occurrence of a header. This means that when you provide "Agriculture_2", it tries to find a heading with that exact wikitext instead of the 2nd occurrence of "Agriculture". This could probably be implemented as a feature, since I'm pretty sure this isn't the first time I've seen this issue crop up. Aidan9382 (talk) 07:41, 9 November 2023 (UTC)

:Anchors could be the solution here. I coded the module naively to expect the section heading text to match the parameter. This doesn't work when there's an anchor in there. For example, Album#Tracks has a heading of ==Tracks{{anchor|Music track}}==, and the only way to extract it is by matching that literally with {{excerpt|Album|Tracks{{((}}anchor{{!}}Music track{{))}}}}. It would be really nice if both {{excerpt|Album|Tracks}} and {{excerpt|Album|Music track}} worked here. Then we could just add appropriately specific alternative names to subsections, e.g. by concatenating section + subsection titles in an anchor. Certes (talk) 10:15, 9 November 2023 (UTC)

::I've had a go at this in Module:Transcluder/sandbox. It successfully finds anchors, both from the Anchor template and from the span tag to which subst:Anchor expands. It attempts to find headings which have an anchor next to them but fails. I introduced a bug somewhere on lines 482-486 but, as Scribunto disables even the crude debugging that comes with standard Lua, it eludes me. It all works perfectly in an offline Lua session without Scribunto. Certes (talk) 17:50, 9 November 2023 (UTC)

:::I managed to fix {{slink|Album|Tracks}} not working (side effects of how Lua handles variable assignment). Not entirely sure why Unusual features still doesn't work; will keep looking into that one. Aidan9382 (talk) 18:24, 9 November 2023 (UTC)

:::: Getting anchors working seems like a nice addition, but isn't there a way to deal with the OP question without adding anchors? Does the Module have access to the generated html, or only the wikicode? If the former, then we could distinguish duplicates from the "id" in the span tag not matching the text of the <H> tags. E.g., for the "Agriculture" sections in Climate change in India we have one H3 and one H4:

::::* Agriculture
::::* Agriculture

:::: The ToC clearly knows the difference; can we do what it's doing? Or do we not see that from the Module? Mathglot (talk) 18:34, 9 November 2023 (UTC)

:::::The most html we get access to is probably whatever frame:preprocess() is willing to offer, which in this case (using == test == as an example) is =='"`UNIQ--h-0--QINU`"' test ==, and the strip marker isn't respected by mw.text.unstrip (it'll be replaced with an empty string since it doesn't like exposing raw HTML), so we have no way to generate the id that the html would normally have. Basically, I'm pretty sure we have to track headings ourselves. Aidan9382 (talk) 18:40, 9 November 2023 (UTC)

:::::: Pity; thanks for that informative (and quick) reply. Mathglot (talk) 18:45, 9 November 2023 (UTC)

:::::We could track heading counts in the wikicode, using gmatch rather than match, at the expense of adding a little more complexity. I suppose the second Agriculture section would be a second choice after trying and failing to find a section called literally "Agriculture 2". However, what we really want may be the Agriculture subsection of Economic impacts, regardless of whether it's the first, second or only heading of that name. That's better done with an anchor (if we brush aside the minor quibble that it doesn't actually work). Certes (talk) 18:51, 9 November 2023 (UTC)

:::::: {{ec}} Not to complicate things unduly, but your wording above made me think, "Okay, why *don't* we somehow allow them to specify 'the Agriculture subsection of Economic impacts'{{thinspace}}?" E.g., something like:

::::::* {{excerpt|Climate change in India|Agriculture|in=Greenhouse gas emissions}}

::::::* {{excerpt|Climate change in India|Agriculture|in=Economic impacts}}

::::::* {{excerpt|Climate change in India|Agriculture|in=#top}} /* in case of H2 section */

:::::: Does that offer anything we could use? Mathglot (talk) 19:06, 9 November 2023 (UTC)

:::::::That works, again at the expense of complexity. I'm looking at this through the coder's end of the telescope rather than the editor's, so I don't know whether that's what's desired. We are just string matching; there's no fancy data structure from which to pick sectionText['Economic impacts']['Agriculture']. Certes (talk) 19:20, 9 November 2023 (UTC)

:::::::: Yes, I assumed it would be complex. Trying a thought experiment, I imagined adding a function to scan the entire text for section headers, build a hash ('dictionary'?) of arrays, with each section header being one value in the hash (which maybe would just have numeric-ish keys: 1, 1.1, 1.2, 2, ...), with two values, one being the section name, and the other value being an array consisting of the name of every parent section, up to level 2. I believe that could be constructed in one pass, but I haven't thought it out. Once we had that, I see two possibilities: either use the param {{para|in}} feature and consult the array for that section to see if it matches one of the higher level section names listed in the array, or more interestingly, see if our hash actually "matches" the mw-built ToC structure (why wouldn't it? they have to be doing something similar) and then in that case, we can just go back to Agriculture_2 (or Agriculture#2, or whatever) and figure it out on our own, without the "in" param. I grant it would be complex, but think of the glory. Could be worth double your normal editor salary, or even a barnstar. Mathglot (talk) 19:38, 9 November 2023 (UTC)

:::::::: If you exported that as a library function, I bet that would be useful all over the place. Heck, maybe we could just ask wmf for it; someone over there must have something similar they could adapt for general use on our side. Mathglot (talk) 19:47, 9 November 2023 (UTC)

::::Managed to fix {{slink|Abercwmeiddaw quarry|Unusual features}} too, though I felt a bit insane while investigating (I always forget gsub has a 2nd return that you have to be careful about, I thought mw.text.trim was somehow failing to trim a space). Aidan9382 (talk) 19:01, 9 November 2023 (UTC)

:::::Hi! IMHO, I think just renaming one of the sections, or adding some invisible wikitext like an anchor, is the most sensible solution, especially considering how rare this situation seems to be. But if this situation isn't considered too rare, and a solution is required that doesn't imply renaming sections or adding invisible anchors, then perhaps the simplest approach would be to add a third parameter to getSection that simply skips a given number of sections, like so:

:::::{{Excerpt|Climate change in India|Agriculture|skip=1}}

:::::Ugly as hell, but surely simpler to implement and perhaps acceptable given the rarity of the situation. Also, no matter how sophisticated the solution, it seems to me like there will always be a need of an extra parameter and the user will always have to read some documentation about it, so if every solution is equally unintuitive to the user, we might as well pick the simplest one to implement. Sophivorus (talk) 23:04, 9 November 2023 (UTC)

:::::: {{ec}} I love "ugly" when it's easy for a user to understand, and this surely is. Ugly is beautiful. Might need something like a MOS:HIDDENLINKADVICE hidden comment at the first one (or all of them, if skip=3; god I hope not...), letting editors know that they might break something remote from there, if they removed/renamed (any of) the duplicate section(s). Mathglot (talk) 23:18, 9 November 2023 (UTC)

:::::::I'm concerned that a change to a different section could quietly break the excerpt. For example, if we do {{Excerpt|Foo|Agriculture|skip=1}} (or whatever syntax we pick) then changing an earlier, unrelated section heading to or from Agriculture (or inserting or deleting the section entirely) will cause the wrong section or no text to come out. Certes (talk) 23:34, 9 November 2023 (UTC)

:::::::: There's plenty of precedent for that, as we deal with it all the time with respect to all section redirects at Wikipedia; There are various approaches to dealing with it, of which MOS:HIDDENLINKADVICE is one. See the comment just above. Mathglot (talk) 23:39, 9 November 2023 (UTC)

:::::::: Also, that is *already* a risk with Excerpt, any time you do a section excerpt, and we seem to accept that risk, and I don't know what proportion of added section excerpts that included the {{para|skip}} param would break, which would not already break even without that param. Mathglot (talk) 23:45, 9 November 2023 (UTC)

:::::::::Broken excerpts are normally tracked from :Category:Articles with broken excerpts and routinely fixed. Most broken excerpts with the skip parameter would end up there too. For example, in {{Excerpt|Climate change in India|Agriculture|skip=1}}, if the first Agriculture section gets renamed, then the second Agriculture section would be skipped, yielding an empty excerpt and thus categorizing the page. If the second Agriculture section gets renamed instead, same result. Unless, of course, there happens to be another Agriculture section down below. But this would surely be super rare? That being said, the simpler solution of renaming the section or adding an anchor would avoid all that. Sophivorus (talk) 23:58, 9 November 2023 (UTC)

{{resolved}}

{{re|Jarble}} I have gone ahead and renamed that section to {{slink|Climate change in India|Reduced crop yields}}, which is a better name for it anyway, even without the name collision. Feel free to excerpt from it now using that section name. (The "resolved" indicator is for the OP question, and not intended to stifle further conversation on the numerous interesting ongoing threads of discussion in this section, so by all means continue.) Mathglot (talk) 01:49, 10 November 2023 (UTC)

Lua error in mw.text.lua at line 25: bad argument #1 to 'match' (string expected, got nil).

Why does this error appear when I try to transclude this section?

{{excerpt|Variadic function|In Rust|subsections=yes}} Jarble (talk) 17:56, 13 November 2023 (UTC)

:Module:Transcluder's getTemplates() function is getting confused by the (eval $e:expr) => {{ [...] }} line as it thinks the presence of {{ is meant to indicate the start of a template and the code afterwards doesn't consider it could be receiving invalid data. I've fixed it with this edit. Aidan9382 (talk) 18:15, 13 November 2023 (UTC)

Param references=no doesn't skip sfn

I noticed that adding {{para|references|no}} doesn't skip {{tl|sfn}} templates. Shouldn't it?

:That's odd. We thought of that possibility, and Excerpt calls Transcluder with the fixReferences = true option. It's working for a lot of other citations in that extract, for example 347 (ref 2 in the donor article) which is defined right next to 346 in the donor's lead and reused in the section. Certes (talk) 17:48, 14 January 2024 (UTC)

::Transcluder appears to be picking up :27 from the excerpt's body when looking for :2 and therefore deciding no rescuing needs to take place. The issue appears to be with the refBody regex, which has entirely optional conditions after the refName up until the [^>/]*, meaning any ref starting with :2 can match. Not sure how to immediately fix this one. Aidan9382 (talk) 18:07, 14 January 2024 (UTC)

:::Good spot. Why do we need the [^>/]*? Would something like %s* work equally well, and correctly fail to match "7"? Certes (talk) 20:54, 14 January 2024 (UTC)
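:::The prefix-matching problem can be reproduced outside the module. This is a Python sketch with a deliberately simplified stand-in for the refBody pattern (the real Lua pattern in Module:Transcluder is not reproduced here, and <code>ref_body_pattern</code> is a hypothetical helper); Lua's <code>%s*</code> is written as <code>\s*</code>:

```python
import re

def ref_body_pattern(name, tail):
    """Build a simplified stand-in for the refBody pattern: an opening <ref>
    tag with the given name, optional quotes, then `tail` before the '>'."""
    return re.compile(
        r"<ref\s+name\s*=\s*[\"']?" + re.escape(name) + r"[\"']?" + tail + r">"
    )

excerpt = '<ref name=":27">Smith 2020</ref>'

loose = ref_body_pattern(":2", r"[^>/]*")   # anything except '>' or '/' may follow
strict = ref_body_pattern(":2", r"\s*")     # only whitespace may follow

print(bool(loose.search(excerpt)))    # True: ":27" is mistaken for ":2"
print(bool(strict.search(excerpt)))   # False: the "7" blocks the match
```

:::One cost of the strict form in this sketch is that a tag carrying a further attribute, such as <code><ref name=":2" group="n"></code>, no longer matches either.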

::::Changing to %s* seems to fix the problem: [https://en.wikipedia.org/w/index.php?title=Special%3AComparePages&page1=Module%3ATranscluder&page2=Module%3ATranscluder%2Fsandbox]. Nothing else looks obviously broken but I haven't done full regression tests. Before making this change in the sandbox, I discarded a previous change I made in the sandbox to allow excerpts of a section whose heading contains an anchor. That change never got released but is still worth considering. Certes (talk) 22:27, 14 January 2024 (UTC)

:::::The reason I suspect [^>/]* existed was for when there was another property of the reference after the name (e.g. group=abc) while still making sure it's not a self-closing tag (/>). Aidan9382 (talk) 07:39, 15 January 2024 (UTC)

::::::Yes, though I'm not sure how we delimit the name and mark the start of that property if quotes are optional. I can't find formal documentation on the ref syntax but it seems that group= is allowed and we ought to handle it. That seems awkward in Lua with its limited alternation and optionality. Doing the job properly might need four tentative gsubs on name=foo, name="foo", group=bar and group="bar", with a pragmatic decision that name=foo group=bar denotes a name and a group rather than a name of "foo group=bar". I'm amazed that we don't have a PCRE module for Lua yet, but I suppose it's potentially inefficient. Certes (talk) 10:44, 15 January 2024 (UTC)

:::::::I've implemented an idea in the sandbox, which is following refName with ["' >]. It isn't the prettiest looking capture group to follow on with, but this guarantees that either the name gets finished with a quote (" or '), a space, or that the ref tag ends there (>), while still supporting a later occurrence of a group or other properties. Would that reasonably fit the potential cases? Aidan9382 (talk) 12:26, 15 January 2024 (UTC)

::::::::I haven't tested it but I think you may risk consuming the terminating > in the ["' >] set, making it not match the simple > later in the regexp. That would fail to match a simple <ref name=foo> without quotes or spacing. There are potential parameters other than name= and group=: Help:Cite also has examples of extends= and follows=, though I don't recall seeing them in actual use. I'm not sure how to do this properly without writing a full matchRef function which strips the <ref and > and / then parses the parameters with several gsubs to match something like (%w+)%s*=%s*"(.-)" then (%w+)%s*=%s*'(.-)' then (%w+)%s*=%s*(%S+) into a table { name = 'foo', group = 'bar' }. Certes (talk) 13:22, 15 January 2024 (UTC)
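::::::::A minimal sketch of the "parse the tag into a table" approach described above, in Python for illustration only. <code>parse_ref_tag</code> is a hypothetical name, and the three quoting styles are folded into one alternation, which Python allows but Lua patterns do not; a Lua version would need the separate passes mentioned above.

```python
import re

# One alternative per quoting style: "double", 'single', then unquoted.
ATTRIBUTE = re.compile(
    r"""(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>/]+))"""
)

def parse_ref_tag(tag):
    """Parse an opening <ref ...> tag.

    Returns (attributes, self_closing), or None if `tag` is not a <ref> tag."""
    m = re.fullmatch(r"<ref\b(.*?)(/?)>", tag.strip(), re.DOTALL)
    if not m:
        return None
    attrs = {}
    for key, double, single, bare in ATTRIBUTE.findall(m.group(1)):
        # Exactly one of the three value groups is filled in (or all are empty).
        attrs[key.lower()] = double or single or bare
    return attrs, m.group(2) == "/"
```

::::::::Comparing the parsed name for equality (<code>attrs["name"] == ":2"</code>) then avoids the ":27" confusion, and <code>name=foo group=bar</code> comes out as a name plus a group, which is the pragmatic reading suggested earlier.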

Can we simplify the list of templates on the configuration page?

The configuration page has a very long (and probably incomplete) list of templates that transclude {{tl|Ambox}} and {{tl|Navbox}} and {{tl|Sidebar}}. Can we automatically generate this list instead of listing all of these templates manually?

It ought to be possible to retrieve a list of templates from :Category:Navigational boxes using a Lua module, but I don't know how to do this. Jarble (talk) 20:46, 21 January 2024 (UTC)

:I don't think getting the list of templates is possible in Lua. However, it could easily read a data page where the category has been dumped manually and perhaps updated periodically by a bot. It could even be a submodule in Lua syntax which sets an exported variable to the list of template names. However, there are a lot of navbox templates. Certes (talk) 21:29, 21 January 2024 (UTC)

Line 98

Please add a style attribute to the first span element so that it has left padding or margin (e.g. ' style="padding:0 0 0 0.5em;"'). --109.175.38.135 (talk); 11:06, 25 January 2024 (UTC)

:Comment copied from WT:VPT: "A problem is visible at: Toilet#Without water (desktop version seen at mobile)". Certes (talk) 13:00, 25 January 2024 (UTC)

:Should this change be implemented into the skins instead of the {{Excerpt}} template? Vector 2022 (the default skin for desktop) is adding margin/padding while Minerva Neue (the default skin for mobile) does not. --LightNightLights (talk · contribs); 15:58, 25 January 2024 (UTC)

::Maybe; currently, spacing is needed (the edit link, including its parentheses, sticks to the left of the text). --109.163.175.223 (talk); 03:09, 26 January 2024 (UTC)

: If you do, style="padding-left:0.5em;" is equivalent, and easier to read. Mathglot (talk) 03:36, 26 January 2024 (UTC)

unusual issue with lead excerpt

File:Excerpt template error 2024-01-31.png

The section {{section link|2020s in Asian political history|2022 Mahsa Amini protests}} has an unusual case where it's hard-coding [[File: Protestors on Keshavarz Boulevard

Bottom: Protestors at Amir Kabir University }}|thumb|]] from the infobox.

Is this a parsing issue, since that text is part of the caption on Mahsa Amini protests, not actual images? Or maybe due to the use of the {{multiple image}} template? = paul2520 💬 18:11, 31 January 2024 (UTC)

:The problem appears to lie around Module:Excerpt#L-183–184 where, having recognised that the infobox of Mahsa Amini protests contains various images named .jpg, etc., it mistakes the caption text "middle:" for a namespace and assumes that the text following it is a filename. Certes (talk) 19:03, 31 January 2024 (UTC)

:The rough issue here appears to be that excerpt is finding the entire text of the {{tl|multiple images}} template while trying to grab infobox images, seeing the string "middle:", (incorrectly) thinking it's the start of a namespace, and trimming the "filename" to be everything after that (see line 184). Now that I think about it, one of the tests on the testcases page ({{Excerpt|Yellow}}) also displays this behaviour, just without the incorrect trimming on top. The module should probably try to detect if the image value is a template and, if so, either ignore it or treat it differently. Aidan9382 (talk) 19:03, 31 January 2024 (UTC)

::Woops, didn't catch the new comment message until the moment I hit reply. I basically said the same thing as Certes. Aidan9382 (talk) 19:05, 31 January 2024 (UTC)
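::One possible shape for the check suggested above, sketched in Python. This is not the code at Module:Excerpt; <code>infobox_filename</code> and the <code>FILE_NAMESPACES</code> set are hypothetical, and only the two English prefixes "File" and "Image" are assumed to be file namespaces.

```python
import re

# Prefixes treated as genuine file namespaces (an assumption for this sketch).
FILE_NAMESPACES = {"file", "image"}

def infobox_filename(value):
    """Return a bare filename from an infobox image value, or None if the
    value does not look like a single file (e.g. it is a template call)."""
    value = value.strip()
    if "{{" in value or "\n" in value:
        return None                      # a template such as {{multiple image}}
    prefix, sep, rest = value.partition(":")
    if sep and prefix.strip().lower() in FILE_NAMESPACES:
        value = rest.strip()             # drop a genuine "File:" prefix only
    elif sep:
        return None                      # "Middle: ..." is caption text, not a namespace
    # Require something that looks like a file extension at the end.
    return value if re.search(r"\.\w{3,4}$", value) else None
```

::The point of the sketch is only the order of the checks: reject template markup first, then accept a colon prefix only when it is a known file namespace.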

{{tl|Infobox event}}

I see markup errors when I try to include excerpts of pages that use this template. Should it be added to the list of excluded templates? Jarble (talk) 04:43, 7 February 2024 (UTC)

:Could you give an example of a page with that template which has issues? Aidan9382 (talk) 06:59, 7 February 2024 (UTC)

:@Jarble Hi! I did a simple excerpt of a page using Template:Infobox event in my sandbox and I see no problem. Can you help us reproduce the issue? Sophivorus (talk) 14:14, 7 February 2024 (UTC)

::{{reply to|Sophivorus|Aidan9382}} The images displayed inside the infobox should not appear in the excerpt, but one of the images appeared here: how did this happen? Jarble (talk) 17:24, 7 February 2024 (UTC)

:::@Jarble Ah! That is expected behavior, desirable for most excerpts. If you don't want the image, you can set files=0. See what I did in your sandbox, cheers! Sophivorus (talk) 18:01, 7 February 2024 (UTC)

getTags

@Certes @Aidan9382 Hi! Today I added a new getTags method to Module:Transcluder/sandbox. The regexes are still rather simple and probably fail in many edge cases, but once it's more robust it can help us get things like galleries, blockquotes, divs, etc. Furthermore, it could be used in other methods to extract other kinds of tags. One thing that it should handle, though, is self-closing tags. I hope you find this idea promising! Sophivorus (talk) 14:23, 7 February 2024 (UTC)

:That looks interesting but does present challenges with Lua's limited regexp syntax. Beware of nested tags, e.g. <noinclude>The world is flat<ref>Anne Idiot</ref></noinclude>: because Lua has no equivalent of \1, a naive regexp may assume that </ref> terminates the noinclude tag here. (Someone could do the world a huge favour by implementing PCRE in Lua, but I suspect it would run for more than ten seconds.) Certes (talk) 14:56, 7 February 2024 (UTC)

::Glad you liked it! Today I made several improvements. The method is now able to handle self-closing tags and nested tags (as long as they're of different types, but afaik nested tags of the same type are not allowed). Another edge case I didn't quite cover is <section> tags, since both opening and closing section tags are self-closing tags. But I think we can continue handling them differently in the getSection method, since they are such a special case. Next time (next week) I'd like to expand the test cases and maybe start using getTags in other methods, such as getReferences. Ideas and concerns welcome, cheers! Sophivorus (talk) 14:24, 8 February 2024 (UTC)

:::Hm, I just realized that HTML tags such as divs and spans can be nested and of the same type, so I'll have to refine the method further. No problem, I just hope it doesn't become slow. Sophivorus (talk) 14:53, 8 February 2024 (UTC)
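:::For same-type nesting, one common technique is to count depth while walking the opening and closing tags in order. The following is a Python sketch of that technique under stated assumptions; it is not the getTags code in Module:Transcluder/sandbox, and <code>get_tags</code> here is a hypothetical stand-in that ignores unclosed tags.

```python
import re

def get_tags(wikitext, name):
    """Return the outermost <name>...</name> elements, handling nesting of the
    same tag by counting depth instead of relying on a single regex."""
    token = re.compile(r"<(/?)" + re.escape(name) + r"\b[^>]*?(/?)>", re.IGNORECASE)
    found, depth, start = [], 0, None
    for m in token.finditer(wikitext):
        closing, self_closing = m.group(1) == "/", m.group(2) == "/"
        if self_closing and not closing:
            if depth == 0:
                found.append(m.group(0))   # e.g. <ref name="x" /> on its own
        elif not closing:
            if depth == 0:
                start = m.start()          # remember where the outermost tag opens
            depth += 1
        elif depth > 0:
            depth -= 1
            if depth == 0:
                found.append(wikitext[start:m.end()])
    return found
```

:::Because only tags with the requested name are tokenised, a <code></ref></code> inside a <code><noinclude></code> cannot be mistaken for the end of the noinclude element.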

::::That looks a lot more robust. A couple more things to watch are spaces within the tag (you've caught some of them) and parameters such as <ref name="Foo">, which is closed by just </ref>. <td> can also be a pain because the closing tag is optional; it can be closed by a second td which might look nested to a naive parser. Certes (talk) 15:33, 8 February 2024 (UTC)

:::::@Certes Today I coded a first version of getTags that supports nested tags of the same type (things like one element wrapping "foo" and a nested element of the same type wrapping "bar"). It wasn't easy and I just got it to work, so I didn't really test it much. Next time I'll add many more test cases and fix as needed. As to things like unclosed tags, in such cases my heart leans towards fixing the wikitext rather than supporting them. Sophivorus (talk) 15:44, 15 February 2024 (UTC)

::::::Have you looked around the internet for Lua HTML parsers? I don't see anything specifically for tags but there are plenty of general HTML parsers written in Lua, and some may have licences suitable for re-use here. Certes (talk) 16:26, 15 February 2024 (UTC)

:::::::Hi! I confess no, I haven't looked around. I probably should have, but then again, I enjoyed myself quite a bit while writing the code, and our use case is probably unique enough to warrant custom code anyway. I may be wrong though, but in any case, today I added several new test cases for getTags at Module:Transcluder/testcases and it's looking quite robust, dare I say. Sophivorus (talk) 15:24, 20 February 2024 (UTC)

WikitextParser

@Certes @Aidan9382 Hi again! As I mentioned before, I'm thinking of generalizing Module:Transcluder into Module:WikitextParser (Transcluder would then require and use WikitextParser). I think such a module would be more useful, easier to maintain and extend, and more likely to attract new developers. Thoughts? Sophivorus (talk) 14:30, 7 February 2024 (UTC)

:That sounds useful but would be a very serious undertaking and might not perform well enough for use during page rendering. https://pypi.org/project/mwparserfromhell/ does something similar for Python and may be worth studying. Certes (talk) 15:00, 7 February 2024 (UTC)

:Does sound like an interesting idea and the modularity would be nice, though I'm curious how involved/complex you intend for it to be. Also, there's already a similarly-named module somewhat related to that idea, Module:Wikitext Parsing, which is mainly to do with helping handle nowiki-like tags if that'd be of any interest. Aidan9382 (talk) 23:02, 8 February 2024 (UTC)

:It may be worth liaising with a similar development described at Wikipedia talk:Lua/Archive 12#A new template parser. Certes (talk) 16:37, 13 February 2024 (UTC)

::@Aidan9382 @Certes Hi, thanks for the support and links! mwParserFromHell is definitely an inspiration. As to Module:Wikitext Parsing and Wiktionary:Module:template parser, I think they may be useful but I'm not sure how yet. Today I gathered courage and created Module:WikitextParser and Module:WikitextParser/testcases with some code taken from Transcluder. I also started an experiment on good ol' parseFlags method. There's still a long way to go and much may change, but what I currently imagine for this module is a bunch of relatively simple methods to parse wikitext, that other modules may then use and combine as they see fit. I'll try to continue development next week, feel free to contribute if you want! Cheers! Sophivorus (talk) 16:56, 22 February 2024 (UTC)

:::Hi again! I made a lot of progress with Module:WikitextParser, so I started testing it with Module:Transcluder/sandbox. The testcases look good so far! Some thoughts:

:::* I'm hesitating whether to move parseFlags (or some version of it) to WikitextParser and add an extra "flags" parameter to all the methods (getTags, getTables, etc). It would certainly make the methods more useful and versatile, but also more complex and difficult to document.

:::* I'm currently testing WikitextParser in Transcluder, but eventually I'd like to use WikitextParser in Module:Excerpt directly, instead of going through Transcluder (for performance reasons). I guess that's another reason to move parseFlags to WikitextParser.

:::* Eventually, Transcluder would be deprecated but kept working for any modules that still use or prefer it.

:::* WikitextParser, unlike Transcluder, doesn't throw errors, but rather returns nil when something goes wrong.

:::Kind regards, Sophivorus (talk) 16:08, 29 February 2024 (UTC)

Should subsections be transcluded without <code>subsections=yes</code>?

One of the subsections in this article is transcluded even if subsections=yes is not included as a parameter. This only happens when the section heading is in this format:

= History and motivations =

The section appears to be included in this excerpt:

{{excerpt|Computational sustainability}}

Should this section not be transcluded in this case? Jarble (talk) 16:37, 7 March 2024 (UTC)

:Does this occur only when there is a single equals sign in the heading? Certes (talk) 18:56, 7 March 2024 (UTC)

::{{reply to|Certes}} Yes, I've never seen this happen when there is more than one equals sign in the heading. Jarble (talk) 21:47, 7 March 2024 (UTC)

:::Per Help:Wikitext#Sections, {{tq|A single {{=}} is styled as the article title and should not be used within an article.}} Changing to == should fix the problem and potentially fix other problems with the article too. Certes (talk) 23:31, 7 March 2024 (UTC)

Ref error ruwiki

Hi. I tried to excerpt the lead from ru:Отравление Алексея Навального here - [https://ru.wikipedia.org/wiki/Участник:RenatUK/Черновик] and ref 4 is giving me a reference error. Does anyone know why and how to fix it? Renat 05:53, 22 March 2024 (UTC)

Reference error

I just wanted to notify that there is currently a reference error with the excerpt in [https://en.wikipedia.org/w/index.php?title=Utilitarianism&oldid=1214916438 this article] (ref 153). I thought it was related to the fact that it uses a specific template called "Cite Moulin 2004". So I [https://en.wikipedia.org/w/index.php?title=Utilitarian_rule&diff=1214931542&oldid=1214929874 modified the transcluded reference] to use a more generic format, but it doesn't appear to have solved the issue. Alenoach (talk) 02:45, 22 March 2024 (UTC)

:This seems to be an issue with |templates=0 causing the {{tl|Cite book}} inside the reference to be removed, making the reference content empty and causing an error. Aidan9382 (talk) 07:29, 22 March 2024 (UTC)

::Ok, thanks. I fixed it by whitelisting reference templates with the excerpt parameter "templates=Cite". Alenoach (talk) 01:37, 23 March 2024 (UTC)

::Actually, sometimes you need a more comprehensive whitelist, like e.g. templates=Cite,cite,Citation,rp Alenoach (talk) 03:00, 23 March 2024 (UTC)

Excerpt a paragraph, less its bundled citation with an embedded list

Having a bundled citation with an embedded bullet list for several different sources is not unusual. I tried excluding a bundled ref at the end of the first paragraph of 2023 Brazilian Congress attack using {{para|references|no}} and got a weird result, so added {{para|lists|no}} on top of that, but still doesn't look right:

{{cot|bg=darkseagreen|indent=0.8em|excerpt paragraph #1 of 2023 Brazilian Congress attack minus the refs:}}

{{excerpt|2023 Brazilian Congress attack |paragraphs=1 |hat=no |references=no |lists=no |inline=yes}}

{{cob}}

The final text I want to see in the excerpt is, "{{xt|...and disrupt the democratic transition of power.}}" I want to keep the {{para|inline|yes}} so I can tack on my own ref instead of the bundle. Anything I'm missing here? Mathglot (talk) 06:55, 15 April 2024 (UTC)

: Something odd is going on. In my (now undone) rev. 1219016229, I attempted a fix by adding a second test after the first, adding param {{para|templates|-cite}}. What happened was that the two tests showed the same result, an improvement over the first attempt, where now there is only a hanging <ref> tag (and no citation content or anything else: just the opening ref tag itself) after the desired text. But the top test in that revision is unchanged from the (only) test in the previous revision (and current revision, after the undo's). So, somehow, the addition of test two in rev. 1219016229 is affecting the result of test 1 in that revision, even though I didn't change that one (afaik). Very odd. Mathglot (talk) 07:15, 15 April 2024 (UTC)

: Also tried: {{excerpt|2023 Brazilian Congress attack |paragraphs=1 |hat=no |references=no |lists=no |inline=yes|templates=-cite web,cite news}} but no go. Mathglot (talk) 07:19, 15 April 2024 (UTC)

::Module:Transcluder is getting very confused by this scenario. It appears to be including the list objects from later paragraphs (specifically *{{Cite web |title=Bolsonaro deixa o [...] and *{{Cite web |title=Brazil: Germany [...]), because getParagraphs seems to think the list objects (which in this case are actually the bundled citations) are unrelated to the paragraph, and therefore not removing them along with said paragraph. This also consumes the references' starting ref tag, so it doesn't get removed later on. That's why, when you try to do {{Excerpt|2023 Brazilian Congress attack|references=no|paragraphs=1}} (so not specifying no lists), you get the 2 bullet points from the next 2 paragraphs leaking out instead. Aidan9382 (talk) 07:45, 15 April 2024 (UTC)

::Also, |lists=no is probably failing because it then removes the ending ref tag to the starting ref tag (the first reference won't be on a newline so it doesn't get picked up as a list by Transcluder, but then the rest of the bundled citation gets consumed). Aidan9382 (talk) 07:47, 15 April 2024 (UTC)
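
The failure mode described in the two replies above can be reproduced with a toy model. The following Python sketch is not the actual Module:Transcluder code (which is written in Lua); it assumes, purely for illustration, that list items are recognised line by line from their leading character. The sample wikitext and the function names `drop_list_lines` and `has_unclosed_ref` are made up for this example.

```python
# Toy model: a bundled citation whose 2nd and 3rd entries start on new lines
# with "*", so a line-based check mistakes them for ordinary list items.
SAMPLE = (
    "Rioters tried to disrupt the transition of power."
    "<ref>{{Cite web |title=First source}}\n"
    "*{{Cite web |title=Second source}}\n"
    "*{{Cite web |title=Third source}}</ref>\n"
    "\n"
    "A second paragraph."
)


def drop_list_lines(wikitext: str) -> str:
    """Remove every line that looks like a list item (assumed behaviour)."""
    kept = [
        line for line in wikitext.split("\n")
        if not line.startswith(("*", "#"))
    ]
    return "\n".join(kept)


def has_unclosed_ref(wikitext: str) -> bool:
    """Rough check: more opening <ref> tags than closing </ref> tags."""
    return wikitext.count("<ref") > wikitext.count("</ref>")


result = drop_list_lines(SAMPLE)
# The first citation survives because it is not at the start of a line,
# but the line carrying the closing </ref> has been dropped with the "list".
print(result)
print("unclosed ref:", has_unclosed_ref(result))
```

In this model the opening ref tag stays attached to the paragraph while its closing tag disappears with the bullet lines, which is consistent with the hanging ref tag reported earlier in the thread.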

New tool and grant ideas

Hi guys! Tonight I had an idea for a new tool, called ExcerptHunter (inspired by [https://citationhunt.toolforge.org CitationHunt]). It's basically a semi-automatic tool for doing Template:Excerpt#Replacing summary section with excerpt of child article. I wrote a small demo to help explain. First add the following to your common.js:

mw.loader.load('//en.wikipedia.org/wiki/User:Sophivorus/ExcerptHunter.js?action=raw&ctype=text/javascript');

Then visit User:Sophivorus/ExcerptHunter and you should see the interface. Note that clicking Publish doesn't work yet, but I think the interface already conveys the idea. What do you think? The tool could grow in many ways. For example, by allowing users to limit articles to a category or topic of interest, by showing a live preview next to the wikitext, by working in other wikis, etc.

However, this new tool idea, along with some bugs and feature requests that have been piling up, and other ideas I have in mind (such as generalizing Module:Transcluder into a regex-based Module:WikitextParser) all add up to more than I'm able to handle in my volunteer time.

Therefore, I'm thinking of requesting a Rapid Grant to help me develop ExcerptHunter, WikitextParser, as well as any ideas you come up with and generally catching up and giving a boost to everything excerpt-related. What do you think? Would you support such a grant? Would you like more details, or request some specific work to be done? Looking forward to your reply! Kind regards, Sophivorus (talk) 04:28, 3 December 2023 (UTC)

:@Certes @Aidan9382 I should also mention that it would be a pleasure and an honor to present a shared grant with you, in case you're interested!!! Sophivorus (talk) 13:40, 4 December 2023 (UTC)

:Please please don't do this in a widespread or semiautomated way, or add tools that make it easier for others to do. The excerpt template is one of the most harmful (reader- and especially author-hostile) changes to Wikipedia in recent years, and its significant proliferation will do dramatic damage to the encyclopedia project. –jacobolus (t) 19:18, 10 December 2023 (UTC)

::Well, I guess the silence and concern expressed imply my proposal wouldn't be welcome. Oh well... Sophivorus (talk) 16:54, 18 December 2023 (UTC)

:::I completely agree with @Jacobolus's comment above. It usually does more harm than good. Clayoquot (talk | contribs) 18:44, 13 May 2024 (UTC)

New Template doc section Incompatibilities

I started a new template doc section, {{slink|Template:Excerpt/doc|Incompatibilities|nopage=yes}}, to hold a description (or perhaps a bullet list?) of incompatibilities between Excerpt and other templates, modules, or functions. Please add entries to it that you know of. Thanks, Mathglot (talk) 01:27, 13 June 2024 (UTC)

2 points

First of all, in TemplateData, I believe the default for onlyfreefiles is supposed to be "yes", not "no". Secondly, can anyone help me understand why the template isn't working for me? Here's my code:

{{excerpt|Calvin Coolidge|only=files|hat=no}}

it's not returning any files when the 'only' parameter is set, but it does return files without it. I see the code has support for searching infoboxes, maybe that's bugged? JoeJShmo💌 03:29, 24 July 2024 (UTC)

:The default for onlyfreefiles is indeed "yes", so I've corrected the templatedata there. The issue with only=files is that, since the template gets removed by Module:Transcluder, this module then doesn't see any infobox template to search for images in, so it doesn't find them. Not sure of a fix so far. Aidan9382 (talk) 06:57, 24 July 2024 (UTC)

::Got it, thanks. Do you know another way to extract a specific file from an article? I came up with a longer findinpage+find function that works, save for the fact that I don't know how to use the article itself as the string in the 'find' function. JoeJShmo💌 10:07, 24 July 2024 (UTC)
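
The ordering problem described above (the infobox is stripped before anything looks inside it for images) can be shown with a small model. This Python sketch is only an illustration, not the Lua code of Module:Excerpt or Module:Transcluder: the regular expressions, the function names, and the file name `Example.jpg` are all assumptions made for the example.

```python
import re

# Simplified: an infobox with no nested templates inside it.
INFOBOX = re.compile(r"\{\{Infobox[^{}]*\}\}", re.IGNORECASE)
IMAGE_PARAM = re.compile(r"\|\s*image\s*=\s*([^|\n}]+)")

SAMPLE = (
    "{{Infobox officeholder\n"
    "| name = Calvin Coolidge\n"
    "| image = Example.jpg\n"  # placeholder file name
    "}}\n"
    "Lead paragraph text."
)


def strip_infoboxes(wikitext: str) -> str:
    """Step A (assumed): remove infobox templates from the text."""
    return INFOBOX.sub("", wikitext)


def infobox_images(wikitext: str) -> list[str]:
    """Step B (assumed): collect |image= values from any infobox present."""
    images = []
    for box in INFOBOX.findall(wikitext):
        images.extend(name.strip() for name in IMAGE_PARAM.findall(box))
    return images


# Searching the original text finds the image...
print(infobox_images(SAMPLE))
# ...but searching after the infobox has been stripped finds nothing.
print(infobox_images(strip_infoboxes(SAMPLE)))
```

Under these assumptions, a fix would need the image search to run on the text before templates are removed, or the removal step to hand the stripped infobox over to the search.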

Transclude lead excerpt doesn't recognize imagemaps

See User:Queen of Hearts/D#Passports; {{t|transclude lead excerpt}} fails to show the lead of User:Queen of Hearts/Drafts/List of passports, which begins with a WP:Imagemap. Thanks, Queen of Hearts (talk) 05:25, 7 September 2024 (UTC)

Template-protected edit request on 27 August 2024: don't excerpt shortcuts

{{edit template-protected|Module:Excerpt/config|answered=yes}}

{{td|'[Aa]nchor',|'[Aa]nchor', '[Ss]hc?', '[Ss]hort', '[Ss]hortcut', '[Ss]horthand', '[Pp]olicy shortcut'}}

These additional templates are as problematic as {{t|anchor}}. Excerpting them misinforms because the shortcuts redirect to the original page. Also, {{t|shortcut}} has a built-in {{t|anchor}}. The use case is essays like {{diff||1242505810}}. 142.113.140.146 (talk) 04:54, 27 August 2024 (UTC)

:Unlike anchors, those elements are designed to be seen, looks like there is already an option to not show them as was done in the example above. — xaosflux Talk 12:44, 27 August 2024 (UTC)

::Yes, I had to set the manual option {{para|templates|-policy shortcut}}. But Module:Excerpt/config is about defaults, and I claim that there is no use case that wants to show these elements while excerpting. 142.113.140.146 (talk) 15:47, 27 August 2024 (UTC)

:Not done for now: please establish a consensus for this alteration before using the {{Tlx|Edit template-protected}} template. --Ahecht (TALK PAGE) 14:45, 27 September 2024 (UTC)
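
To see which template names the patterns proposed in this request would cover, here is a minimal Python sketch. It is not the code of Module:Excerpt/config, which uses Lua patterns; the patterns in the request use only character classes and `?`, which behave the same way in Python's `re` module. Whether the module matches the whole template name or only part of it is not stated in this thread, so the whole-name match below is an assumption, as is the function name `is_excluded`.

```python
import re

# Patterns from the edit request, with a comma between every entry.
PROPOSED = [
    "[Aa]nchor",
    "[Ss]hc?",
    "[Ss]hort",
    "[Ss]hortcut",
    "[Ss]horthand",
    "[Pp]olicy shortcut",
]


def is_excluded(template_name: str, patterns=PROPOSED) -> bool:
    """True if the whole template name matches one of the patterns."""
    name = template_name.strip()
    return any(re.fullmatch(pattern, name) for pattern in patterns)


for name in ["Shortcut", "sh", "shc", "policy shortcut",
             "Cite web", "Short description"]:
    print(f"{name!r}: excluded={is_excluded(name)}")
```

With a whole-name match, `Short description` is left alone. If the matching were done on a prefix instead, the `[Ss]hort` entry would also catch it, so the matching mode is worth checking before a change like this is made.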

displaytitle and sections

I believe parameter "displaytitle" should not be appended with the section. For example,

{{tlx|excerpt|Tectonics on icy moons|Plate tectonics|displaytitle=Plate tectonics on icy moons}} should render as "This section is an excerpt from Plate tectonics on icy moons."; currently, it's rendered as "This section is an excerpt from Plate tectonics on icy moons § Plate tectonics." fgnievinski (talk) 01:47, 19 October 2024 (UTC)
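
The difference between the current and the requested behaviour can be written down as two small functions. This Python sketch only models the hatnote text quoted above; it is not taken from Module:Excerpt, and the function names are hypothetical.

```python
from typing import Optional


def current_target(page: str, section: Optional[str] = None,
                   displaytitle: Optional[str] = None) -> str:
    """Behaviour as reported: the section is appended even to a displaytitle."""
    title = displaytitle or page
    return f"{title} § {section}" if section else title


def proposed_target(page: str, section: Optional[str] = None,
                    displaytitle: Optional[str] = None) -> str:
    """Requested behaviour: a displaytitle is used exactly as given."""
    if displaytitle:
        return displaytitle
    return f"{page} § {section}" if section else page


def hatnote(target: str) -> str:
    return f"This section is an excerpt from {target}."


args = ("Tectonics on icy moons", "Plate tectonics",
        "Plate tectonics on icy moons")
print(hatnote(current_target(*args)))
print(hatnote(proposed_target(*args)))
```

When no displaytitle is supplied, both functions give the same result, so the proposal would only affect calls that set the parameter.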

images in infobox building

It seems images in template:Infobox building are causing some difficulties; could somebody more knowledgeable take a look at this draft section, please? the problem appears at "File:1; |thumb|". thanks! fgnievinski (talk) 05:46, 22 October 2024 (UTC)

:That article was drastically misusing {{tl|Infobox building}}. I fixed it to use a normal multiple image template instead. – Jonesey95 (talk) 19:31, 22 October 2024 (UTC)

::thank you! fgnievinski (talk) 01:40, 23 October 2024 (UTC)

Excerpt misses {{tag|p}} in last paragraph

I have been trying to fix a bug in scripts that are not able to detect excerpts (namely, User:Caorongjin/wordcount and Wikipedia:Prosesize). However, there seems to be a bug in Module:Excerpt. Namely, I see the {{tag|div}} that has the classname of "excerpt"; but all the paragraphs with the exception of the last paragraph are wrapped in {{tag|p}}. See for example Enhanced oil recovery#Greenhouse gas emissions. —Caorongjin 💬 21:44, 29 October 2024 (UTC)

Italic title issue

The blacklist at Module:Excerpt/config currently does not filter out {{tl|Italic}}, which redirects to {{tl|Italic title}}. This led to Criminal law in the United States being italicized because it excerpted Res gestae which (until just now) transcluded {{tl|Italic}}. -- Visviva (talk) 19:28, 15 September 2024 (UTC)

:After a bit of fiddling I found it also does not filter out {{tl|Italics title}} -- Brit milah was causing Religion and circumcision to italicize and I could not for the life of me figure out why. wound theology 02:48, 14 November 2024 (UTC)

Strange missing end tag for bold

To fix a missing end tag for bold in Portal:Companies, which uses {{tlx|Transclude random excerpt}}, I edited Sony, changing

{{Nihongo foot|Sony Group Corporation|ソニーグループ株式会社|Sonī Gurūpu Kabushiki gaisha|{{IPAc-en|ˈ|s|oʊ|n|i}}|group=lower-alpha|extra2={{respell|SOH|nee}}|lead=yes}} (formerly {{Nihongo foot|Tokyo Tsushin Kogyo K.K.|東京通信工業株式会社|4=Tokyo Telecommunications Engineering Corporation|lead=yes|group=lower-alpha|Tōkyō Tsūshin Kōgyō Kabushiki gaisha}} and {{Nihongo foot|Sony Corporation|ソニー株式会社|lead=yes|group=lower-alpha|Sonī Kabushiki gaisha}})

to

{{Nihongo foot|Sony Group Corporation|ソニーグループ株式会社|Sonī Gurūpu Kabushiki gaisha|{{IPAc-en|ˈ|s|oʊ|n|i}}|group=lower-alpha|extra2={{respell|SOH|nee}}|lead=yes}} (formerly {{Nihongo foot|Tokyo Tsushin Kogyo K.K.|東京通信工業株式会社|4=Tokyo Telecommunications Engineering Corporation|lead=yes|group=lower-alpha|Tōkyō Tsūshin Kōgyō Kabushiki gaisha}} and {{Nihongo foot|Sony Corporation|ソニー株式会社|lead=yes|group=lower-alpha|Sonī Kabushiki gaisha}})

It did not work to change the problematic snip to {{Nihongo foot|Tokyo Tsushin Kogyo K.K....}}; it excerpted the bold from just before the wikilink but not the bold from just after the wikilink. It would be good to figure out why the original markup caused a problem and fix it. —Anomalocaris (talk) 00:39, 31 December 2024 (UTC)

:The issue was to do with there being an unexpected pipe wikilink in {{tl|Nihongo foot}}, which would cause the cleanup during stripTemplates to accidentally leave an unclosed wikilink and template, which would later get trimmed down and cause everything after it to vanish, leading to the above linter error as well as all the text disappearing. I've fixed this in Special:Diff/1266381727. Aidan9382 (talk) 09:31, 31 December 2024 (UTC)

::Aidan9382: Looks good. Thank you for fixing Module:Excerpt/portals, and for reverting my 3 edits of Sony. —Anomalocaris (talk) 18:57, 31 December 2024 (UTC)

Possible ToU attribution violation issue with copy/translation from articles with excerpts

I wanted to raise a possible issue involving violation of licensing requirements as described at WP:CWW and at WMF Terms of Use, which require crediting the original authors of a copied or translated passage in the target article's edit summary (see WP:CWW and WP:TFOLWP). Proper attribution is still possible when Excerpt (or selective transclusion) is involved, but is a bit tricky and may need to be addressed at the appropriate venue. (That venue is mostly not here, but I wanted to raise the issue here first, while I think about where and how to raise it more generally; probably at WP:CWW and perhaps elsewhere.)

The problem is when an editor at Wikipedia article 'A' copies or translates material from article 'B', where the 'B' content in question is not actually in 'B's wikicode, but excerpted/transcluded from article 'C'. In this case, the authors to be credited are not the B-authors, but the C-authors. (Or both, if what was excerpted spanned content hosted at 'B' as well as content excerpted from 'C' to 'B' into the excerpt area).

A real world example is at French colonial law ('A'), in this edit of 13:12, 16 January 2025, which credits the authors of :fr:Droit colonial français ('B'). Our article talks about {{xt|Colonial trade under French colonial law}} and a trade regime {{xt|known as the "Exclusif."}} This can be found in the French article in section § Régime de l'exclusif. If you edit that section ({{lang|fr|modifier le code}}) you will see that it is an Excerpt ({{lang|fr|Extrait}}) of French article :fr:Principe de l'exclusif ('C'). It is this last article that should have been credited, but was not.

I don't think this situation is an existential threat to the {{tl|Excerpt}} template, but people here should be aware of it. I suspect that after some discussion among transclusion folks and attribution/copyright folks, the end result will be additional documentation in a few places (including here) about how to properly attribute copied or translated content that involves transclusion of third-party content. (Fwiw: I ended up at that diff investigating a completely different issue; it turns out that both of the refs in that diff are LLM hallucinations citing non-existent sources, which is curious, but irrelevant for the attribution issue under discussion.) Thanks, Mathglot (talk) 01:14, 18 January 2025 (UTC)

:How is this different from all other types of transcluded pages? — xaosflux Talk 02:18, 18 January 2025 (UTC)

:: Not sure I understand the question, starting with the meaning of "transcluded pages". If you meant:

::* How is copying from an article section that has an {{tl|excerpt}} in it different from copying from an article section that has WP:SELTRANS or some other transclusion method in it?

:::: ➤ They are no different.

::* What's different about this case, compared to any typical use of Excerpt (or SELTRANS, or any transclusion method)?

:::: ➤ Typical usage of excerpt does not require attribution because nothing was copied. For example, the following excerpt from Ralph Waldo Emerson does not require attribution on this page, because nothing from the Emerson page was copied here, and therefore there is no need for attribution, per the ToU:

{{cot|bg=cornsilk|indent=6.4em|excerpt from Ralph Waldo Emerson}}

{{excerpt|Ralph Waldo Emerson|Early life, family, and education|paragraphs=1|hat=no|references=no}}

{{cob}}

::* How is copying from an article section that has an excerpt (or other transclusion method) in it, different from copying from an article section that does not include an excerpt?

:::: ➤ They differ in what needs to be attributed. The second case is the typical one, a simple copy (or translation) of material from one Wikipedia article to another; attribution must be given per the ToU, and this case is the one that is the entire subject of WP:Copying within Wikipedia. The first case is different. When article 'A' copies a section from article 'B', and the 'B' section contains an excerpt from 'C', then it depends on how they did it: from the wikicode, or from the rendered page. If they copied from the wikicode of 'B', then they have to credit the authors of 'B', and that's all; the {{tl|Excerpt}} template targeting 'C' will get copied to 'A', but not the content, so no credit to 'C' authors is required. (Footnote: in some cases, no credit at all is needed, if the entire copy from 'B' consists of an {{tl|Excerpt}} statement, because if no creative content from 'B' is copied over, then nothing needs accreditation.) If they select/copy from the rendered page (essentially, from 'Print view') of 'B', then they have to attribute it to both the authors of 'B' (if any creative content is included) and definitely to the authors of 'C'.

:: If you meant something entirely different, please enlighten. Mathglot (talk) 05:54, 18 January 2025 (UTC)
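
The cases laid out in the reply above can be summarised as a small decision function. This Python sketch is only a restatement of that reasoning, not policy text; the function name, the parameter names, and the labels "wikicode" and "rendered" are made up for the example.

```python
def pages_to_credit(copied_from: str, has_b_content: bool,
                    has_c_content: bool) -> list[str]:
    """Return which articles' authors need credit when 'A' copies from 'B'.

    copied_from:   "wikicode" (the source text of B) or "rendered" (the page
                   as displayed, with the excerpt from C expanded).
    has_b_content: the copied span includes creative content written in B.
    has_c_content: the copied span includes the part excerpted from C.
    """
    if copied_from not in ("wikicode", "rendered"):
        raise ValueError("copied_from must be 'wikicode' or 'rendered'")

    credit = []
    if has_b_content:
        credit.append("B")
    # From wikicode, only the Excerpt call itself is copied, not C's text,
    # so C needs credit only when the rendered text was copied.
    if copied_from == "rendered" and has_c_content:
        credit.append("C")
    return credit


print(pages_to_credit("wikicode", True, True))
print(pages_to_credit("wikicode", False, True))
print(pages_to_credit("rendered", True, True))
print(pages_to_credit("rendered", False, True))
```

The French colonial law example corresponds to the last call: the translated text came from the rendered page of 'B' and consisted of content hosted at 'C'.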

An example of possible attribution language for a case of this type, including specifying the roles of 'B' and 'C' can be seen in this edit at French colonial law, and states:

: {{xt|Content in the previous edit of 13:12, 16 January 2025 by User:Example was translated from the French article :fr:Principe de l'Exclusif (as excerpted by :fr:Droit colonial français); see that article's history for attribution}}

I have concerns that this is too tricky or convoluted to expect the average volunteer to be willing or able to deal with, and yet, that doesn't absolve us of the ToU requirement to do so, so I really don't know what the best path forward is here. Maybe something could be added to the documentation of the template explaining it, as well as a mention at WP:CWW, perhaps. Mathglot (talk) 23:09, 19 January 2025 (UTC)