template talk:Unichar#Option to only show HTML mnemonic
{{User:MiszaBot/config
|archiveheader = {{talkarchivenav}}
|maxarchivesize = 100K
|counter = 1
|minthreadsleft = 3
|minthreadstoarchive = 1
|algo = old(730d)
|archive = Template talk:Unichar/Archive %(counter)d
}}
{{WikiProject banner shell|
{{WikiProject Writing systems}}
}}
{{Archives |age=730}}
Proposal: use [[Template:Char]]
Would it be good to place the character itself in {{tl|char}}? jlwoodwa (talk) 06:43, 9 July 2023 (UTC)
:Although generally keen on char, I'd need to be convinced in this case. Char is used to "isolate" a glyph under discussion from the associated running text. In the output of unichar, that is usually clear.
:The only argument in favour that I can see is that, at present, unichar identifies the glyph by increasing its size and maybe the faint box used by char would be better? But conversely magnification makes it easier to "read".
:Did you have a particular case that provoked the proposal? πππ½ (talk) 07:51, 9 July 2023 (UTC)
::It's clear to anyone who's familiar with the format, but I'm not sure it's as clear to a general reader, especially one who doesn't know what the "U+ stuff" means. I haven't noticed any specific problems that this would solve, I just think it's good to have a consistent format for "inline character literals" on Wikipedia. jlwoodwa (talk) 08:19, 9 July 2023 (UTC)
:::So how would we handle this example: {{unichar|20E0|Combining Enclosing Circle Backslash}} (which is already not handled terribly well). Likewise, Asiatic scripts present issues that don't occur to those of us only familiar with alphabetic scripts. A lot of development work has gone into this template to deal with these issues so changing it would not be trivial, given the need to verify many many test cases and rewrite to resolve anomalies. Annoyingly, one of the recent main developers, user:DePiep, is no longer available to advise. --πππ½ (talk) 10:20, 9 July 2023 (UTC)
::::{{char|⃠}} seems to work just fine. I understand the difficulty of modifying such a convoluted and widely-used template, though. Since it sounds like it's not {{em|obviously}} a bad idea, I'll try the "obvious implementation" in the sandbox, and give an update here when it's working. jlwoodwa (talk) 10:35, 9 July 2023 (UTC)
:::::on Chrome, the symbol overruns the box (or the box underruns)... πππ½ (talk) 13:43, 9 July 2023 (UTC)
::::::... but then again it overruns the last digit of the codepoint right now. --πππ½ (talk) 13:45, 9 July 2023 (UTC)
Combining diacritics are displaying as tofu on Android - fault may be in cwith= handling?
I don't know if this is new? The argument {{code|1= cwith=&#x25CC;}} or {{code|1= cwith=◌}} is used heavily to display combining diacritics. I'm editing in Android right now and the symbol displays correctly. But in articles like diacritic, it has more tofu than a Japanese restaurant. Is there a {{code|style serif}} somewhere that is blocking the last resort substitution? --𝕁𝕄𝔽 (talk) 13:22, 21 September 2023 (UTC)
:No, it is not unique to Unichar, that just happens to be where it first saw it. Diacritic doesn't even use unichar, it just uses a dotted circle and combining diacritic directly, thus {{angbr|βΜ}}. As it is a general problem, I will take it to Wikipedia:Village pump (technical). --πππ½ (talk) 13:37, 21 September 2023 (UTC)
::No solution suggested, it is an implementation defect in Android. So unless someone has a back-channel to Google, we just have to grin and bear it. --πππ½ (talk) 16:33, 22 September 2023 (UTC)
:::Further discussion has revealed that the problem is due to deficiency in the system default sans-serif font. The workaround is to use serif and I have started to do that with success on "freestanding" cases. But {{tl|unichar}} is heavily used so we really need a fix to it, please? --πππ½ (talk) 16:32, 23 September 2023 (UTC)
=Template enhancement needed, please=
Requirement: when {{code|1= cwith=β}} is invoked, wrap the output in
:Will this work
::Yes, that would work. I hate to be ungrateful but to employ that solution would create a lot of work: many, many articles would need to be updated to use it{{snd}} and, when Google discards Roboto as default sans font, would all have to be undone again. AFAIK, this is the only use-case for {{code|1=cwith=◌}} so it would not have any deleterious effect elsewhere (and would be easy to back out). [BTW, we couldn't have {{code|1=use2=noto}} because it would break Bing and Safari.] --𝕁𝕄𝔽 (talk) 22:11, 23 September 2023 (UTC)
:::Ok, I made the change to Template:Unichar/glyph, but let me know if it doesn't look right and I'll revert it. Andreπ 23:51, 23 September 2023 (UTC)
Misaligned diacritics
Can anyone explain (better still fix) this phenomenon:
- {{unichar|0360|Combining double tilde|cwith=◌◌}}, a tilde diacritic that spans a pair of adjacent characters: {{char|◌͠◌}} no markup: ◌͠◌
Just using the characters directly puts the diacritic in the right place but unichar fails (placement is offset). (At least when using Chrome on Chromebook).
- {{unichar|0301|Combining acute accent |cwith=◌}} is ok. ◌́
πππ½ (talk) 16:42, 22 September 2023 (UTC)
:{{code|1=cwith=◌◌}} puts the dotted circles before the diacritic, but the diacritic is supposed to be between them. I don't know how it should be fixed though. Eru·tuon 19:21, 22 September 2023 (UTC)
::Ah, of course. Obvious really.
:::I have added this text. It is not quite right, the display of the U+0360 is not exactly as produced by the template but does it matter?
:::{{tpq|1=** Note that {{code|1=cwith=◌◌}} does not provide the desired result if the intention is to display a diacritic that spans two characters (such as those in the range U+035C to U+0362): the diacritic will be offset. In such cases, editors must emulate the template output by hand, because the correct HTML sequence is "first-character + combining-diacritic + second-character". Thus, for example, to show the combining double tilde U+0360, write {{code|1= U+0360 &#x25CC;&#x0360;&#x25CC;}} then (in {{tl|small}}), COMBINING DOUBLE TILDE. This produces U+0360 ◌͠◌ {{small|COMBINING DOUBLE TILDE}}. }}
:::Comments (better still, direct edits to improve) welcome. --πππ½ (talk) 20:24, 22 September 2023 (UTC)
::::Really this needs a "print this instead" for the character. All this size/font/cwith stuff could be put into that instead of trying to fool the automatic text generator into producing the desired result. Spitzak (talk) 21:50, 23 September 2023 (UTC)
:::::Sorry, I don't follow. Rather than spend time explaining, would you write the alternative text please? Here or in the doc. --πππ½ (talk) 22:14, 23 September 2023 (UTC)
::::::I meant that there could be a parameter, perhaps {{para|show}}, so that if invoked with {{para|show|foobar}} then instead of showing the character it shows "foobar". This could then contain any wiki or html markup desired and any trick needed to get the character to be correctly visible. In this example it would contain the two circles and the combining diacritic. Spitzak (talk) 00:08, 28 December 2023 (UTC)
:::::::I think it does have a param that does something similar, or it did 3 months ago. Andre🚐 00:15, 28 December 2023 (UTC)
::::Hmm, a double parameter could be introduced to change the order of the output. Andreπ 19:50, 24 September 2023 (UTC)
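The ordering rule from this thread (for a double diacritic the mark sits between the two dotted circles, not after them) can be sketched outside the template. This Python is an illustration only, not the template's code: `show_combining` is a hypothetical helper, and the range U+035C to U+0362 is taken from the documentation note quoted in the thread.

```python
DOTTED_CIRCLE = "\u25CC"  # the base character used to carry a combining mark

# Range of double (two-character-spanning) diacritics, as given in the thread.
DOUBLE_DIACRITICS = range(0x035C, 0x0362 + 1)


def show_combining(code_point_hex: str) -> str:
    """Return a display string for a combining mark (hypothetical helper).

    Ordinary marks follow one dotted circle; double diacritics are placed
    between two dotted circles: first base + mark + second base.
    """
    mark = chr(int(code_point_hex, 16))
    if ord(mark) in DOUBLE_DIACRITICS:
        return DOTTED_CIRCLE + mark + DOTTED_CIRCLE
    return DOTTED_CIRCLE + mark
```

With this ordering, U+0360 comes out as circle, tilde, circle, which is the sequence that displays correctly when typed by hand.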
Question on Error on off-Wiki
I've copied all the relating templates and modules to our wiki, and I've checked them a few times over, but it keeps giving me the following error:
:I wrote:
:β> "The character
:It should write:
:β> "The character {{unichar|a9|COPYRIGHT SIGN}} is about intellectual property."
:but gives me:
:β> "The character {{red|1=Error using
I don't understand why it does this. Not sure if I should ask this here or somewhere else, but thought to try it here first. Kind regards, Rodejong π¬ βοΈ 23:15, 18 December 2023 (UTC)
:That is a charset encoding issue probably. Or something to do with your wiki's installation of php. {{unichar|a9|COPYRIGHT SIGN}} works fine here, as you can see. Andreπ 00:16, 28 December 2023 (UTC)
::Thanks for answering. I'll ask the hosting guys to look in to that then. Kind regards, Rodejong π¬ βοΈ 00:53, 28 December 2023 (UTC)
Enhancement request: sanity check or lazy invocation
At Copyright sign, a vandal changed
Better still, don't ask for any text, indeed ignore any provided. A simple
Is there a template doctor in the house? πππ½ (talk) 19:53, 2 April 2024 (UTC)
:It seems this has fallen through the cracks. I'm going to see if I can wrangle a modification to this template that will simply allow one to print the canonical Unicode name for a given code point. I would prefer it being the default or {{em|only}} behavior, but I am curious if this would be a problem for anyone. Remsense诉 12:58, 5 April 2024 (UTC)
::To my mind, anything but the canonical name is at best finger trouble. The family {{code|1= nlink=}} is there when the WP:common name and the canonical name don't match. As in {{unichar|005E|circumflex accent|nlink=carat}} (
:::The issue being, it seems we need a data module of 150k entries that has to be searched every time (if we want to prevent vandalism, anyway), and that's about three orders of magnitude more entries than I've seen a module on here work with, so I am worried by the potential server load. Remsense诉 18:16, 5 April 2024 (UTC)
::::Maybe WP:village pump/technical could advise? But it is not really a search when you already have the index and just want to fetch the record that matches that index. πππ½ (talk) 18:24, 5 April 2024 (UTC)
:::::Doy, you're completely right on the latter point. Had the current flowing the wrong way in my brain there. I'll poke the pump. Remsenseθ― 18:27, 5 April 2024 (UTC)
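The point about "fetching the record that matches the index" can be illustrated with a keyed lookup. This is a sketch in Python using the standard `unicodedata` module, not the on-wiki module: `canonical_name` is a hypothetical helper, and the names returned depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def canonical_name(code_point_hex: str, supplied_name: str = "") -> str:
    """Return the canonical Unicode name for a hex code point.

    `supplied_name` stands in for the free-text second argument; it is
    deliberately ignored, so misspellings or vandalism there have no effect.
    """
    character = chr(int(code_point_hex, 16))
    # A direct lookup keyed on the code point, not a search through all names.
    return unicodedata.name(character, "<no name>")
```

Because the code point is the key, the cost does not grow with the number of named characters in the way a text search would.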
:Well, that was easy!!!!!!!!!!!!!! {{tlx|Unichar/sandbox}} seems to work perfectly well. Thank you so much @Cryptic for lending some lost, cold, and confused lexicographers a helping {{unichar/sandbox|2F3F}} Remsenseθ― 21:03, 5 April 2024 (UTC)
The sooner we can put this live, the better. There's a lot of it about! (Kudos to {{u|Nickps}} for spotting [https://en.wikipedia.org/w/index.php?title=Hyphen-minus&curid=2734201&diff=1217645792&oldid=1217645552 this one] in such a high-profile article but such basic stuff shouldn't depend on eagle eyes to keep clean.) --𝕁𝕄𝔽 (talk) 10:33, 7 April 2024 (UTC)
:I am not sure of a particular reason why it can't, I just didn't want to be rash about doing so. It's not like it was a particularly technical change, if you'd like to do the honors? Remsenseθ― 10:38, 7 April 2024 (UTC)
::I'm happy to be the one to do it but you'll have to tell me how. πππ½ (talk) 12:42, 7 April 2024 (UTC)
:::Oh! Apologies for assuming everyone else is the one I should be asking how to do things. I've done it. Remsenseθ― 12:54, 7 April 2024 (UTC)
:::The template should certainly ignore the text given but maybe we should start with a green warning to say that the template has done so. One like the error message you get if you accidentally type {{code|1= firdt=John}} in a CS1/2 citation. We could do it silently and let those who have been taking advantage of the failure to check come and read the (to be revised) documentation which will tell them that the free text field is no more. 𝕁𝕄𝔽 (talk) 12:55, 7 April 2024 (UTC)
::::Yes I can do that also, great idea. Remsenseθ― 12:57, 7 April 2024 (UTC)
:::::Revising the doc, I noticed that calling the template with no text generated just omitted it. I can't see why anyone would want to do that but we had best add a {{code |1=name=none}} option? πππ½ (talk) 13:10, 7 April 2024 (UTC)
::::::I think it's nice to have just because I often am too lazy to tab to a template's documentation so I try all the things ({{code|{{=}}none}}? could it be {{code|{{=}}false}}? how about {{code|{{=}}no}}? Surely it will no longer confound me if I try {{code|{{=}}""}}: there we go!) Remsense诉 13:13, 7 April 2024 (UTC)
:::::::Well we could just cheat and regard any input to {{code |1= name=}} as an instruction to omit. Who is ever going to use it to mean yes? --𝕁𝕄𝔽 (talk) 13:38, 7 April 2024 (UTC)
::::::::This is usually the pragmatist's move with a binary parameter. I swear there's a thing that lets you check all the ways a user wants to say no or yes to something. Remsenseθ― 14:09, 7 April 2024 (UTC)
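The "check all the ways a user wants to say no or yes" idea amounts to normalising the input and testing it against two small sets. The sketch below is Python for illustration; `parse_flag` and the exact token lists are assumptions made here, not the behaviour of any existing template.

```python
# Token lists are illustrative assumptions, not an agreed specification.
NEGATIVE_TOKENS = {"no", "n", "false", "f", "off", "0", "none", ""}
POSITIVE_TOKENS = {"yes", "y", "true", "t", "on", "1"}


def parse_flag(value, default=None):
    """Map a free-text parameter value to True, False, or a default.

    None means the parameter was not supplied at all; unrecognised
    text also falls back to the default rather than guessing.
    """
    if value is None:
        return default
    token = value.strip().lower()
    if token in NEGATIVE_TOKENS:
        return False
    if token in POSITIVE_TOKENS:
        return True
    return default
```

Under this scheme `name=none`, `name=no` and an empty `name=` would all suppress the name, which matches the "try all the things" habit described above.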
:I probably don't deserve praise for that one considering I'm the one who made the mistake in the first place [https://en.wikipedia.org/w/index.php?title=Hyphen-minus&diff=prev&oldid=1217645552&diffonly=1] but thanks, I guess. Nickps (talk) 11:06, 7 April 2024 (UTC)
::Of course you do! It's never too late to make things right. Remsenseθ― 11:08, 7 April 2024 (UTC)
=Override option needed=
See
{{blockquote|In Unicode, the majuscule Ƣ is encoded in the Latin Extended-B block at U+01A2 and the minuscule ƣ is encoded at U+01A3.{{cite web|url=https://www.unicode.org/charts/PDF/U0180.pdf|title=Unicode chart}} The assigned names, "LATIN CAPITAL LETTER OI" and "LATIN SMALL LETTER OI" respectively, are acknowledged by the Unicode Consortium to be mistakes, as gha is unrelated to the letters O and I.{{cite web|url=http://unicode.org/notes/tn27/|title=Unicode Technical Note #27: Known Anomalies in Unicode Character Names}} The Unicode Consortium therefore has provided the character name aliases "LATIN CAPITAL LETTER GHA" and "LATIN SMALL LETTER GHA".}}
Right now, we have
- {{unichar|01A2}}
We need a {{code |1=alias= }} as in {{code |1=alias=LATIN CAPITAL LETTER GHA}} , as suggested by {{u|Chatul}} at the Village Pump. There are a very few such cases where an error was made in the original standard that will never be changed. --πππ½ (talk) 13:49, 7 April 2024 (UTC)
:Will start this right now alongside the other thing. Remsenseθ― 14:10, 7 April 2024 (UTC)
::I think it would be ok for arg 1 to continue to work. Instead find all the invocations of this template and remove arg 1 unless it is actually necessary.Spitzak (talk) 19:07, 8 April 2024 (UTC)
:::In principle, you are absolutely right{{snd}} but in practice that would be a huge task, wildly out of proportion to the tiny number of cases where the Unicode Consortium admits it made an error. This is the most practicable solution to this specific problem. Meanwhile, ignoring the supplied 2= in favour of the canonical text resolves immediately the rather more cases of spelling errors and vandalism. --πππ½ (talk) 20:25, 8 April 2024 (UTC)
=Temporary reversion needed=
{{Ping|Remsense}} we forgot the many instances of uses like this:
:Revert done: I'm working on the aliases as we speak also Remsenseθ― 21:26, 8 April 2024 (UTC)
::Which is now working:
::{{tlx|Unichar/sandbox|1A2}} β {{Unichar/sandbox|1A2}}
::{{tlx|Unichar/sandbox|1A2|alias{{=}}yesgivemethealias}} β {{Unichar/sandbox|1A2|alias=yesgivemethealias}}
::What should we do about this? It does say such use of {{para|nlink}} is deprecated. Should we clean it all up somehow? Remsenseθ― 21:38, 8 April 2024 (UTC)
:::I have seen a lot of {{code |1=nlink=
:::first: a list of articles that use nlink= with no data, so that someone (aka me, since I know many of them are my fault) can go round and correct them. [I believe that the template already has such an exceptions report, though whether anyone has been checking since {{u|DePiep}} got canned must be doubtful.] Then we can reinstate the change.
:::second, add some code to say (for all the optional parameters), {{red|1=No data supplied with =, ignored}}
:::PS sorry to have dropped the bombshell and not been around until now to help with the cleanup; officially I was otherwise engaged and shouldn't have been in a position to spot the error.
::::My "first" wouldn't be needed if the current interception of {{code|1=nlink=
:::::Don't apologize at all! Nothing about this is particularly burdensome. I am leaning towards linking to the character itself, are there cases where this is going to break? Remsenseθ― 00:03, 9 April 2024 (UTC)
::::So, do you think directly linking to the character itself is the best move? That's where I am presently unless there are edge cases (e.g. I can think of high-range code points and non-printable ones, and maybe we can define those manually). Remsenseθ― 02:26, 9 April 2024 (UTC)
:::::yes, see below. πππ½ (talk) 08:30, 9 April 2024 (UTC)
:The {{para|nlink}} default is now also working:
:{{tlx|Unichar/sandbox|1A2|alias{{=}}yes|nlink{{=}}}} β {{Unichar/sandbox|1A2|alias=yes|nlink=}} Remsenseθ― 13:47, 9 April 2024 (UTC)
Do we even need nlink=
:::Say: we have a lot of technical redirects, why can't we just add U+XXXX as redirect format to a given page? Remsenseθ― 21:43, 8 April 2024 (UTC)
::::As in, U+2120 now redirects to Service mark symbol, as already did β . This seems like a pre-solved problem. Remsenseθ― 21:49, 8 April 2024 (UTC)
:::::It looks to be a neat solution. The only catch that I can see is that these U+XXXX aren't well watched and may be subject to vandalism. It is not an obvious vector for a "bad actor" so I guess it is a reasonable risk. The problem is that the attack won't be obvious and someone following a link to a Gardiner's sign list entity will have no idea how it happened. --πππ½ (talk) 23:01, 8 April 2024 (UTC)
::::::Are there any cases of nlink=target-name#section-name? I can't think why there would be, but if it is possible (as it is), someone somewhere will have done it.
:::::::I would say if necessary, the redirect page itself can link to a given section, if I'm understanding properly? Remsenseθ― 00:04, 9 April 2024 (UTC)
::::::::Yes, that makes sense. I can't see any other reasonable possibility. πππ½ (talk) 07:43, 9 April 2024 (UTC)
:::::::::Though there are cases where the nlink goes to a broad concept article (such as Gardiner's sign list) when there is no specific article. So {{code|1=nlink=
:::::::::So to solve the current problem, we just need to change the behaviour of {{code |1=nlink=
Testcases
:::::As a template editor, I find it helpful, when people point out exceptions and cases like this, to put them in the testcases page so that future editors do not have to remember them. β Jonesey95 (talk) 21:52, 8 April 2024 (UTC)
::::::Which testcases? I'm planning on ensuring there's an adequate library of them there once I'm done with this round of updates. Remsenseθ― 21:54, 8 April 2024 (UTC)
:Per above...is there actually a purpose to being able to set a custom link rather than create easter eggs? I say we just have it link in most cases to Ƣ, i.e. the page for the character itself. Remsense诉 21:57, 8 April 2024 (UTC)
=Almost there=
Great to see it working again, thank you. Just one left on the to-do list, I think?
- {{code|1= name=none}} so that {{tlx|unichar|0123|name{{=}}none}} produces just plain {{tq|U+0123 ģ}}
I need to document {{code|1= alias=yes}}: I will copy Unicode#Alias. --πππ½ (talk) 14:48, 9 April 2024 (UTC)
:And there you are: {{tlx|Unichar|1A2|alias{{=}}yes|name{{=}}none}} β {{Unichar|1A2|alias=yes|name=none}} Remsenseθ― 15:15, 9 April 2024 (UTC)
:It looks a lot like the use of the alias can be automatic, by just checking the alias database and using it instead of the real one if there is an entry. Is there a reason you did not do this? Spitzak (talk) 09:44, 10 April 2024 (UTC)
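The automatic variant suggested here would consult the alias table first and fall back to the assigned name. A minimal Python sketch, assuming a hand-written table holding only the two gha aliases quoted earlier in this section; `CORRECTION_ALIASES` and `display_name` are hypothetical names, and a real table would come from the Unicode name aliases data.

```python
import unicodedata

# Only the two aliases quoted in this section; a full table is assumed to exist elsewhere.
CORRECTION_ALIASES = {
    0x01A2: "LATIN CAPITAL LETTER GHA",
    0x01A3: "LATIN SMALL LETTER GHA",
}


def display_name(code_point_hex: str, use_alias: bool = True) -> str:
    """Return the alias when one is recorded, otherwise the assigned name."""
    code_point = int(code_point_hex, 16)
    if use_alias and code_point in CORRECTION_ALIASES:
        return CORRECTION_ALIASES[code_point]
    return unicodedata.name(chr(code_point))
```

Keeping `use_alias` as a switch mirrors the current opt-in `alias=yes`; defaulting it to true is the automatic behaviour being asked about.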
=Anomalies=
Problems as I discover them
- {{unichar|002E|Full stop|nlink=}} (
{{unichar|002E|Full stop|nlink=}} ) misbehaving. OTOH,{{unichar|002E|nlink=Full stop}} behaves as it should. --πππ½ (talk) 19:42, 9 April 2024 (UTC)
:Knew I should've just looked at the page that definitely exists where they tell me what characters can't be used as article titles. Remsenseθ― 19:44, 9 April 2024 (UTC)
::Some you win, some you lose. I just came back to say it must be something to do with that character because these work:
::
=Refs=
{{reflist talk}}
Cwith= and non-latin script
The Nepalese rupee sign, {{char|रू}}, uses the combining diacritic technique of
- {{unichar|0930}} + {{unichar|0942}}.
Unfortunately,
- {{unichar|0930|cwith=ू|note=A dog's breakfast}}.
Can anyone fix? πππ½ (talk) 16:18, 21 April 2024 (UTC)
:I see that it is also a problem with latin script. In the example of "q with circumflex" below, the template fails to align the circumflex correctly over the q. --πππ½ (talk) 18:52, 21 April 2024 (UTC)
::The cwith character is printed first. Also you should not try to use this to show a character that is not a single code point. Spitzak (talk) 08:03, 22 April 2024 (UTC)
:::Ah yes, of course. The general solution is your response to the next question. πππ½ (talk) 08:24, 22 April 2024 (UTC)
cwith handling generally
Suppose that somewhere there exists a letter q with circumflex, q̂. Before we enhanced the template to assert the canonical name (and only the canonical name), it was possible to write
So I would like to propose that, when {{code|1= cwith=
- Thus, for example,
{{unichar|0071|cwith=̂}} should produce {{green|{{unichar|0071}} with {{unichar|0302}} : q̂}}
Comments? πππ½ (talk) 18:50, 21 April 2024 (UTC)
:Cwith should be limited to only the dotted circle.
:I do think there should be a simple "print this instead" argument to replace all the size, font, IMG, and cwith stuff. Spitzak (talk) 08:07, 22 April 2024 (UTC)
::Yes, I agree that the dotted circle should be the only valid option. Perhaps way back in the early developments, it also supported a coloured block to show the various forms of space character? These are now hardcoded but I guess there are too many combining diacritics to do the same here too.
::I will revise the documentation accordingly.
::As for all the other bells and whistles, it would take a full search of existing usage to determine where and why they are used. That is not a trivial task. πππ½ (talk) 08:34, 22 April 2024 (UTC)
:::I have revised the documentation to formally restrict the base character to ◌ and to deprecate any other usage. Please review.
:::When someone has time to revise the template, can this restriction be enforced, please? --πππ½ (talk) 10:27, 22 April 2024 (UTC)
:
produces {{unichar|0302|cwith=q}}. Spitzak (talk) 10:28, 22 April 2024 (UTC)
::True, but should it? As per your earlier comment (with which I agree), the template should only produce real code points. --πππ½ (talk) 16:27, 23 April 2024 (UTC)
=More detailed request for development =
The only legitimate character to use to display a combining diacritic is the dotted circle. So I propose that
- {{code |1=cwith=}} is redefined to mean "circle with".
- The preferred syntax is {{code |1=cwith=yes}}
- {{code |1=cwith=◌}} and {{code |1=cwith=&#x25CC;}} are accepted alternatives.
- Any other argument is flagged as an error.
Is that reasonable? --πππ½ (talk) 16:27, 23 April 2024 (UTC)
:Is it possible to determine it is combining from the unicode info database? If so maybe just ignore the field entirely and use that. Spitzak (talk) 07:15, 25 April 2024 (UTC)
::Do we know how/whether that would work with non-Western scripts? Interestingly (at least on ChromeOS), this Devanagari combiner comes with dotted circle out of the box: {{nobr|{{unichar|0942}}}}. I don't know how typical that is. --𝕁𝕄𝔽 (talk) 17:10, 26 April 2024 (UTC)
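On whether "combining" can be read from the Unicode data: the General Category property marks combining characters with the values Mn, Mc and Me, for Devanagari signs as well as Latin diacritics. The Python below is a sketch of that test using the standard `unicodedata` module; `is_combining` and `render_glyph` are hypothetical helpers, and the sketch does not address fonts that already draw their own dotted circle.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"
MARK_CATEGORIES = {"Mn", "Mc", "Me"}  # nonspacing, spacing combining, enclosing


def is_combining(character: str) -> bool:
    """True when the character's General Category is one of the Mark classes."""
    return unicodedata.category(character) in MARK_CATEGORIES


def render_glyph(code_point_hex: str) -> str:
    """Prefix a dotted circle only when the code point is a combining mark."""
    character = chr(int(code_point_hex, 16))
    if is_combining(character):
        return DOTTED_CIRCLE + character
    return character
```

If the rendering side already supplies a dotted circle for a lone mark, as reported above for U+0942, adding another would presumably need to be suppressed by some other means.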
Fixing nlink= for [[WP:FORBIDDEN]] characters
The docs say that {{para|nlink}} with no argument is deprecated but in my opinion it is a useful feature that we should try to support. The problematic characters are easy to fix simply by linking to the names instead of the characters. I have already written how this can be done in the sandbox ({{compare pages|Template:Unichar/name|Template:Unichar/name/sandbox|the diff}}). The only problem with the way it's currently done is that I have to special-case the underscore because low line is a disambiguation page. I don't like hardcoding things like that, but I don't think anyone plans to move underscore any time soon so it should be fine. Nickps (talk) 14:26, 14 June 2024 (UTC)
:It was only deprecated because it is a bit of a bear trap. Not every Unicode canonical name has a matching article, I think? And just because an article of that name exists, does it necessarily relate to the character?
:{{ping|Remsense}}, can you remember what the complications were? πππ½ (talk) 19:00, 14 June 2024 (UTC)
::{{para|nlink}} does not link to the canonical name by default. It links to the character itself. See My proposal is that the name should be linked if and only if the character is not allowed in a title. Nickps (talk) 20:05, 14 June 2024 (UTC)
:::To actually explain what my change is, if the character is any of # < > [ ] { } | : _ which are the characters not allowed in titles, then I link to the name (except low line which is disambiguated to underscore), otherwise, nothing changes. Nickps (talk) 20:29, 14 June 2024 (UTC)
::::Rereading the discussions about the last big change, it does seem to be the case that it was just these forbidden characters that caused the barf (specific example was full stop). Your proposed revision resolves that problem and seems lightweight enough not to cause any problems.
::::As this is such a high profile template, best we give it a week for any other editor to raise any red flag issues. --πππ½ (talk) 07:56, 15 June 2024 (UTC)
:::::Ok, that makes sense, you never know how these things can break. I also need to write testcases anyway, so there's no rush to merge. Nickps (talk) 09:01, 15 June 2024 (UTC)
::::::This makes perfect sense. I cannot figure out why a huge change to make it not use a user-defined name was somehow accompanied by a change that forced a user defined name for the link. I would implement this ASAP as somebody is busy adding text to the nlink in every instance, which is backwards. Spitzak (talk) 14:19, 15 June 2024 (UTC)
:::::::@Spitzak I'd suggest you ask them to stop their edits and comment here. I want to undeprecate the empty nlink parameter but apparently this editor disagrees and should be given a chance to explain their reasons. Nickps (talk) 14:50, 15 June 2024 (UTC)
:::::::Did you mean me? After the big change (when we discovered the anomaly that Nickps is now fixing), I certainly went round clearing nlink=nothing because of not knowing the full extent of the problem. That was a month ago. Has someone else resumed? πππ½ (talk) 18:06, 15 June 2024 (UTC)
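The rule proposed in this section (link to the character itself, except where the character cannot be a page title, in which case link to its name, with underscore special-cased) can be sketched as follows. This is Python for illustration and not the sandbox code: the forbidden set is copied from the comment above, while `default_link_target` and `NAME_OVERRIDES` are hypothetical names.

```python
import unicodedata

# Characters listed in the discussion as not allowed in titles.
FORBIDDEN_IN_TITLES = set("#<>[]{}|:_")

# LOW LINE is a disambiguation page, so the underscore is hardcoded.
NAME_OVERRIDES = {"_": "Underscore"}


def default_link_target(code_point_hex: str) -> str:
    """Return the page title an empty nlink= would link to."""
    character = chr(int(code_point_hex, 16))
    if character in FORBIDDEN_IN_TITLES:
        # Fall back to the canonical name, in sentence case like an article title.
        fallback = unicodedata.name(character).capitalize()
        return NAME_OVERRIDES.get(character, fallback)
    return character
```

For every other character the result is unchanged from the current behaviour, which is why the change is described as lightweight.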
What's with [[TM:Unichar/hexformat/sandbox/doc]]?
Now, to be clear, that page used to be at TM:Unichar/sandbox/doc but since it was only used by {{Tl|Unichar/hexformat/sandbox}}, I moved it to its current title. Still, I can't understand the purpose of that page. To me it looks more like a bunch of notes for personal use rather than a documentation page. Does anyone have any idea what it's supposed to say or should it just go to TfD? Nickps (talk) 01:07, 25 June 2024 (UTC)
:{{rto|Nickps}} It looks like a bunch of test cases created by {{u|DePiep}} for regression testing. Since no-one has spoken up in its defence by now, off with its head. 𝕁𝕄𝔽 (talk) 16:22, 16 August 2024 (UTC)
Make |cwith=| a valid option, to save us having to dig out a dotted circle every time?
Since, as documented, the only valid parameter for {{code|1= cwith=}} is the dotted circle, can anyone see a reason to demand the parameter in the first place? Surely we can just have |cwith=| (a null parameter) as a valid option, with the dotted circle being supplied automatically. 𝕁𝕄𝔽 (talk) 16:26, 16 August 2024 (UTC)
:I think it should also be possible to automatically add the dotted circle if the unicode attributes indicates the character is combining, so no cwith is needed at all.
:If it is wrong, I really recommend an attribute be added that is the "print this instead" attribute. It can contain any markup wanted, and would replace all the stuff to set the font and size and cwith, and the image option, and so on. Spitzak (talk) 17:33, 16 August 2024 (UTC)
::Yes, first para makes sense, I agree.
::Sorry, I don't understand your second paragraph, could you expand? πππ½ (talk) 22:12, 16 August 2024 (UTC)
:::I think most of the current parameters could be replaced with a single optional parameter. If that parameter is given, its value is used to show the character. This would get rid of the need for the image and a lot of other controls for messing with the font. Popular substitutions could eventually be put in the template itself. Spitzak (talk) 03:02, 17 August 2024 (UTC)
::::But the only character we ever want to show is the canonical glyph and canonical name? (with the sole exception of combining diacritics which need the support of a dotted circle for clarity) [Caution: many Devanagari diacritics come with the dotted circle 'as standard'.] I'm still not following you.
::::Or do you mean an option to use serif rather than the default sans, since some glyphs are difficult to "read" without the hinting supplied by serif?
::::Or am I still missing your point? (Though if it is that there is a surfeit of bells and whistles that are never used and should go, I agree "subject to survey".) 𝕁𝕄𝔽 (talk) 15:43, 17 August 2024 (UTC)
:::::I want an option that if set to "BLAH" will make it print "BLAH" instead of attempting to print the character. Spitzak (talk) 17:52, 4 September 2024 (UTC)
::::::I think you really need to give an example. I assume you don't mean anything horrible like getting {{unichar|005E}} to display U+005E ^ {{sc|caret sign}}? --πππ½ (talk) 18:32, 4 September 2024 (UTC)
:::::::Assuming the new field is called "as", I propose that
Width bug
Recently this has been adding a lot of whitespace at the end of the small-caps name. Most obvious if the link is enabled as the underscore is also extended under this whitespace. Spitzak (talk) 17:53, 4 September 2024 (UTC)
:This may be Safari-only. Seems to work on Chrome on Linux Spitzak (talk) 22:57, 4 September 2024 (UTC)
More flexibility in parameter 1
Occasionally, I'd like to use the unicode character itself as the parameter. For instance, for 🎴, I'd like
While looking into this, I was reminded that the unichar template doesn't let you add the U+ prefix to the code in parameter 1. So, for instance, U+1F3B4 is an error. Apparently this is a common error for people to make, so maybe it should be detected and the U+ prefix should simply be stripped internally? Dingolover6969 (talk) 07:02, 19 October 2024 (UTC)
:I'm not sure how a reverse lookup like that could be easily accomplished in a Wikipedia template. It seems like something that ought to be possible since the computer obviously has this information, but I don't think you have access to the table that you'd need to do that. The best idea that comes to mind would be to generate a magic template list with a script or bot of some kind that hardcodes the table and then look it up from that. Andreπ 07:08, 19 October 2024 (UTC)
::This is really easy to do with Lua modules. {{ml|ustring|codepoint|\🎴}} -> {{#invoke:ustring|codepoint|\🎴}} converts a unicode character to its corresponding code point. The problem is that adding support for this introduces ambiguity. Consider {{tlp|unichar|7}}. Should it return {{unichar|7}} or {{unichar|37}}? For this reason I oppose adding support for this feature. Instead, we should make a {{tl|unichar2}} that accepts only unicode characters as parameters. Nickps (talk) 13:26, 21 February 2025 (UTC)
:::I missed that Dingo already addressed this in the opening comment. I think that having 7 and 07 behave differently is unnecessarily confusing, so I've gone ahead and made {{tl|unichar2}}. {{tlp|unichar2|🎴}} -> {{unichar2|🎴}} works as specified. Nickps (talk) 15:19, 21 February 2025 (UTC)
::::Oh, nice work! Didn't know about that Lua module. Lua is awesome. There are definitely some templates that we did the old way that could be improved. Andreπ 19:51, 21 February 2025 (UTC)
:::::Thanks! Yes, Lua is pretty useful, especially for stuff like this. From what I've seen, the two main reasons we don't use it more are that 1) not many people know Lua (I don't either) and 2) for some reason people really don't like calling modules from mainspace. Everything has to be a template.
:::::@Dingolover6969 Does this work for you? I'm pretty sure {{tl|unichar2}} solves the problem you're having. Nickps (talk) 22:42, 21 February 2025 (UTC)
::::::Wow, very cool, thank you Nickps! I reckon I will find that template quite useful. Dingolover6969 (talk) 15:22, 22 February 2025 (UTC)
:::::::You're welcome. I'm glad I could help. Nickps (talk) 22:07, 22 February 2025 (UTC)
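For anyone following the thread, the two input styles discussed above can be sketched outside the wiki. This is only an illustrative Python sketch, not the code of {{tl|unichar}} or {{tl|unichar2}}; the function names <code>strip_u_plus</code> and <code>char_to_hex</code> are hypothetical. It shows why the two interpretations are kept apart: the hex string <code>7</code> and the literal character <code>7</code> give different code points.

```python
import re

# Hypothetical helper names; not taken from any Wikipedia template or module.
_U_PLUS = re.compile(r"^\s*[Uu]\+")


def strip_u_plus(param: str) -> str:
    """Treat the input as a hex code point, tolerating a leading 'U+'.

    Returns upper-case hex padded to at least four digits.
    Raises ValueError if the remainder is not valid hexadecimal.
    """
    digits = _U_PLUS.sub("", param).strip()
    value = int(digits, 16)
    return f"{value:04X}"


def char_to_hex(ch: str) -> str:
    """Treat the input as a single literal character and return its code point."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"{ord(ch):04X}"


if __name__ == "__main__":
    print(strip_u_plus("U+1F3B4"))  # hex input with the prefix removed
    print(strip_u_plus("7"))        # hex reading of "7"
    print(char_to_hex("7"))         # character reading of "7"
```

Keeping the two readings in separate functions mirrors the decision in the thread to use a separate template for literal characters rather than guessing from the input.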
Can anyone see why combining characters have strange redirects?
- {{unichar|0302|nlink=}}
- {{unichar|0303|nlink=}}
There are redirect articles (created just now) for Combining circumflex accent and Combining tilde. Clicking on the nlinked names above will not take you to either of those redirects. 0302 takes you to the top of Circumflex, 0303 takes you to Nasal vowel, which is not the only use of the diacritic. In each case, the top of the article shows "Redirected from
So two questions:
- Why is the content of the {{code |1= nlink=}} being ignored in favour of an article named for the character itself? (Compare with <code><nowiki>{{unichar|0023|nlink=pound sign}}</nowiki></code>, which still gets you {{unichar|0023|nlink=pound sign}}, no ifs, no buts. [So this behaviour may be a relic of earlier error handling?])
- How can the erroneous redirects be corrected? (because {{tl|unichar}} is not the only way to reach them.)
Any ideas? (I assume that these two cases are not unique.) 𝕁𝕄𝔽 (talk) 23:01, 17 February 2025 (UTC)
: What did you put as the nlink content? As I edit this Wikipedia page, it's telling me that the source code is <code><nowiki>{{unichar|0302|nlink=}} {{unichar|0303|nlink=}}</nowiki></code>, which produces {{unichar|0302|nlink=}} {{unichar|0303|nlink=}}, which link to what you describe (as expected based on my reading of the documentation). So, that's weird, if you put something else in (as I think your comment implies). I'll try including <code><nowiki>{{unichar|0302|nlink=Combining circumflex accent}} {{unichar|0303|nlink=Combining tilde}}</nowiki></code> in this comment to see if they work or if the same bug(?) affects them. {{unichar|0302|nlink=Combining circumflex accent}} {{unichar|0303|nlink=Combining tilde}}.
: With regards to the erroneous redirects, I'm not sure what everything should redirect to, but https://en.wikipedia.org/w/index.php?title=%CC%83&redirect=no and https://en.wikipedia.org/w/index.php?title=%CC%84&redirect=no get me to the relevant redirect articles, which seem like they can be then edited normally. I got to these by going to another redirect page and then pasting the Unicode character, which I copied from a website that puts it into one's clipboard, into the URL; the "Redirected from
:I'm on Firefox. The HTML for the "Redirected from
vs
for a regular tilde (ok, it doesn't look normal, but the HTML seems to be correct)), so I imagine this is just about the browser choosing not to render links clickable when they only occur on combining diacritics. In fact, I've just verified this is the way it works on both Firefox and Chrome using the following HTML:
which isn't clickable. It isn't even blue in Chrome! So if we want this to change (which seems like a reasonable change to desire), I think we would have to file bug reports with the browsers themselves. Or do some hack to work around it in Wikipedia's software.
: Dingolover6969 (talk) 10:00, 18 February 2025 (UTC)
::OK, these filled-nlink ones I tried seem to work for me. Dingolover6969 (talk) 10:01, 18 February 2025 (UTC)
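As a side note on the <code>title=…&redirect=no</code> URLs mentioned above: the percent-encoded title is just the UTF-8 bytes of the character, so the URL can be built without pasting an invisible combining mark into the address bar. A minimal Python sketch follows, assuming the same <code>index.php</code> URL shape quoted in the comment; the function name <code>redirect_url</code> is made up for illustration.

```python
from urllib.parse import quote

# URL shape copied from the links quoted in the thread.
BASE = "https://en.wikipedia.org/w/index.php"


def redirect_url(char: str) -> str:
    """Build a URL that opens the redirect page for a one-character title
    without following the redirect."""
    # quote() percent-encodes the UTF-8 bytes of the character.
    title = quote(char, safe="")
    return f"{BASE}?title={title}&redirect=no"


if __name__ == "__main__":
    # U+0303 COMBINING TILDE is the UTF-8 byte pair CC 83.
    print(redirect_url("\u0303"))
```

For U+0303 this yields the first URL quoted above.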
:::Thank you, that allowed me to fix the immediate problem with those two (and revealed many more, which I will work through). I didn't know about the {{code |1= title=%XY%99}} hack.
:::Coming back to the more general point of {{code |1= nlink=}}: "if no parameter is specified, then the redirect is to an article with the same name as the Unicode canonical name"{{snd}} or at least that was what I believed should happen: I was wrong. I see now that what actually happens is that the link goes to an article whose name is just one character long, the character itself. For 'ordinary' characters, that is not a problem because they are accessible, but these combining ones are not.
:::(Normally, we only need to use the nlink if we want a specific section or if the Wikipedia article name and the Unicode canonical name differ.)
:::Do we need to add anything to the documentation? 𝕁𝕄𝔽 (talk) 11:04, 18 February 2025 (UTC)
::::Maybe if there is a cwith= then it should use the name of the character as the link.
::::It would also be really nice if Unicode character properties were used to cause cwith= to happen for any combining character rather than caller having to do it! Spitzak (talk) 18:56, 18 February 2025 (UTC)
::::Glad to hear it! :)
::::The nlink documentation is certainly a little confusing, what with its mentions of "using its canonical name", by which it means that the canonical name is linked, to the unicode character. I was able to figure out the state of affairs by close examination of a quadruply-nested bullet point caveat; but it could certainly be made more evident.
::::Dingolover6969 (talk) 06:46, 20 February 2025 (UTC)
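On the suggestion above that Unicode character properties could trigger the {{code|1= cwith=}} behaviour automatically: the idea can be sketched with the general-category property, where combining marks have categories beginning with "M" (Mn, Mc, Me). The Python below is only an illustration of that test, assuming the dotted circle U+25CC as the carrier; <code>is_combining</code> and <code>display_form</code> are hypothetical names and say nothing about how the template itself is implemented.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"  # assumed placeholder base character


def is_combining(ch: str) -> bool:
    """True if the character's general category is a mark (Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


def display_form(ch: str) -> str:
    """Prefix combining marks with a dotted circle; leave other characters alone."""
    return DOTTED_CIRCLE + ch if is_combining(ch) else ch


if __name__ == "__main__":
    for cp in (0x0302, 0x0303, 0x20E0, 0x0023):
        ch = chr(cp)
        print(f"U+{cp:04X}", unicodedata.category(ch), is_combining(ch))
```

A test like this would remove the need for the caller to remember the parameter, though script-specific cases (such as the Japanese and Cyrillic examples further down the page) would still need separate handling.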
Template styles
Can we convert this to use TemplateStyles?
While code point names are an exception from MOS:SMALLCAPS, this template forces small-caps in a way that isn't overridable by user styles, which means users who [http://w3c.github.io/low-vision-a11y-tf/requirements.html#capitalization struggle with all-caps text] (hi!) can't use our user CSS to change the presentation.
If we convert to template styles, users can override for their login, where necessary. -- OwenBlacker (he/him; Talk) 10:06, 21 February 2025 (UTC)
:Isn't the text in the database all-caps? It seems like this won't really help anybody who can't read all-caps text. Spitzak (talk) 17:01, 21 February 2025 (UTC)
Jingtian (井)
Our example documentation for note= is {{unichar|4E95|note=Jingtian}}. However, I don't see why that should get a note. Based on my research (googling stuff), 井 (jǐng) is not a name for 井田制度 (jǐngtián zhìdù) (although 井田 (jǐngtián) is, apparently), even though 井田制度 is named after 井. I have added jǐngtián to the page for 井, so {{unichar|4E95|nlink=}} should be fine... or maybe {{unichar|4E95|note=jǐng}}. Paging User:JMF, who added this, and so may wish to weigh in. This was also discussed on Talk:井#Wrong_target?, back when that page used to redirect to jingtian, I assume. Dingolover6969 (talk) 09:00, 20 April 2025 (UTC)
:@Dingolover6969, would you move this over to the talk page of the article where you saw it, please? Because I don't remember it and it doesn't look like a "feature" of this template. 𝕁𝕄𝔽 (talk) 09:13, 20 April 2025 (UTC)
:and the reason I don't remember it is because I've been framed. I assume you mean [https://en.m.wikipedia.org/w/index.php?title=Number_sign&diff=1249513017&oldid=1247955741 this diff] at number sign?
:: Ah, I understand. I was thinking of [https://en.wikipedia.org/w/index.php?title=Template:Unichar/doc&diff=next&oldid=1249514522 this Template:Unichar/doc diff], but presumably the number sign diff is the progenitor of the example "in the wild".
:: In that case, I don't think there's any remaining difficulty to discuss; it's just a single other Wikipedian getting confused and not knowing about the consensus that was reached about 井. I'll just remove that note from the pages on which it occurs.
:: Out of curiosity, I also chased down the same verbiage on the Sharp_(music) page, and found [https://en.wikipedia.org/w/index.php?title=Sharp_%28music%29&diff=prev&oldid=1218307262 this diff] which ultimately is a correction of [https://en.wikipedia.org/w/index.php?title=Sharp_%28music%29&diff=prev&oldid=453298269 this diff]; so, it's just some cruft that's been circulating Wikipedia since the early days.
::Anyway, sorry to bother you/thanks for your time. Dingolover6969 (talk) 02:27, 21 April 2025 (UTC)
Cyrillic example of use/use2, which doesn't seem to work?
I've just added an example from Japanese that demonstrates the value of {{code|1= use=lang |use2=ja}}.
- :
→ {{unichar|3099|cwith=◌|use=lang|use2=ja}} (If use+use2 are not used, this is the (undesirable) effect: {{unichar|3099|cwith=◌}}: the Japanese diacritic dakuten is not shown properly.){{unichar|3099|cwith=◌|use=lang|use2=ja}}
But I can't see what the existing example is intended to demonstrate?
- :
→ {{unichar|0485|cwith=|use=script|use2=Cyrs}}{{unichar|0485|cwith=◌|use=script|use2=Cyrs}}
since it doesn't actually render the {{midsize|CYRILLIC DASIA PNEUMATA}} diacritic with the place-holder character ({{char|◌}}). [btw,
produces {{unichar|0485|cwith=}}, so that doesn't work either.]
What should happen with the Cyrillic example? 𝕁𝕄𝔽 (talk) 16:56, 21 April 2025 (UTC)
:I think again this shows there should be an argument that is "arbitrary wiki markup to print instead of the character". This would replace the lang, image, size, cwith, and a ton of other argument bloat, and also allow access to stuff that is currently impossible, such as the "cwith for two characters", or "remove the emoji formatting". Spitzak (talk) 17:52, 21 April 2025 (UTC)