Z-variant
{{Short description|Glyphs with minor typographical differences}}
{{SpecialChars}}
In Unicode, two glyphs are said to be Z-variants (often spelled zVariants) if they share the same etymology but have slightly different appearances and different Unicode code points. For example, the Unicode characters {{mono|U+8AAA}} {{lang|zh-Hant|說}} and {{mono|U+8AAC}} {{lang|zh-Hant|説}} are Z-variants. The notion of Z-variance applies only to the "CJKV scripts" (Chinese, Japanese, Korean and Vietnamese) and is a subtopic of Han unification.
Differences on the Z-axis
The Unicode philosophy of code point allocation for CJK languages is organized along three "axes." The X-axis represents differences in semantics; for example, the Latin capital A ({{mono|U+0041}} A) and the Greek capital alpha ({{mono|U+0391}} Α) are represented by two distinct code points in Unicode, and might be termed "X-variants" (though this term is not common). The Y-axis represents significant differences in appearance but not in semantics; for example, the traditional Chinese character māo "cat" ({{mono|U+8C93}} {{lang|zh-Hant|貓}}) and its simplified Chinese counterpart ({{mono|U+732B}} {{lang|zh-Hans|猫}}) are Y-variants.{{Cite web|url=https://www.unicode.org/glossary/|title=Glossary|website=www.unicode.org}}
The Z-axis represents minor typographical differences. For example, the Chinese characters ({{mono|U+838A}} {{lang|zh|莊}}) and ({{mono|U+8358}} {{lang|zh|荘}}) are Z-variants, as are ({{mono|U+8AAA}} {{lang|zh|說}}) and ({{mono|U+8AAC}} {{lang|zh|説}}). The glossary at Unicode.org defines "Z-variant" as "Two CJK unified ideographs with identical semantics and unifiable shapes," where "unifiable" is taken in the sense of Han unification.
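As a minimal illustration, the following Python sketch prints the code points of one example pair per axis, using only the standard library's <code>unicodedata</code> module. The <code>PAIRS</code> table and the <code>describe</code> helper are names made up for this illustration, and the axis labels simply restate the examples given in the text.

```python
import unicodedata

# Example pairs taken from the text; the axis labels restate the article.
PAIRS = {
    "X": ("\u0041", "\u0391"),  # Latin capital A vs. Greek capital alpha
    "Y": ("\u8C93", "\u732B"),  # traditional vs. simplified "cat"
    "Z": ("\u838A", "\u8358"),  # minor typographical difference
}


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for axis, (a, b) in PAIRS.items():
    # Whatever the axis, each member of a pair occupies its own code point.
    assert a != b
    print(f"{axis}-axis: {describe(a)} / {describe(b)}")
```

The sketch only shows that each pair is encoded separately; it does not decide which axis a pair belongs to, since that classification comes from Unicode's own data rather than from the code points themselves.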
Because Z-variants are by definition unifiable, a perfectly consistent Han unification would have merged each pair into a single code point, and Z-variants would not exist. They exist in Unicode because it was deemed useful to be able to "round-trip" documents between Unicode and other CJK encodings such as Big5 and CCCII. For example, the character {{lang|zh|莊}} has CCCII encoding 21552D, while its Z-variant {{lang|zh|荘}} has CCCII encoding 2D552D. These two variants were therefore given distinct Unicode code points, so that converting a CCCII document to Unicode and back is a lossless operation.
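The round-trip argument can be sketched in Python under a strong simplifying assumption: the mapping table below contains only the two CCCII codes quoted above, and the names <code>CCCII_TO_UNICODE</code>, <code>to_unicode</code> and <code>to_cccii</code> are hypothetical. A real converter would need the complete CCCII table, which is not reproduced here.

```python
# Hypothetical two-entry table built only from the CCCII codes quoted in the
# text; it is not a real codec.
CCCII_TO_UNICODE = {
    0x21552D: "\u838A",  # 莊
    0x2D552D: "\u8358",  # 荘 (Z-variant of the above)
}
UNICODE_TO_CCCII = {ch: code for code, ch in CCCII_TO_UNICODE.items()}


def to_unicode(codes):
    """Convert a list of CCCII codes to a Unicode string."""
    return "".join(CCCII_TO_UNICODE[code] for code in codes)


def to_cccii(text):
    """Convert a Unicode string back to a list of CCCII codes."""
    return [UNICODE_TO_CCCII[ch] for ch in text]


original = [0x21552D, 0x2D552D]

# With distinct code points, the round trip is lossless.
assert to_cccii(to_unicode(original)) == original

# Counterfactual: had both variants been unified into one code point,
# the two source codes would become indistinguishable after conversion.
UNIFIED = {0x21552D: "\u838A", 0x2D552D: "\u838A"}
unified_text = "".join(UNIFIED[code] for code in original)
assert unified_text[0] == unified_text[1]
```

The counterfactual table shows why unification would have been lossy here: once both codes map to the same character, no reverse mapping can tell which CCCII code a given character came from.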
Confusion
There is some confusion over the exact definition of "Z-variant." For example, an Internet Draft (of {{IETF RFC|3743}}) dated 2002{{Cite journal|url=https://tools.ietf.org/html/draft-jseng-idn-admin-02.html|title=Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean|first1=K.|last1=Huang|first2=Y.|last2=Ko|first3=K.|last3=Konishi|first4=H.|last4=Qian|website=tools.ietf.org|date=April 2004}} describes {{Transliteration|zh|pinyin|bù}} "no" ({{mono|U+4E0D}} {{lang|zh|不}}) and ({{mono|U+F967}} {{lang|zh|不︀}}) as "font variants." It apparently reserves the term "Z-variant" for interlanguage pairs such as the Mandarin Chinese {{Transliteration|zh|pinyin|tù}} "rabbit" ({{mono|U+5154}} {{lang|zh|兔}}) and the Japanese {{Transliteration|ja|hepburn|to}} "rabbit" ({{mono|U+514E}} {{lang|ja|兎}}). However, the Unicode Consortium's Unihan database{{Cite web|url=https://www.unicode.org/charts/unihan.html|title=Unihan Database Lookup|website=www.unicode.org}}{{failed verification|date=August 2022|talk=Confusion?}} treats both pairs as Z-variants.
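One observable difference between the two pairs, independent of how either source labels them, is their behavior under Unicode normalization. {{mono|U+F967}} is a CJK compatibility ideograph with a canonical decomposition to {{mono|U+4E0D}}, whereas {{mono|U+5154}} and {{mono|U+514E}} are both unified ideographs with no decomposition. The Python sketch below checks this with the standard <code>unicodedata</code> module; the helper name <code>nfc_equal</code> is an assumption made for this illustration, and the check says nothing about how the Unihan database classifies either pair.

```python
import unicodedata


def nfc_equal(a: str, b: str) -> bool:
    """Return True if two strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


bu, bu_compat = "\u4E0D", "\uF967"        # the "no" pair from the text
rabbit_zh, rabbit_ja = "\u5154", "\u514E"  # the "rabbit" pair from the text

# The compatibility ideograph folds into U+4E0D under normalization.
print("U+4E0D / U+F967 equal after NFC:", nfc_equal(bu, bu_compat))

# The two unified ideographs remain distinct.
print("U+5154 / U+514E equal after NFC:", nfc_equal(rabbit_zh, rabbit_ja))
```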
See also
{{wiktionary|z-variant}}
References
{{reflist}}
{{Unicode navigation}}