ISO/IEC 10367
{{short description|Standard specifying graphical character sets}}
ISO/IEC 10367:1991 is a standard developed by ISO/IEC JTC 1/SC 2,{{cite web |url=https://www.iso.org/standard/18428.html |title=Information technology — Standardized coded graphic character sets for use in 8-bit codes |id=ISO/IEC 10367:1991 |author=ISO/IEC JTC 1/SC 2 |author-link=ISO/IEC JTC 1/SC 2 |publisher=ISO |date=1991}} defining graphical character sets for use in character encodings implementing levels 2 and 3 of ISO/IEC 4873{{cite web |url=https://www.terena.org/activities/multiling/euroml/section08.html |title=8. Code Extension, ISO 2022 and 2375, ISO 4873 and 10367 |last=van Wingen |first=Johan W |work=Character sets. Letters, tokens and codes |year=1999 |publisher=Terena |url-status=dead |archive-url=https://web.archive.org/web/20200801214714/https://www.terena.org/activities/multiling/euroml/section08.html |archive-date=2020-08-01}} (as opposed to ISO/IEC 8859, which defines character encodings at level 1 of ISO/IEC 4873).
Relationship to ISO/IEC 8859
The parts of ISO/IEC 8859 define complete encodings at level 1 of ISO/IEC 4873 (i.e., as stateless extended ASCII single-byte encodings, reserving the C1 area), and do not allow for use of multiple parts together. For use at levels 2 and 3 of ISO/IEC 4873 (i.e., with shift codes for additional graphical character sets), ISO/IEC 8859 stipulates that equivalent sets from ISO/IEC 10367 should be used instead.{{citation |mode=cs1 |url=http://www.open-std.org/JTC1/SC2/WG3/docs/n415.pdf |title=Final Text of DIS 8859-10, Information Technology — 8-bit single-byte coded graphic character sets — Part 10: Latin alphabet No. 6 |date=1998-02-12 |id=ISO/IEC FDIS 8859-10:1998, JTC1/SC2 N2992, WG3 N415 |author=ISO/IEC JTC 1/SC 2 |author-link=ISO/IEC JTC 1/SC 2}}
ISO/IEC 10367:1991 includes ASCII, as well as sets matching the G1 sets used for the right-hand sides (non-ASCII parts) of ISO/IEC 6937 (ITU T.51) and of ISO/IEC 8859 parts 1 through 9 (i.e., those parts that existed as of 1991, when it was published), a set of additional Roman characters supplementing some of those parts, and a set of box drawing characters (shown below).{{cite web |url=http://std.dkuug.dk/cen/tc304/guide/gis10367.htm |title=8-Bit Character Sets - ISO/IEC 10367 |work=Guide to the use of Character Sets in Europe |publisher=DKUUG}}
{{anchor|ISO-IR-154}}Supplementary G3 Latin set
ISO/IEC 10367 includes the ISO-IR-154 graphical set, which is intended to supplement Latin alphabets number 1, 2 and 5 (i.e., ISO-8859-1, ISO-8859-2 and ISO-8859-9). Specifically, it is intended for use as a G3 set in a profile of ISO/IEC 4873 in which the G1 and G2 sets include the right-hand side of ISO-8859-2 and that of either ISO-8859-1 or ISO-8859-9.{{cite iso-ir |number=154 |title=Supplementary Set for Latin Alphabets 1, 2 and 5. |date=1990-03-01 |sponsor=ECMA |sponsor-link=Ecma International}} These configurations represent the entire ISO/IEC 6937 repertoire (ITU T.51 Annex A) without non-spacing codes.{{citation|mode=cs1 |url=http://open-std.org/JTC1/sc2/wg3/docs/n454.pdf |title=WD 6937, Coded graphic character set for text communication - Latin alphabet |author=ISO/IEC JTC 1/SC 2/WG 3 |author-link=ISO/IEC JTC 1/SC 2 |id=JTC1/SC2/N454 |date=1998-04-15 |page=37 |section=Annex E: Alternative coded representation of the repertoire with no non-spacing diacritical marks}}
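For instance, if this set is included as the G3 set, the letter Ĉ is encoded under ISO/IEC 4873 level 2 as 0x8F 0x23: the single-shift three control (SS3, 0x8F) followed by the letter's position in the set (0x23).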
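The single-shift encoding in this example can be sketched in Python. This is a minimal illustration rather than a conforming ISO/IEC 4873 implementation: the names `G3_EXCERPT`, `encode_char` and `encode` are hypothetical, only six positions of the supplementary set are included (read off the table in this section), and ASCII is assumed to be the G0 set invoked in the left half of the code.

```python
SS3 = 0x8F  # SINGLE-SHIFT THREE: the next byte is taken from the G3 set

# Hypothetical partial excerpt of the supplementary G3 Latin set
# (ISO-IR-154): character -> position within the set.
G3_EXCERPT = {
    "\u0100": 0x22,  # LATIN CAPITAL LETTER A WITH MACRON
    "\u0108": 0x23,  # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
    "\u010a": 0x24,  # LATIN CAPITAL LETTER C WITH DOT ABOVE
    "\u0101": 0x32,  # LATIN SMALL LETTER A WITH MACRON
    "\u0109": 0x33,  # LATIN SMALL LETTER C WITH CIRCUMFLEX
    "\u010b": 0x34,  # LATIN SMALL LETTER C WITH DOT ABOVE
}


def encode_char(ch: str) -> bytes:
    """Encode one character, using SS3 for characters found in the G3 excerpt."""
    if ord(ch) < 0x80:
        # Assumption: ASCII is the G0 set, so the byte is sent unchanged.
        return bytes([ord(ch)])
    if ch in G3_EXCERPT:
        # A single shift affects only the one byte that follows it.
        return bytes([SS3, G3_EXCERPT[ch]])
    raise ValueError(f"U+{ord(ch):04X} is not covered by this sketch")


def encode(text: str) -> bytes:
    """Encode a string character by character."""
    return b"".join(encode_char(ch) for ch in text)
```

Calling `encode("\u0108")` yields the two bytes 0x8F 0x23 given in the example. A real encoder would also consult the G1 and G2 sets before falling back to G3.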
Highlighted characters also appear in ISO-8859-1 or ISO-8859-9. Under the current edition of ISO/IEC 4873 / ECMA-43 (though not earlier editions),{{citation|mode=cs1 |title=ECMA-43: 8-Bit Coded Character Set Structure and Rules |author=ECMA |author-link=Ecma International |date=1991 |edition=3rd |type=ECMA Standard |url=https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-043.pdf |page=23 |section=Main differences between the second edition (1985) and the present (third) edition of this ECMA Standard}} characters must be used from the lowest-numbered working set they appear in, hence those characters are not used from this G3 set when the respective ISO-8859 right-hand side set is used as the G1 or G2 set.{{citation|mode=cs1 |title=ECMA-43: 8-Bit Coded Character Set Structure and Rules |author=ECMA |author-link=Ecma International |date=1991 |edition=3rd |type=ECMA Standard |url=https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-043.pdf |page=10 |section=Unique coding of characters}}
{{chset-table-header1|ISO/IEC 10367 supplementary G3 Latin set}}
{{chset-left1|2x/Ax}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+0100 LATIN CAPITAL LETTER A WITH MACRON|Ā}} |{{chset-cell1|U+0108 LATIN CAPITAL LETTER C WITH CIRCUMFLEX|Ĉ}} |{{chset-cell1|U+010A LATIN CAPITAL LETTER C WITH DOT ABOVE|Ċ}} |{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+0116 LATIN CAPITAL LETTER E WITH DOT ABOVE|Ė}} |{{chset-cell1|U+0112 LATIN CAPITAL LETTER E WITH MACRON|Ē}} |{{chset-cell1|U+011C LATIN CAPITAL LETTER G WITH CIRCUMFLEX|Ĝ}} |{{chset-cell1|U+2018 LEFT SINGLE QUOTATION MARK|‘}} |{{chset-cell1|U+201C LEFT DOUBLE QUOTATION MARK|“}} |{{chset-cell1|U+2122 TRADE MARK SIGN|™}} |{{chset-cell1|U+2190 LEFTWARDS ARROW|←}} |{{chset-cell1|U+2191 UPWARDS ARROW|↑}} |{{chset-cell1|U+2192 RIGHTWARDS ARROW|→}} |{{chset-cell1|U+2193 DOWNWARDS ARROW|↓}}
{{chset-left1|3x/Bx}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+0101 LATIN SMALL LETTER A WITH MACRON|ā}} |{{chset-cell1|U+0109 LATIN SMALL LETTER C WITH CIRCUMFLEX|ĉ}} |{{chset-cell1|U+010B LATIN SMALL LETTER C WITH DOT ABOVE|ċ}} |{{chset-cell1|U+00F0 LATIN SMALL LETTER ETH|ð|style=background:#FFD}} |{{chset-cell1|U+0117 LATIN SMALL LETTER E WITH DOT ABOVE|ė}} |{{chset-cell1|U+0113 LATIN SMALL LETTER E WITH MACRON|ē}} |{{chset-cell1|U+011D LATIN SMALL LETTER G WITH CIRCUMFLEX|ĝ}} |{{chset-cell1|U+2019 RIGHT SINGLE QUOTATION MARK|’}} |{{chset-cell1|U+201D RIGHT DOUBLE QUOTATION MARK|”}} |{{chset-cell1|U+266A EIGHTH NOTE|♪}} |{{chset-cell1|U+215B VULGAR FRACTION ONE EIGHTH|⅛}} |{{chset-cell1|U+215C VULGAR FRACTION THREE EIGHTHS|⅜}} |{{chset-cell1|U+215D VULGAR FRACTION FIVE EIGHTHS|⅝}} |{{chset-cell1|U+215E VULGAR FRACTION SEVEN EIGHTHS|⅞}}
{{chset-left1|4x/Cx}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+011E LATIN CAPITAL LETTER G WITH BREVE|Ğ|style=background:#FEE}} |{{chset-cell1|U+0120 LATIN CAPITAL LETTER G WITH DOT ABOVE|Ġ}} |{{chset-cell1|U+0122 LATIN CAPITAL LETTER G WITH CEDILLA|Ģ}} |{{chset-cell1|U+0124 LATIN CAPITAL LETTER H WITH CIRCUMFLEX|Ĥ}} |{{chset-cell1|U+0126 LATIN CAPITAL LETTER H WITH STROKE|Ħ}} |{{chset-cell1|U+0128 LATIN CAPITAL LETTER I WITH TILDE|Ĩ}} |{{chset-cell1|U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE|İ|style=background:#FEE}} |{{chset-cell1|U+012A LATIN CAPITAL LETTER I WITH MACRON|Ī}} |{{chset-cell1|U+012E LATIN CAPITAL LETTER I WITH OGONEK|Į}} |{{chset-cell1|U+0132 LATIN CAPITAL LIGATURE IJ|Ĳ}} |{{chset-cell1|U+0134 LATIN CAPITAL LETTER J WITH CIRCUMFLEX|Ĵ}} |{{chset-cell1|U+0136 LATIN CAPITAL LETTER K WITH CEDILLA|Ķ}} |{{chset-cell1|U+013B LATIN CAPITAL LETTER L WITH CEDILLA|Ļ}} |{{chset-cell1|U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT|Ŀ}} |{{chset-cell1|U+0145 LATIN CAPITAL LETTER N WITH CEDILLA|Ņ}}
{{chset-left1|5x/Dx}}
|{{chset-cell1|U+2014 EM DASH|—}} |{{chset-cell1|U+014A LATIN CAPITAL LETTER ENG|Ŋ}} |{{chset-cell1|U+014C LATIN CAPITAL LETTER O WITH MACRON|Ō}} |{{chset-cell1|U+0152 LATIN CAPITAL LIGATURE OE|Œ}} |{{chset-cell1|U+0156 LATIN CAPITAL LETTER R WITH CEDILLA|Ŗ}} |{{chset-cell1|U+015C LATIN CAPITAL LETTER S WITH CIRCUMFLEX|Ŝ}} |{{chset-cell1|U+0166 LATIN CAPITAL LETTER T WITH STROKE|Ŧ}} |{{chset-cell1|U+00DE LATIN CAPITAL LETTER THORN|Þ|style=background:#FFD}} |{{chset-cell1|U+0168 LATIN CAPITAL LETTER U WITH TILDE|Ũ}} |{{chset-cell1|U+016C LATIN CAPITAL LETTER U WITH BREVE|Ŭ}} |{{chset-cell1|U+016A LATIN CAPITAL LETTER U WITH MACRON|Ū}} |{{chset-cell1|U+0172 LATIN CAPITAL LETTER U WITH OGONEK|Ų}} |{{chset-cell1|U+0174 LATIN CAPITAL LETTER W WITH CIRCUMFLEX|Ŵ}} |{{chset-cell1|U+00DD LATIN CAPITAL LETTER Y WITH ACUTE|Ý|style=background:#FFD}} |{{chset-cell1|U+0176 LATIN CAPITAL LETTER Y WITH CIRCUMFLEX|Ŷ}} |{{chset-cell1|U+0178 LATIN CAPITAL LETTER Y WITH DIAERESIS|Ÿ}}
{{chset-left1|6x/Ex}}
|{{chset-cell1|U+2126 OHM SIGN|Ω}} |{{chset-cell1|U+011F LATIN SMALL LETTER G WITH BREVE|ğ|style=background:#FEE}} |{{chset-cell1|U+0121 LATIN SMALL LETTER G WITH DOT ABOVE|ġ}} |{{chset-cell1|U+0123 LATIN SMALL LETTER G WITH CEDILLA|ģ}} |{{chset-cell1|U+0125 LATIN SMALL LETTER H WITH CIRCUMFLEX|ĥ}} |{{chset-cell1|U+0127 LATIN SMALL LETTER H WITH STROKE|ħ}} |{{chset-cell1|U+0129 LATIN SMALL LETTER I WITH TILDE|ĩ}} |{{chset-cell1|U+0131 LATIN SMALL LETTER DOTLESS I|ı|style=background:#FEE}} |{{chset-cell1|U+012B LATIN SMALL LETTER I WITH MACRON|ī}} |{{chset-cell1|U+012F LATIN SMALL LETTER I WITH OGONEK|į}} |{{chset-cell1|U+0133 LATIN SMALL LIGATURE IJ|ĳ}} |{{chset-cell1|U+0135 LATIN SMALL LETTER J WITH CIRCUMFLEX|ĵ}} |{{chset-cell1|U+0137 LATIN SMALL LETTER K WITH CEDILLA|ķ}} |{{chset-cell1|U+013C LATIN SMALL LETTER L WITH CEDILLA|ļ}} |{{chset-cell1|U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT|ŀ}} |{{chset-cell1|U+0146 LATIN SMALL LETTER N WITH CEDILLA|ņ}}
{{chset-left1|7x/Fx}}
|{{chset-cell1|U+0138 LATIN SMALL LETTER KRA|ĸ}} |{{chset-cell1|U+014B LATIN SMALL LETTER ENG|ŋ}} |{{chset-cell1|U+014D LATIN SMALL LETTER O WITH MACRON|ō}} |{{chset-cell1|U+0153 LATIN SMALL LIGATURE OE|œ}} |{{chset-cell1|U+0157 LATIN SMALL LETTER R WITH CEDILLA|ŗ}} |{{chset-cell1|U+015D LATIN SMALL LETTER S WITH CIRCUMFLEX|ŝ}} |{{chset-cell1|U+0167 LATIN SMALL LETTER T WITH STROKE|ŧ}} |{{chset-cell1|U+00FE LATIN SMALL LETTER THORN|þ|style=background:#FFD}} |{{chset-cell1|U+0169 LATIN SMALL LETTER U WITH TILDE|ũ}} |{{chset-cell1|U+016D LATIN SMALL LETTER U WITH BREVE|ŭ}} |{{chset-cell1|U+016B LATIN SMALL LETTER U WITH MACRON|ū}} |{{chset-cell1|U+0173 LATIN SMALL LETTER U WITH OGONEK|ų}} |{{chset-cell1|U+0175 LATIN SMALL LETTER W WITH CIRCUMFLEX|ŵ}} |{{chset-cell1|U+00FD LATIN SMALL LETTER Y WITH ACUTE|ý|style=background:#FFD}} |{{chset-cell1|U+0177 LATIN SMALL LETTER Y WITH CIRCUMFLEX|ŷ}} |{{chset-cell1|U+0149 LATIN SMALL LETTER N PRECEDED BY APOSTROPHE|ŉ}}
{{legend|#FFD|Also in ISO-8859-1}}
{{legend|#FEE|Also in ISO-8859-9}}
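The unique-coding rule described above (a character is taken from the lowest-numbered working set that contains it) can be illustrated with a short Python sketch. The names `right_half`, `lowest_set`, `G3_EXCERPT`, `with_latin1` and `with_latin5` are hypothetical, the G3 set is reduced to five characters from the table, and Python's built-in `iso8859_1`, `iso8859_2` and `iso8859_9` codecs stand in for the corresponding right-hand side sets.

```python
def right_half(codec: str) -> set:
    """Characters at 0xA0-0xFF of a single-byte codec (its right-hand side)."""
    return {bytes([b]).decode(codec) for b in range(0xA0, 0x100)}


# Assumption: G0 is ASCII (graphic positions 0x21-0x7E).
ASCII_G0 = {chr(c) for c in range(0x21, 0x7F)}

# Hypothetical five-character excerpt of the supplementary G3 set.
G3_EXCERPT = {
    "\u00f0",  # eth, also in ISO-8859-1
    "\u00fe",  # thorn, also in ISO-8859-1
    "\u011f",  # g with breve, also in ISO-8859-9
    "\u0131",  # dotless i, also in ISO-8859-9
    "\u0109",  # c with circumflex, only in the G3 set
}


def lowest_set(ch: str, working_sets: list):
    """Return the index (0-3) of the lowest-numbered set containing ch, else None."""
    for index, charset in enumerate(working_sets):
        if ch in charset:
            return index
    return None


# The two configurations described in this section: ISO-8859-2 as G1,
# with either ISO-8859-1 or ISO-8859-9 as G2.
with_latin1 = [ASCII_G0, right_half("iso8859_2"), right_half("iso8859_1"), G3_EXCERPT]
with_latin5 = [ASCII_G0, right_half("iso8859_2"), right_half("iso8859_9"), G3_EXCERPT]
```

With these definitions, `lowest_set("\u00f0", with_latin1)` returns 2 because eth is available from the ISO-8859-1 set, while `lowest_set("\u00f0", with_latin5)` returns 3 because only the G3 set supplies it. The result is reversed for `"\u011f"` (g with breve), which matches the two highlight colours in the table.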
{{anchor|ISO-IR-155}}Box drawing set
The following shows the box drawing set from ISO/IEC 10367, which is registered for ISO/IEC 2022 use as ISO-IR-155. It does not use the 0x20/A0 or 0x7F/FF positions, but is nonetheless registered as a 96-character set.{{cite iso-ir |number=155 |title=Basic Box-Drawings Set |date=1990-04-16 |sponsor=ISO/IEC/JTC1/SC2/WG3 |id=ISO-IR-155 |sponsor-link=ISO/IEC JTC 1/SC 2}}
The Perl library libintl-perl includes an "ISO_10367-BOX" codec, which encodes and decodes ASCII over GL and the ISO-IR-155 box drawing set over GR, with a few deviations. Specifically, it includes double-lined box-drawing characters in place of heavy-lined characters, and it replaces the upper half block (▀) at 0xCB with the private use character U+E019, documented as "Unit space B".{{cite web |url=http://search.cpan.org/~guido/libintl-perl/lib/Locale/RecodeData/ISO_10367_BOX.pm |title=Conversion routines for ISO_10367_BOX |last=Flohr |first=Guido |work=libintl-perl |id=Locale::RecodeData::ISO_10367_BOX}}
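The 0xCB deviation can be made concrete with a small Python sketch of a decoder that maps GL bytes as ASCII and GR bytes through a lookup table. It is only an illustration under stated assumptions: the tables `ISO_IR_155_EXCERPT` and `LIBINTL_PERL_EXCERPT` and the function `decode` are hypothetical names, each table holds only the single position discussed above, and it does not reproduce the actual libintl-perl implementation.

```python
# Hypothetical one-entry excerpts: GR byte -> character.
ISO_IR_155_EXCERPT = {0xCB: "\u2580"}    # UPPER HALF BLOCK, per the registered set
LIBINTL_PERL_EXCERPT = {0xCB: "\ue019"}  # private use, documented as "Unit space B"


def decode(data: bytes, gr_table: dict) -> str:
    """Decode ASCII from GL and look up GR bytes in gr_table.

    Bytes missing from the (deliberately incomplete) table become U+FFFD.
    """
    out = []
    for byte in data:
        if byte < 0x80:
            out.append(chr(byte))  # GL: ASCII
        else:
            out.append(gr_table.get(byte, "\ufffd"))  # GR: box drawing set
    return "".join(out)
```

Decoding `b"A\xcb"` gives `"A\u2580"` with the first table and `"A\ue019"` with the second, so text containing this byte does not round-trip between the two interpretations.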