T.51/ISO/IEC 6937
{{Short description|ITU-T Recommendation}}
{{Infobox technology standard
| title = T.51
| long_name = Latin based coded character sets for telematic services
| image =
| caption =
| status = In force
| year_started = 1984
| version = (09/92)
| version_date = September 1992
| preview =
| preview_date =
| organization = ITU-T
| committee = Study Group VIII
| base_standards =
| related_standards = T.61, ETS 300 706, ISO/IEC 10367, ISO/IEC 2022, ISO 5426
| abbreviation =
| domain = encoding
| license = Freely available
| website = https://www.itu.int/rec/T-REC-T.51
}}
{{Infobox character encoding
|name = T.51
|standard = {{plainlist|
|alias = {{plainlist|
- Code page 20269
- ISO-IR-90 (old)
- ISO-IR-142 (old)
- ISO-IR-156}}
|basedon = ITU T.61
|otherrelated = {{flatlist|
}}
T.51 / ISO/IEC 6937:2001, Information technology — Coded graphic character set for text communication — Latin alphabet, is a multibyte extension of ASCII, or more precisely ISO/IEC 646-IRV.{{Cite web|url=https://www.itu.int/rec/T-REC-T.51|title=T.51 : Latin based coded character sets for telematic services|website=www.itu.int|url-status=live|archive-url=https://web.archive.org/web/20191008190124/https://www.itu.int/rec/T-REC-T.51|archive-date=2019-10-08|access-date=2019-11-14}} It was developed jointly with ITU-T (then CCITT) for telematic services under the name T.51, and first became an ISO standard in 1983. Certain byte codes are used as lead bytes for letters with diacritics. The value of the lead byte often indicates which diacritic the letter carries, and the following byte holds the ASCII value of the base letter.
ISO/IEC 6937's architects were Hugh McGregor Ross, Peter Fenwick, Bernard Marti and Loek Zeckendorf.
ISO 6937/2 defines 327 characters found in modern European languages that use the Latin alphabet. Non-Latin European scripts, such as Cyrillic and Greek, are not included in the standard. Some diacritics used with the Latin alphabet, such as the Romanian comma below, are also not included; the cedilla is used instead, because no distinction between cedilla and comma below was made at the time.
IANA has registered the charset names ISO_6937-2-25 and ISO_6937-2-add for two (older) versions of this standard (plus control codes). In practice, however, this character encoding is unused on the Internet.{{cn|date=March 2025}}
Single byte characters
The primary set (first half) originally followed ISO 646-IRV before the ISO/IEC 646:1991 revision, that is, mostly following ASCII but with character 0x24 still denoted as an "international currency sign" (¤) instead of the dollar sign ($). The 1992 edition of ITU T.51 permits existing CCITT services to continue to interpret 0x24 as the international currency sign, but stipulates that new telecommunication applications should use it for the dollar sign (i.e. following the current ISO 646-IRV), and instead represent the international currency sign using the supplementary set.
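The choice described above can be illustrated with a short Python sketch that decodes one primary-set byte and leaves the interpretation of 0x24 to the caller. The function name `decode_primary_byte` and the `legacy_ccitt` flag are hypothetical names introduced for this illustration; neither standard defines them.

```python
def decode_primary_byte(byte: int, legacy_ccitt: bool = False) -> str:
    """Decode one byte of the primary set (space plus 0x21..0x7E).

    legacy_ccitt=True models an existing CCITT service that still reads
    0x24 as the international currency sign; the default follows the
    current ISO 646-IRV and yields the dollar sign.
    """
    if not 0x20 <= byte <= 0x7E:
        raise ValueError(f"0x{byte:02X} is outside the primary set")
    if byte == 0x24 and legacy_ccitt:
        return "\u00a4"  # CURRENCY SIGN
    # Every other position coincides with ASCII.
    return chr(byte)


print(decode_primary_byte(0x24))
print(decode_primary_byte(0x24, legacy_ccitt=True))
```

A new application would leave the flag at its default and take the currency sign from the supplementary set instead.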
The supplementary set (second half) contains a selection of spacing and non-spacing graphic characters, additional symbols and some locations reserved for future standardisation.
Both of these are ISO/IEC 2022 graphic character sets, with the primary set being a 94-code set and the supplementary set being a 96-code set. In contexts where ISO 2022 code extension techniques are not in use, the primary set is designated as the G0 set and invoked over GL (0x20..0x7F), whereas the supplementary set is designated as the G2 set and invoked over GR (0xA0..0xFF) in an 8-bit environment, or by using the control code 0x19 as a single-shift in a 7-bit environment.{{citation|mode=cs1 |url=https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.51-199508-I!Amd1!PDF-E&type=items |title=Recommendation T.51 (1992) Amendment 1 |date=1995-08-11 |author=ITU-T |author-link=ITU-T}} This encoding of the Single Shift Two code matches its location in ISO-IR-106.{{cite iso-ir |number=106 |sponsor=ITU |sponsor-link=International Telecommunication Union |title=Teletex Primary Set of Control Functions |date=1985-08-01}}
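The relationship between the 7-bit and 8-bit forms can be sketched in Python as follows. This is a minimal illustration that assumes the default arrangement described above, with no other code extension in use; the function name `seven_to_eight_bit` is hypothetical.

```python
SS2 = 0x19  # Single Shift Two, at the position given above


def seven_to_eight_bit(data: bytes) -> bytes:
    """Rewrite 7-bit text with SS2 single shifts into the 8-bit form.

    A byte following SS2 is taken from the G2 (supplementary) set, which
    in an 8-bit environment is the same code with the high bit set.
    """
    out = bytearray()
    stream = iter(data)
    for byte in stream:
        if byte > 0x7F:
            raise ValueError("input is not 7-bit")
        if byte == SS2:
            shifted = next(stream, None)
            if shifted is None or not 0x20 <= shifted <= 0x7F:
                raise ValueError("SS2 must be followed by a byte in 0x20..0x7F")
            out.append(shifted | 0x80)  # GL position -> GR position
        else:
            out.append(byte)
    return bytes(out)


# SS2 0x42 'e' becomes 0xC2 'e' (acute accent lead byte, then the letter).
print(seven_to_eight_bit(bytes([0x19, 0x42, 0x65])).hex())
```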
The ISO/IEC 2022 escape sequence to designate the supplementary set of ISO/IEC 6937 as the G2 set is ESC . R (hex 1B 2E 52).{{cite iso-ir |number=156 |sponsor=ISO/IEC JTC 1/SC 2/WG 3 |sponsor-link=ISO/IEC JTC 1/SC 2 |title=Supplementary Set of ISO/IEC 6937:1992 |date=1991-12-15}} (The left-hand side is [https://www.itscj-ipsj.jp/ir/006.pdf US-ASCII].) The older ISO 6937/2:1983 supplementary set is registered as a 94-code set, and designated to G2 with ESC * l (hex 1B 2A 6C).{{cite iso-ir |number=90 |sponsor=ISO/TC97/SC2/WG4 |sponsor-link=ISO/IEC JTC 1/SC 2#History |title=Supplementary Set of Latin Alphabetic and non-Alphabetic Graphic Characters |date=1985-01-10}}
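The two designation sequences differ only in the intermediate byte, which selects a 94-code or 96-code set for G2, and in the final byte, which identifies the registered set. A small Python sketch, with the helper name `designate_g2` chosen here purely for illustration, builds both:

```python
ESC = 0x1B
G2_94_SET = 0x2A  # '*': designate a 94-code set as G2
G2_96_SET = 0x2E  # '.': designate a 96-code set as G2


def designate_g2(final_byte: str, is_96_set: bool) -> bytes:
    """Build the ISO/IEC 2022 sequence designating a registered set as G2."""
    intermediate = G2_96_SET if is_96_set else G2_94_SET
    return bytes([ESC, intermediate, ord(final_byte)])


# Supplementary set of ISO/IEC 6937:1992 (96-code set, final byte 'R').
print(designate_g2("R", is_96_set=True).hex(" "))
# Older ISO 6937/2:1983 supplementary set (94-code set, final byte 'l').
print(designate_g2("l", is_96_set=False).hex(" "))
```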
Two byte characters
Accented letters which are not allocated single codes in the primary or supplementary set are coded using two bytes. The first byte, the "non-spacing diacritical mark", is followed by a letter from the primary set, e.g.:
small e with acute accent (é) = [Acute]+e
The ITU T.51 standard allocates column 4 of the supplementary set (i.e. 0xC0–CF when used in 8-bit format) to non-spacing diacritic characters.{{citation|mode=cs1 |url=https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.51-199209-I!!PDF-E&type=items |title=Latin based coded character sets for telematic services |id=Recommendation T.51 |date=1992-09-18 |author=CCITT |author-link=ITU-T |edition=1992}} However, ISO/IEC 6937 defines a fully specified character repertoire, mapping a list of composition sequences to ISO/IEC 10646 character names which match those defined in Unicode. The isolated nonspacing bytes are not included in this repertoire, although spacing variants of the diacritics not otherwise present in ASCII are included, with the ASCII space being the trail byte.{{citation|mode=cs1 |url=http://open-std.org/JTC1/sc2/wg3/docs/n454.pdf |title=WD 6937, Coded graphic character set for text communication - Latin alphabet |author=ISO/IEC JTC 1/SC 2/WG 3 |author-link=ISO/IEC JTC 1/SC 2 |id=JTC1/SC2/N454 |date=1998-04-15}}{{Cite book|url=https://books.google.com/books?id=YH_LBQAAQBAJ&pg=PA888|title=The Telecommunications Illustrated Dictionary|last=Petersen|first=J. K.|date=2002-05-29|publisher=CRC Press|isbn=978-1-4200-4067-8|pages=888|language=en}} Hence, only certain combinations of lead byte and follow byte conform to the ISO/IEC standard.
This repertoire is also affixed to the ITU version of the specification as Annex A, although the ITU version does not reference it from the main text. It is described as a "unified superset" of the Latin-script character repertoires. It corresponds to the repertoire of ISO/IEC 10367 when the ASCII, Latin-1 (or Latin-5), Latin-2 and supplementary Latin sets are used.
This system also differs from the Unicode combining character system in that the diacritic code precedes the letter (as opposed to following it), making it more similar to ANSEL.
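The difference in ordering can be shown with a short Python sketch that swaps the two elements and lets Unicode normalization compose them. The mapping below covers only four lead bytes and is an assumption made for illustration; as noted elsewhere in this article, such a naive mapping does not hold for every sequence in the repertoire. The name `pair_to_unicode` is hypothetical.

```python
import unicodedata

# Illustrative subset: lead byte -> Unicode combining mark.
COMBINING_MARK = {
    0xC1: "\u0300",  # grave
    0xC2: "\u0301",  # acute
    0xC3: "\u0302",  # circumflex
    0xC8: "\u0308",  # diaeresis
}


def pair_to_unicode(lead: int, base: int) -> str:
    """Convert a (diacritic, letter) byte pair to a Unicode string.

    ISO/IEC 6937 sends the diacritic first; Unicode expects the base
    letter first, so the order is swapped before NFC composition.
    """
    mark = COMBINING_MARK[lead]
    return unicodedata.normalize("NFC", chr(base) + mark)


print(pair_to_unicode(0xC2, ord("e")))  # é
```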
A minor anomaly is that Latin Small Letter G with Cedilla is coded as if it carried an acute accent, that is, with a 0xC2 lead byte. Because the descender of the lowercase letter interferes with a cedilla, the lowercase form is usually written with a turned comma above: {{nobr|Ģ ģ}}.
In total, 13 diacritical marks can be followed by selected characters from the primary set:
{| class="wikitable"
! Accent !! Code !! Second character !! Result
|-
| Grave || 0xC1 || AEIOUaeiou || ÀÈÌÒÙàèìòù
|-
| Acute || 0xC2 || ACEILNORSUYZacegilnorsuyz || ÁĆÉÍĹŃÓŔŚÚÝŹáćéģíĺńóŕśúýź
|-
| Circumflex || 0xC3 || ACEGHIJOSUWYaceghijosuwy || ÂĈÊĜĤÎĴÔŜÛŴŶâĉêĝĥîĵôŝûŵŷ
|-
| Tilde || 0xC4 || AINOUainou || ÃĨÑÕŨãĩñõũ
|-
| Macron || 0xC5 || AEIOUaeiou || ĀĒĪŌŪāēīōū
|-
| Breve || 0xC6 || AGUagu || ĂĞŬăğŭ
|-
| Dot || 0xC7 || CEGIZcegz || ĊĖĠİŻċėġż
|-
| Umlaut or diæresis || 0xC8 || AEIOUYaeiouy || ÄËÏÖÜŸäëïöüÿ
|-
| colspan="4" |
|-
| Ring || 0xCA || AUau || ÅŮåů
|-
| Cedilla || 0xCB || CGKLNRSTcklnrst || ÇĢĶĻŅŖŞŢçķļņŗşţ
|-
| colspan="4" |
|-
| Double Acute || 0xCD || OUou || ŐŰőű
|-
| Ogonek || 0xCE || AEIUaeiu || ĄĘĮŲąęįų
|-
| Caron || 0xCF || CDELNRSTZcdelnrstz || ČĎĚĽŇŘŠŤŽčďěľňřšťž
|}
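Because only the listed combinations conform, a decoder can be driven directly by the table above instead of by a general composition rule. The following Python sketch transcribes the table and rejects any other lead and follow byte pair. It is an illustration under stated assumptions: it handles only the 8-bit form, only unaccented primary-set bytes and the two-byte sequences from the table, and the names `TABLE`, `COMPOSE` and `decode` are hypothetical.

```python
import unicodedata

# Lead byte -> (permitted second characters, resulting letters),
# transcribed from the table above.
TABLE = {
    0xC1: ("AEIOUaeiou", "ÀÈÌÒÙàèìòù"),
    0xC2: ("ACEILNORSUYZacegilnorsuyz", "ÁĆÉÍĹŃÓŔŚÚÝŹáćéģíĺńóŕśúýź"),
    0xC3: ("ACEGHIJOSUWYaceghijosuwy", "ÂĈÊĜĤÎĴÔŜÛŴŶâĉêĝĥîĵôŝûŵŷ"),
    0xC4: ("AINOUainou", "ÃĨÑÕŨãĩñõũ"),
    0xC5: ("AEIOUaeiou", "ĀĒĪŌŪāēīōū"),
    0xC6: ("AGUagu", "ĂĞŬăğŭ"),
    0xC7: ("CEGIZcegz", "ĊĖĠİŻċėġż"),
    0xC8: ("AEIOUYaeiouy", "ÄËÏÖÜŸäëïöüÿ"),
    0xCA: ("AUau", "ÅŮåů"),
    0xCB: ("CGKLNRSTcklnrst", "ÇĢĶĻŅŖŞŢçķļņŗşţ"),
    0xCD: ("OUou", "ŐŰőű"),
    0xCE: ("AEIUaeiu", "ĄĘĮŲąęįų"),
    0xCF: ("CDELNRSTZcdelnrstz", "ČĎĚĽŇŘŠŤŽčďěľňřšťž"),
}

# (lead byte, follow byte) -> precomposed Unicode character.
COMPOSE = {}
for lead, (bases, results) in TABLE.items():
    # Normalize so each result is a single precomposed code point.
    results = unicodedata.normalize("NFC", results)
    assert len(bases) == len(results)
    for base, result in zip(bases, results):
        COMPOSE[(lead, ord(base))] = result


def decode(data: bytes) -> str:
    """Decode 8-bit text made of primary-set bytes and table sequences."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in TABLE:
            if i + 1 >= len(data):
                raise ValueError("lead byte without a follow byte")
            key = (byte, data[i + 1])
            if key not in COMPOSE:
                raise ValueError(f"non-conforming sequence {key[0]:02X} {key[1]:02X}")
            out.append(COMPOSE[key])
            i += 2
        elif 0x20 <= byte <= 0x7E:
            out.append(chr(byte))
            i += 1
        else:
            raise ValueError(f"byte 0x{byte:02X} is not handled by this sketch")
    return "".join(out)


print(decode(b"caf" + bytes([0xC2, 0x65])))  # café
```

A table lookup also handles the lowercase g case naturally: the pair 0xC2 followed by g yields ģ, which a generic "acute plus letter" composition would not produce.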
Codepage layout
The references to combining characters in the U+0300–U+036F range for the codes 0xC1–0xCF below are subject to the caveats mentioned above; the codes cannot simply be mapped to the code points listed. Also, Unicode distinguishes uppercase D with stroke from uppercase Eth, both of which are represented by 0xE2; the corresponding lowercase letters (0xF2 and 0xF3) are encoded separately and usually look different.
The older 1988 edition of ITU T.51 defined two versions of the supplementary set, with the first version lacking the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the second version. The first version was defined as an extension of the T.61 supplementary set, and the second version as an extension of the first version. The current (1992) edition only includes the second version, deprecates certain characters, and updates the primary set to the current ISO 646-IRV (ASCII), although existing telematic services are permitted to retain the older behaviour.
{{chset-table-header1|ISO/IEC 6937 or ITU T.51 (Latin)}} | ||||||||||||||||
{{chset-left1|0x}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}} |
{{chset-left1|1x}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}} |
{{chset-left1|2x}}
|{{chset-ctrl1|U+0020 SPACE| SP }} |{{chset-cell1|U+0021 EXCLAMATION MARK|!}} |{{chset-cell1|U+0022 QUOTATION MARK|"}} |{{chset-cell1|U+0023 NUMBER SIGN|#|style=background:#FFD}} |{{chset-cell1|U+0024 DOLLAR SIGN / U+00A4 CURRENCY SIGN|$/¤|36 |fn={{efn|Continued use for ¤ permitted for existing CCITT services only.{{refn|name=T.51}}}} |style=background:#FFD}} |{{chset-cell1|U+0025 PERCENT SIGN|%}} |{{chset-cell1|U+0026 AMPERSAND|&}} |{{chset-cell1|U+0027 APOSTROPHE|'}} |{{chset-cell1|U+0028 LEFT PARENTHESIS|(}} |{{chset-cell1|U+0029 RIGHT PARENTHESIS|)}} |{{chset-cell1|U+002A ASTERISK|*}} |{{chset-cell1|U+002B PLUS SIGN|+}} |{{chset-cell1|U+002C COMMA|,}} |{{chset-cell1|U+002D HYPHEN-MINUS|-}} |{{chset-cell1|U+002E FULL STOP|.}} |{{chset-cell1|U+002F SOLIDUS|/}} | ||||||||||||||||
{{chset-left1|3x}}
|{{chset-cell1|U+0030 DIGIT ZERO|0}} |{{chset-cell1|U+0031 DIGIT ONE|1}} |{{chset-cell1|U+0032 DIGIT TWO|2}} |{{chset-cell1|U+0033 DIGIT THREE|3}} |{{chset-cell1|U+0034 DIGIT FOUR|4}} |{{chset-cell1|U+0035 DIGIT FIVE|5}} |{{chset-cell1|U+0036 DIGIT SIX|6}} |{{chset-cell1|U+0037 DIGIT SEVEN|7}} |{{chset-cell1|U+0038 DIGIT EIGHT|8}} |{{chset-cell1|U+0039 DIGIT NINE|9}} |{{chset-cell1|U+003A COLON|:}} |{{chset-cell1|U+003B SEMICOLON|;}} |{{chset-cell1|U+003C LESS-THAN SIGN|<}} |{{chset-cell1|U+003D EQUALS SIGN|=}} |{{chset-cell1|U+003E GREATER-THAN SIGN|>}} |{{chset-cell1|U+003F QUESTION MARK|?}} | ||||||||||||||||
{{chset-left1|4x}}
|{{chset-cell1|U+0040 COMMERCIAL AT|@}} |{{chset-cell1|U+0041 LATIN CAPITAL LETTER A|A}} |{{chset-cell1|U+0042 LATIN CAPITAL LETTER B|B}} |{{chset-cell1|U+0043 LATIN CAPITAL LETTER C|C}} |{{chset-cell1|U+0044 LATIN CAPITAL LETTER D|D}} |{{chset-cell1|U+0045 LATIN CAPITAL LETTER E|E}} |{{chset-cell1|U+0046 LATIN CAPITAL LETTER F|F}} |{{chset-cell1|U+0047 LATIN CAPITAL LETTER G|G}} |{{chset-cell1|U+0048 LATIN CAPITAL LETTER H|H}} |{{chset-cell1|U+0049 LATIN CAPITAL LETTER I|I}} |{{chset-cell1|U+004A LATIN CAPITAL LETTER J|J}} |{{chset-cell1|U+004B LATIN CAPITAL LETTER K|K}} |{{chset-cell1|U+004C LATIN CAPITAL LETTER L|L}} |{{chset-cell1|U+004D LATIN CAPITAL LETTER M|M}} |{{chset-cell1|U+004E LATIN CAPITAL LETTER N|N}} |{{chset-cell1|U+004F LATIN CAPITAL LETTER O|O}} | ||||||||||||||||
{{chset-left1|5x}}
|{{chset-cell1|U+0050 LATIN CAPITAL LETTER P|P}} |{{chset-cell1|U+0051 LATIN CAPITAL LETTER Q|Q}} |{{chset-cell1|U+0052 LATIN CAPITAL LETTER R|R}} |{{chset-cell1|U+0053 LATIN CAPITAL LETTER S|S}} |{{chset-cell1|U+0054 LATIN CAPITAL LETTER T|T}} |{{chset-cell1|U+0055 LATIN CAPITAL LETTER U|U}} |{{chset-cell1|U+0056 LATIN CAPITAL LETTER V|V}} |{{chset-cell1|U+0057 LATIN CAPITAL LETTER W|W}} |{{chset-cell1|U+0058 LATIN CAPITAL LETTER X|X}} |{{chset-cell1|U+0059 LATIN CAPITAL LETTER Y|Y}} |{{chset-cell1|U+005A LATIN CAPITAL LETTER Z|Z}} |{{chset-cell1|U+005B LEFT SQUARE BRACKET|[}} |{{chset-cell1|U+005C REVERSE SOLIDUS|\|style=background:#FFD}} |{{chset-cell1|U+005D RIGHT SQUARE BRACKET|]}} |{{chset-cell1|U+005E CIRCUMFLEX ACCENT|^|style=background:#FFD}} |{{chset-cell1|U+005F LOW LINE|_}} | ||||||||||||||||
{{chset-left1|6x}}
|{{chset-cell1|U+0060 GRAVE ACCENT|`|style=background:#FFD}} |{{chset-cell1|U+0061 LATIN SMALL LETTER A|a}} |{{chset-cell1|U+0062 LATIN SMALL LETTER B|b}} |{{chset-cell1|U+0063 LATIN SMALL LETTER C|c}} |{{chset-cell1|U+0064 LATIN SMALL LETTER D|d}} |{{chset-cell1|U+0065 LATIN SMALL LETTER E|e}} |{{chset-cell1|U+0066 LATIN SMALL LETTER F|f}} |{{chset-cell1|U+0067 LATIN SMALL LETTER G|g}} |{{chset-cell1|U+0068 LATIN SMALL LETTER H|h}} |{{chset-cell1|U+0069 LATIN SMALL LETTER I|i}} |{{chset-cell1|U+006A LATIN SMALL LETTER J|j}} |{{chset-cell1|U+006B LATIN SMALL LETTER K|k}} |{{chset-cell1|U+006C LATIN SMALL LETTER L|l}} |{{chset-cell1|U+006D LATIN SMALL LETTER M|m}} |{{chset-cell1|U+006E LATIN SMALL LETTER N|n}} |{{chset-cell1|U+006F LATIN SMALL LETTER O|o}} | ||||||||||||||||
{{chset-left1|7x}}
|{{chset-cell1|U+0070 LATIN SMALL LETTER P|p}} |{{chset-cell1|U+0071 LATIN SMALL LETTER Q|q}} |{{chset-cell1|U+0072 LATIN SMALL LETTER R|r}} |{{chset-cell1|U+0073 LATIN SMALL LETTER S|s}} |{{chset-cell1|U+0074 LATIN SMALL LETTER T|t}} |{{chset-cell1|U+0075 LATIN SMALL LETTER U|u}} |{{chset-cell1|U+0076 LATIN SMALL LETTER V|v}} |{{chset-cell1|U+0077 LATIN SMALL LETTER W|w}} |{{chset-cell1|U+0078 LATIN SMALL LETTER X|x}} |{{chset-cell1|U+0079 LATIN SMALL LETTER Y|y}} |{{chset-cell1|U+007A LATIN SMALL LETTER Z|z}} |{{chset-cell1|U+007B LEFT CURLY BRACKET|{ |style=background:#FFD}} |{{chset-cell1|U+007C VERTICAL LINE|{{pipe}}}} |{{chset-cell1|U+007D RIGHT CURLY BRACKET|} |style=background:#FFD}} |{{chset-cell1|U+007E TILDE|~|style=background:#FFD}} |{{chset-cell1 | |style=background:#DDD}} | |||||||||||||||
{{chset-left1|8x}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}} |
{{chset-left1|9x}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}} |
{{chset-left1|Ax}}
|{{chset-ctrl1|U+00A0 NO-BREAK SPACE|NBSP}} |{{chset-cell1|U+00A1 INVERTED EXCLAMATION MARK|¡}} |{{chset-cell1|U+00A2 CENT SIGN|¢}} |{{chset-cell1|U+00A3 POUND SIGN|£}} |{{chset-cell1|U+0024 DOLLAR SIGN|$|164|fn={{efn|Permitted for existing CCITT services only, otherwise the ASCII representation should be used.{{refn|name=T.51}}|name=bk}} }} |{{chset-cell1|U+00A5 YEN SIGN|¥}} |{{chset-cell1|U+0023 NUMBER SIGN|#|166|fn={{efn|name=bk}} }} |{{chset-cell1|U+00A7 SECTION SIGN|§}} |{{chset-cell1|U+00A4 CURRENCY SIGN|¤}} |{{chset-cell1|U+2018 LEFT SINGLE QUOTATION MARK|‘|style=background:#FFD}} |{{chset-cell1|U+201C LEFT DOUBLE QUOTATION MARK|“|style=background:#FFD}} |{{chset-cell1|U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK|«}} |{{chset-cell1|U+2190 LEFTWARDS ARROW|←|style=background:#FFD}} |{{chset-cell1|U+2191 UPWARDS ARROW|↑|style=background:#FFD}} |{{chset-cell1|U+2192 RIGHTWARDS ARROW|→|style=background:#FFD}} |{{chset-cell1|U+2193 DOWNWARDS ARROW|↓|style=background:#FFD}} | ||||||||||||||||
{{chset-left1|Bx}}
|{{chset-cell1|U+00B0 DEGREE SIGN|°}} |{{chset-cell1|U+00B1 PLUS-MINUS SIGN|±}} |{{chset-cell1|U+00B2 SUPERSCRIPT TWO|²}} |{{chset-cell1|U+00B3 SUPERSCRIPT THREE|³}} |{{chset-cell1|U+00D7 MULTIPLICATION SIGN|×}} |{{chset-cell1|U+00B5 MICRO SIGN|µ}} |{{chset-cell1|U+00B6 PILCROW SIGN|¶}} |{{chset-cell1|U+00B7 MIDDLE DOT|·}} |{{chset-cell1|U+00F7 DIVISION SIGN|÷}} |{{chset-cell1|U+2019 RIGHT SINGLE QUOTATION MARK|’|style=background:#FFD}} |{{chset-cell1|U+201D RIGHT DOUBLE QUOTATION MARK|”|style=background:#FFD}} |{{chset-cell1|U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK|»}} |{{chset-cell1|U+00BC VULGAR FRACTION ONE QUARTER|¼}} |{{chset-cell1|U+00BD VULGAR FRACTION ONE HALF|½}} |{{chset-cell1|U+00BE VULGAR FRACTION THREE QUARTERS|¾}} |{{chset-cell1|U+00BF INVERTED QUESTION MARK|¿}} | ||||||||||||||||
|{{chset-left1|Cx}} |{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+0300 COMBINING GRAVE ACCENT|◌̀}} |{{chset-cell1|U+0301 COMBINING ACUTE ACCENT|◌́}} |{{chset-cell1|U+0302 COMBINING CIRCUMFLEX ACCENT|◌̂}} |{{chset-cell1|U+0303 COMBINING TILDE|◌̃}} |{{chset-cell1|U+0304 COMBINING MACRON|◌̄}} |{{chset-cell1|U+0306 COMBINING BREVE|◌̆}} |{{chset-cell1|U+0307 COMBINING DOT ABOVE|◌̇}} |{{chset-cell1|U+0308 COMBINING DIAERESIS|◌̈}} |{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+030A COMBINING RING ABOVE|◌̊}} |{{chset-cell1|U+0327 COMBINING CEDILLA|◌̧}} |{{chset-cell1|U+0332 COMBINING LOW LINE|◌̲|fn={{efn|Noted in the ITU version of the standard as having existing use for underlined text, in combination with any other character including accented characters. Although the 1988 ITU edition includes this code,{{refn|{{citation|mode=cs1 |url=https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.51-198811-S!!PDF-E&type=items |title=Coded character sets for telematic services |id=Recommendation T.51 |date=1988 |author=CCITT |author-link=ITU-T |edition=1988}} |name=T.51-1988}} the 1992 ITU edition discourages sending this code in favour of ANSI escape sequences, although it does mention that it should be correctly interpreted when received by applicable systems.{{refn|name=T.51}} Previous editions of the ISO/IEC version of the standard also allowed combining this code with any character in the defined repertoire,{{refn|name=reg090}} whereas more recent revisions do not include this code.{{refn|name=wd6937}} }} }} |{{chset-cell1|U+030B COMBINING DOUBLE ACUTE ACCENT|◌̋}} |{{chset-cell1|U+0328 COMBINING OGONEK|◌̨}} |{{chset-cell1|U+030C COMBINING CARON|◌̌}} | ||||||||||||||
{{chset-left1|Dx}}
|{{chset-cell1|U+2015 HORIZONTAL BAR|―|style=background:#FFD}} |{{chset-cell1|U+00B9 SUPERSCRIPT ONE|¹|style=background:#FFD}} |{{chset-cell1|U+00AE REGISTERED SIGN|®|style=background:#FFD}} |{{chset-cell1|U+00A9 COPYRIGHT SIGN|©|style=background:#FFD}} |{{chset-cell1|U+2122 TRADE MARK SIGN|™|style=background:#FFD}} |{{chset-cell1|U+266A EIGHTH NOTE|♪|style=background:#FFD}} |{{chset-cell1|U+00AC NOT SIGN|¬|style=background:#FFD}} |{{chset-cell1|U+00A6 BROKEN BAR|¦|style=background:#FFD}} |{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+215B VULGAR FRACTION ONE EIGHTH|⅛|style=background:#FFD}} |{{chset-cell1|U+215C VULGAR FRACTION THREE EIGHTHS|⅜|style=background:#FFD}} |{{chset-cell1|U+215D VULGAR FRACTION FIVE EIGHTHS|⅝|style=background:#FFD}} |{{chset-cell1|U+215E VULGAR FRACTION SEVEN EIGHTHS|⅞|style=background:#FFD}} | ||||||||||||
{{chset-left1|Ex}}
|{{chset-cell1|U+2126 OHM SIGN|Ω}} |{{chset-cell1|U+00C6 LATIN CAPITAL LETTER AE|Æ}} |{{chset-cell1|U+0110 LATIN CAPITAL LETTER D WITH STROKE / U+00D0 LATIN CAPITAL LETTER ETH|Đ/Ð}} |{{chset-cell1|U+00AA FEMININE ORDINAL INDICATOR|ª}} |{{chset-cell1|U+0126 LATIN CAPITAL LETTER H WITH STROKE|Ħ}} |{{chset-cell1 | |fn={{efn|An early draft placed ȷ in this position.}}|style=background:#DDD}}
|{{chset-cell1|U+0132 LATIN CAPITAL LIGATURE IJ|IJ}} |{{chset-cell1|U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT|Ŀ}} |{{chset-cell1|U+0141 LATIN CAPITAL LETTER L WITH STROKE|Ł}} |{{chset-cell1|U+00D8 LATIN CAPITAL LETTER O WITH STROKE|Ø}} |{{chset-cell1|U+0152 LATIN CAPITAL LIGATURE OE|Œ}} |{{chset-cell1|U+00BA MASCULINE ORDINAL INDICATOR|º}} |{{chset-cell1|U+00DE LATIN CAPITAL LETTER THORN|Þ}} |{{chset-cell1|U+0166 LATIN CAPITAL LETTER T WITH STROKE|Ŧ}} |{{chset-cell1|U+014A LATIN CAPITAL LETTER ENG|Ŋ}} |{{chset-cell1|U+0149 LATIN SMALL LETTER N PRECEDED BY APOSTROPHE|ʼn}} | |||||||||||||||
{{chset-left1|Fx}}
|{{chset-cell1|U+0138 LATIN SMALL LETTER KRA|ĸ}} |{{chset-cell1|U+00E6 LATIN SMALL LETTER AE|æ}} |{{chset-cell1|U+0111 LATIN SMALL LETTER D WITH STROKE|đ}} |{{chset-cell1|U+00F0 LATIN SMALL LETTER ETH|ð}} |{{chset-cell1|U+0127 LATIN SMALL LETTER H WITH STROKE|ħ}} |{{chset-cell1|U+0131 LATIN SMALL LETTER DOTLESS I|ı}} |{{chset-cell1|U+0133 LATIN SMALL LIGATURE IJ|ij}} |{{chset-cell1|U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT|ŀ}} |{{chset-cell1|U+0142 LATIN SMALL LETTER L WITH STROKE|ł}} |{{chset-cell1|U+00F8 LATIN SMALL LETTER O WITH STROKE|ø}} |{{chset-cell1|U+0153 LATIN SMALL LIGATURE OE|œ}} |{{chset-cell1|U+00DF LATIN SMALL LETTER SHARP S|ß}} |{{chset-cell1|U+00FE LATIN SMALL LETTER THORN|þ}} |{{chset-cell1|U+0167 LATIN SMALL LETTER T WITH STROKE|ŧ}} |{{chset-cell1|U+014B LATIN SMALL LETTER ENG|ŋ}} |{{chset-ctrl1|U+00AD SOFT HYPHEN|SHY|style=background:#FFD}} |
{{legend|#FFD|Differences from T.61}}
= Videotex version =
{{main|Videotex character set}}
The versions of the supplementary set used by the ITU T.101 standard for Videotex are based on the first supplementary set of the 1988 edition of T.51.
The default G2 set for Data Syntax 2 adds a ΅ at 0xC0, for combination with codes from a Greek primary set.{{cite iso-ir |number=70 |date=1988-11-01 |title=Supplementary Set of Graphic Characters for Videotex |sponsor=CCITT |sponsor-link=ITU-T}}
The supplementary set for Data Syntax 3 adds non-spacing marks for a "vector overbar" and solidus and several semigraphic characters.{{cite iso-ir |number=128 |date=1986-11-30 |title=Supplementary Set of Graphic Characters for CCITT Recommendation T.101, Data Syntax III |sponsor=CCITT |sponsor-link=ITU-T}}
= ETS 300 706 version =
The ETS 300 706 standard for World System Teletext bases its G2 set on ISO 6937.{{citation|mode=cs1|section=15.6.3 Latin G2 Set|id=ETS 300 706|title=Enhanced Teletext specification (PDF)|author=ETSI|author-link=European Telecommunications Standards Institute|page=116|date=1997|url=https://www.etsi.org/deliver/etsi_i_ets/300700_300799/300706/01_60/ets_300706e01p.pdf}} It is a superset of the supplementary set of T.61, and a superset of the first supplementary set of the 1988 edition of T.51, but collides with the current edition of T.51 in certain positions. Diacritic codes in the ETS version are specified as being "for association with" characters from the G0 set in use, such as US-ASCII or BS_viewdata. This version is shown in the chart below.
{{chset-table-header1|World System Teletext, Latin G2 Set (ETS 300 706:1997)}} | |
{{chset-left1|Ax}}
|{{chset-ctrl1|U+00A0 SPACE| SP |style=background:#FFD}} |{{chset-cell1|U+00A1 INVERTED EXCLAMATION MARK|¡}} |{{chset-cell1|U+00A2 CENT SIGN|¢}} |{{chset-cell1|U+00A3 POUND SIGN|£}} |{{chset-cell1|U+0024 DOLLAR SIGN|$}} |{{chset-cell1|U+00A5 YEN SIGN|¥}} |{{chset-cell1|U+0023 NUMBER SIGN|#}} |{{chset-cell1|U+00A7 SECTION SIGN|§}} |{{chset-cell1|U+00A4 CURRENCY SIGN|¤}} |{{chset-cell1|U+2018 LEFT SINGLE QUOTATION MARK|‘}} |{{chset-cell1|U+201C LEFT DOUBLE QUOTATION MARK|“}} |{{chset-cell1|U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK|«}} |{{chset-cell1|U+2190 LEFTWARDS ARROW|←}} |{{chset-cell1|U+2191 UPWARDS ARROW|↑}} |{{chset-cell1|U+2192 RIGHTWARDS ARROW|→}} |{{chset-cell1|U+2193 DOWNWARDS ARROW|↓}} | |
---|---|
{{chset-left1|Bx}}
|{{chset-cell1|U+00B0 DEGREE SIGN|°}} |{{chset-cell1|U+00B1 PLUS-MINUS SIGN|±}} |{{chset-cell1|U+00B2 SUPERSCRIPT TWO|²}} |{{chset-cell1|U+00B3 SUPERSCRIPT THREE|³}} |{{chset-cell1|U+00D7 MULTIPLICATION SIGN|×}} |{{chset-cell1|U+00B5 MICRO SIGN|µ}} |{{chset-cell1|U+00B6 PILCROW SIGN|¶}} |{{chset-cell1|U+00B7 MIDDLE DOT|·}} |{{chset-cell1|U+00F7 DIVISION SIGN|÷}} |{{chset-cell1|U+2019 RIGHT SINGLE QUOTATION MARK|’}} |{{chset-cell1|U+201D RIGHT DOUBLE QUOTATION MARK|”}} |{{chset-cell1|U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK|»}} |{{chset-cell1|U+00BC VULGAR FRACTION ONE QUARTER|¼}} |{{chset-cell1|U+00BD VULGAR FRACTION ONE HALF|½}} |{{chset-cell1|U+00BE VULGAR FRACTION THREE QUARTERS|¾}} |{{chset-cell1|U+00BF INVERTED QUESTION MARK|¿}} | |
{{chset-left1|Cx}}
|{{chset-cell1 | |style=background:#DDD}}
|{{chset-cell1|U+0300 COMBINING GRAVE ACCENT|◌̀}} |{{chset-cell1|U+0301 COMBINING ACUTE ACCENT|◌́}} |{{chset-cell1|U+0302 COMBINING CIRCUMFLEX ACCENT|◌̂}} |{{chset-cell1|U+0303 COMBINING TILDE|◌̃}} |{{chset-cell1|U+0304 COMBINING MACRON|◌̄}} |{{chset-cell1|U+0306 COMBINING BREVE|◌̆}} |{{chset-cell1|U+0307 COMBINING DOT ABOVE|◌̇}} |{{chset-cell1|U+0308 COMBINING DIAERESIS|◌̈}} |{{chset-cell1|U+0323 COMBINING DOT BELOW|̣◌̣|style=background:#FFD}} |{{chset-cell1|U+030A COMBINING RING ABOVE|◌̊}} |{{chset-cell1|U+0327 COMBINING CEDILLA|◌̧}} |{{chset-cell1|U+0332 COMBINING LOW LINE|◌̲}} |{{chset-cell1|U+030B COMBINING DOUBLE ACUTE ACCENT|◌̋}} |{{chset-cell1|U+0328 COMBINING OGONEK|◌̨}} |{{chset-cell1|U+030C COMBINING CARON|◌̌}} |
{{chset-left1|Dx}}
|{{chset-cell1|U+2015 HORIZONTAL BAR|―}} |{{chset-cell1|U+00B9 SUPERSCRIPT ONE|¹}} |{{chset-cell1|U+00AE REGISTERED SIGN|®}} |{{chset-cell1|U+00A9 COPYRIGHT SIGN|©}} |{{chset-cell1|U+2122 TRADE MARK SIGN|™}} |{{chset-cell1|U+266A EIGHTH NOTE|♪}} |{{chset-cell1|U+20A0 EURO-CURRENCY SIGN|₠|style=background:#FFD}} |{{chset-cell1|U+2030 PER MILLE SIGN|‰|style=background:#FFD}} |{{chset-cell1|U+03B1 GREEK SMALL LETTER ALPHA|α|style=background:#FFD}} |{{chset-cell1|||style=background:#DDD}} |{{chset-cell1|||style=background:#DDD}} |{{chset-cell1|||style=background:#DDD}} |{{chset-cell1|U+215B VULGAR FRACTION ONE EIGHTH|⅛}} |{{chset-cell1|U+215C VULGAR FRACTION THREE EIGHTHS|⅜}} |{{chset-cell1|U+215D VULGAR FRACTION FIVE EIGHTHS|⅝}} |{{chset-cell1|U+215E VULGAR FRACTION SEVEN EIGHTHS|⅞}} | |
{{chset-left1|Ex}}
|{{chset-cell1|U+2126 OHM SIGN|Ω}} |{{chset-cell1|U+00C6 LATIN CAPITAL LETTER AE|Æ}} |{{chset-cell1|U+0110 LATIN CAPITAL LETTER D WITH STROKE/00D0|Đ/Ð}} |{{chset-cell1|U+00AA FEMININE ORDINAL INDICATOR|ª}} |{{chset-cell1|U+0126 LATIN CAPITAL LETTER H WITH STROKE|Ħ}} |{{chset-cell1|||style=background:#DDD}} |{{chset-cell1|U+0132 LATIN CAPITAL LIGATURE IJ|IJ}} |{{chset-cell1|U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT|Ŀ}} |{{chset-cell1|U+0141 LATIN CAPITAL LETTER L WITH STROKE|Ł}} |{{chset-cell1|U+00D8 LATIN CAPITAL LETTER O WITH STROKE|Ø}} |{{chset-cell1|U+0152 LATIN CAPITAL LIGATURE OE|Œ}} |{{chset-cell1|U+00BA MASCULINE ORDINAL INDICATOR|º}} |{{chset-cell1|U+00DE LATIN CAPITAL LETTER THORN|Þ}} |{{chset-cell1|U+0166 LATIN CAPITAL LETTER T WITH STROKE|Ŧ}} |{{chset-cell1|U+014A LATIN CAPITAL LETTER ENG|Ŋ}} |{{chset-cell1|U+0149 LATIN SMALL LETTER N PRECEDED BY APOSTROPHE|ʼn}} | |
{{chset-left1|Fx}}
|{{chset-cell1|U+0138 LATIN SMALL LETTER KRA|ĸ}} |{{chset-cell1|U+00E6 LATIN SMALL LETTER AE|æ}} |{{chset-cell1|U+0111 LATIN SMALL LETTER D WITH STROKE|đ}} |{{chset-cell1|U+00F0 LATIN SMALL LETTER ETH|ð}} |{{chset-cell1|U+0127 LATIN SMALL LETTER H WITH STROKE|ħ}} |{{chset-cell1|U+0131 LATIN SMALL LETTER DOTLESS I|ı}} |{{chset-cell1|U+0133 LATIN SMALL LIGATURE IJ|ij}} |{{chset-cell1|U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT|ŀ}} |{{chset-cell1|U+0142 LATIN SMALL LETTER L WITH STROKE|ł}} |{{chset-cell1|U+00F8 LATIN SMALL LETTER O WITH STROKE|ø}} |{{chset-cell1|U+0153 LATIN SMALL LIGATURE OE|œ}} |{{chset-cell1|U+00DF LATIN SMALL LETTER SHARP S|ß}} |{{chset-cell1|U+00FE LATIN SMALL LETTER THORN|þ}} |{{chset-cell1|U+0167 LATIN SMALL LETTER T WITH STROKE|ŧ}} |{{chset-cell1|U+014B LATIN SMALL LETTER ENG|ŋ}} |{{chset-cell1|U+25A0 BLACK SQUARE|■|style=background:#FFD}} |
{{legend|#FFD|Differences from T.51}}
Footnotes
{{notelist}}
References
{{Reflist|2}}
External links
- [https://www.itu.int/rec/T-REC-T.51 ITU Recommendation T.51]
- ISO pages: [https://www.iso.org/standard/13466.html ISO 6937-1:1983], [https://www.iso.org/standard/13467.html ISO 6937-2:1983], [https://www.iso.org/standard/13468.html ISO 6937-2:1983/Add 1:1989], [https://www.iso.org/standard/13465.html ISO/IEC 6937:1994], [https://www.iso.org/standard/31393.html ISO/IEC 6937:2001]
- [http://std.dkuug.dk/JTC1/sc2/wg3/docs/n454.pdf WD 6937, Coded graphic character set for text communication - Latin alphabet (Revision of ISO/IEC 6937:1994)] (ISO/IEC 6937:1994 draft)
- [https://www.itscj-ipsj.jp/ir/156.pdf ISO-IR-156] (ISO-IR registration of right-hand part)
{{character encoding|state=uncollapsed}}
{{ISO standards}}
{{DEFAULTSORT:ISO IEC 6937}}