Private Use Areas
{{Short description|Purposely unassigned Unicode code points}}
{{about|the Unicode PUA range of codepoints|other uses|Private use area (disambiguation)}}
In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the standard.{{cite web |website=Unicode Consortium |url=https://unicode.org/glossary/#private_use_area |title=Glossary of Unicode Terms: "Private Use Area (PUA)"}} Three Private Use Areas are defined: one in the Basic Multilingual Plane ({{mono|U+E000–U+F8FF}}), and one each in, and nearly covering, planes 15 and 16 ({{mono|1=U+F0000–U+FFFFD}}, {{mono|1=U+100000–U+10FFFD}}). They are intentionally left undefined so that third parties may assign their own characters without conflicting with Unicode Standard assignments. Under the Unicode Stability Policy, the Private Use Areas will remain allocated for that purpose in all future Unicode versions.{{cite web |url=https://unicode.org/policies/stability_policy.html |title=Unicode Character Encoding Stability Policy |publisher=Unicode Consortium |access-date=2022-03-03 |date=2021-11-10}}
Assignments to private-use code points need not be "private" in the sense of strictly internal to an organisation; a number of assignment schemes have been published by several organisations. Such publication may include a font that supports the definition (showing the glyphs), and software making use of the private-use characters (e.g., a graphics character for a "print document" function). By definition, multiple private parties may assign different characters to the same code point, with the consequence that a user may see one private character from an installed font where a different one was intended.
Definition
Under the Unicode definition, code points in the Private Use Areas are not noncharacters, reserved, or unassigned. Their category is "Other, private use (Co)", and no character names are specified. No representative glyphs are provided, and character semantics are left to private agreement.
Private-use characters are assigned Unicode code points whose interpretation is not specified by this standard and whose use may be determined by private agreement among cooperating users. These characters are designated for private use and do not have defined, interpretable semantics except by private agreement.... No charts are provided for private-use characters, as any such characters are, by their very nature, defined only outside the context of this standard.{{cite book |chapter-url=https://www.unicode.org/versions/Unicode14.0.0/ch23.pdf |chapter=Chapter 23 Special Areas and Format Characters |title=The Unicode Standard Version 14.0 - Core Specification |isbn=978-1-936213-29-0 |date=September 2021 |publisher=Unicode Consortium |at=Private Use characters}}
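The properties described above can be checked programmatically. The following is a minimal sketch using Python's standard <code>unicodedata</code> module; the helper name <code>describe</code> is hypothetical and chosen only for illustration, and the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def describe(cp: int) -> tuple:
    """Return (general category, character name or None) for a code point.

    Hypothetical helper for illustration only.
    """
    ch = chr(cp)
    # unicodedata.name() raises ValueError for unnamed code points unless a
    # default is supplied, so None is passed as the fallback.
    return unicodedata.category(ch), unicodedata.name(ch, None)


# Private-use code points report category "Co" and carry no character name.
bmp_pua = describe(0xE000)
spua_b_last = describe(0x10FFFD)

# The last two code points of a plane are noncharacters, which report "Cn".
plane15_nonchar = describe(0xFFFFE)
```

In this sketch <code>bmp_pua</code> and <code>spua_b_last</code> both evaluate to <code>("Co", None)</code>, while the noncharacter U+FFFFE reports category <code>Cn</code>.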
{{anchor|PUA-A|PUA-B}}Blocks
There are three PUA blocks in Unicode.{{cite web |url=https://www.unicode.org/ucd/ |title=Unicode Character Database |work=The Unicode Standard |publisher=Unicode Consortium |access-date=2023-07-26}}{{cite web|url=https://www.unicode.org/versions/enumeratedversions.html|title=Enumerated Versions of The Unicode Standard|work=The Unicode Standard|access-date=2023-07-26}}
In the Basic Multilingual Plane (plane 0), the block titled Private Use Area (PUA) has 6400 code points.
Planes 15 and 16 are almost entirely assigned to two further Private Use Areas: Supplementary Private Use Area-A (SPUA-A) and Supplementary Private Use Area-B (SPUA-B). The last two code points of every plane are defined to be noncharacters; the remaining 65,534 code points of each of planes 15 and 16 are assigned as private-use characters. In UTF-16, a subset of the high surrogates (U+DB80..U+DBFF) is used for these and only these planes; these code units are called High Private Use Surrogates.
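As an illustration of the block sizes and the surrogate ranges mentioned above, the following Python sketch computes the size of each block and the UTF-16 code units of a code point. The names <code>PUA_BLOCKS</code>, <code>utf16_units</code> and <code>uses_high_private_use_surrogate</code> are hypothetical, introduced only for this example.

```python
# Inclusive block boundaries as given in this article.
PUA_BLOCKS = {
    "Private Use Area": (0xE000, 0xF8FF),
    "Supplementary Private Use Area-A": (0xF0000, 0xFFFFD),
    "Supplementary Private Use Area-B": (0x100000, 0x10FFFD),
}

# Number of code points in each block: 6400, 65534 and 65534.
BLOCK_SIZES = {name: hi - lo + 1 for name, (lo, hi) in PUA_BLOCKS.items()}


def utf16_units(cp: int) -> list:
    """Return the UTF-16 code units (as integers) that encode a code point."""
    data = chr(cp).encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def uses_high_private_use_surrogate(cp: int) -> bool:
    """True if the code point encodes as a surrogate pair whose high
    surrogate lies in U+DB80..U+DBFF."""
    units = utf16_units(cp)
    return len(units) == 2 and 0xDB80 <= units[0] <= 0xDBFF
```

For example, <code>utf16_units(0xF0000)</code> yields the pair <code>[0xDB80, 0xDC00]</code> and <code>utf16_units(0x10FFFD)</code> yields <code>[0xDBFF, 0xDFFD]</code>, whereas U+EFFFF in plane 14 uses the high surrogate 0xDB7F, just outside the private-use range.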
{{col begin|width=auto}}
{{col break}}
{{Infobox Unicode block
|style=float:left; margin-left:0;
|blockname = Private Use Area
|rangestart = E000
|rangeend = F8FF
|script1 = Unknown
|1_0_0 = 5632
|1_0_1 = 768
|codechart = https://unicode.org/charts/PDF/UE000.pdf
}}
{{col break}}
{{Infobox Unicode block
|style = float:left; margin-left:0;
|blockname = Supplementary Private Use Area-A
|rangestart = F0000
|rangeend = FFFFD
|script1 = Unknown
|2_0 = 65534
|codechart = https://unicode.org/charts/PDF/UF0000.pdf
}}
{{col break}}
{{Infobox Unicode block
|style = float:left; margin-left:0;
|blockname = Supplementary Private Use Area-B
|rangestart = 100000
|rangeend = 10FFFD
|script1 = Unknown
|2_0 = 65534
|codechart = https://unicode.org/charts/PDF/U100000.pdf
}}
{{col end}}
=History=
In Unicode 1.0.0, the Private Use Area extended from U+E800 to U+FDFF{{cite book |section-url=https://www.unicode.org/versions/Unicode1.0.0/ch03_5.pdf |section=3.5: Private Use Area |isbn=0-201-56788-1 |title=The Unicode Standard, Version 1.0, Volume 1 |year=1991 |publisher=Unicode Consortium |pages=118–119 |access-date=2021-10-11 |archive-date=2021-10-21 |archive-url=https://web.archive.org/web/20211021205258/https://www.unicode.org/versions/Unicode1.0.0/ch03_5.pdf |url-status=live }} (i.e. did not include U+E000..E7FF, but additionally included the U+F900..FDFF range now occupied by CJK Compatibility Ideographs, Alphabetic Presentation Forms and Arabic Presentation Forms-A). This was changed to U+E000..F8FF in Unicode 1.0.1,{{cite book |url=https://www.unicode.org/versions/Unicode1.0.0/Notice.pdf |title=Unicode 1.0.1 |series=The Unicode Standard |publisher=Unicode Consortium |date=1992-11-03 |access-date=2016-07-09 |archive-date=2016-07-02 |archive-url=https://web.archive.org/web/20160702004420/http://www.unicode.org/versions/Unicode1.0.0/Notice.pdf |url-status=live}} and remained so in Unicode 1.1.{{cite book |section-url=https://www.unicode.org/versions/Unicode1.1.0/ch02.pdf |section=2.0: Changes in Unicode 1.0 |title=The Unicode Standard, Version 1.1 |id=UTR #4 |publisher=Unicode Consortium |pages=3–4 |access-date=2021-10-11 |archive-date=2021-11-20 |archive-url=https://web.archive.org/web/20211120194908/https://www.unicode.org/versions/Unicode1.1.0/ch02.pdf |url-status=live }} The range U+D800..DFFF, used for UTF-16 surrogates since Unicode 2.0, was unassigned and not part of the Private Use Area in any Unicode 1.x version.
Planes E0 (224) through FF (255), and groups 60 (96) through 7F (127) of the Universal Coded Character Set (i.e. U+E00000 through U+FFFFFF and U+60000000 through U+7FFFFFFF) were also designated as private use. These ranges were removed when UCS was restricted to the seventeen planes reachable in UTF-16.{{cite web |url=https://www.unicode.org/L2/L2000-UTC/u2000-015.txt |last=Whistler |first=Ken |title=Necessary changes for ISO/IEC 10646 regarding the PUA |website=Unicode |year=2000 |id=UTC/00-015 |access-date=2021-01-30 |archive-date=2021-06-23 |archive-url=https://web.archive.org/web/20210623065232/https://www.unicode.org/L2/L2000-UTC/u2000-015.txt |url-status=live }}
Usage
=Standardization initiative uses=
Many people and institutions have created character collections for the PUA. Some of these private use agreements are published, so other PUA implementers can aim for unused or less-used code points to prevent overlaps. Several characters and scripts previously encoded in private use agreements have actually been fully encoded in Unicode, necessitating mappings from the PUA to other Unicode code points.
One of the more well-known and broadly implemented PUA agreements is maintained by the ConScript Unicode Registry (CSUR). The CSUR, which is not officially endorsed or associated with the Unicode Consortium, provides a mapping for constructed scripts, such as Klingon pIqaD and Ferengi script (Star Trek), Tengwar and Cirth (J.R.R. Tolkien's cursive and runic scripts), Alexander Melville Bell's Visible Speech, and Dr. Seuss's alphabet from On Beyond Zebra. The CSUR previously encoded the undeciphered Phaistos characters, as well as the Shavian and Deseret alphabets, which have all been accepted for official encoding in Unicode.
Another common PUA agreement is maintained by the Medieval Unicode Font Initiative (MUFI). This project is attempting to support all of the scribal abbreviations, ligatures, precomposed characters, symbols, and alternate letterforms found in medieval texts written in the Latin alphabet. The express purpose of MUFI is to experimentally determine which characters are necessary to represent these texts, and to have those characters officially encoded in Unicode. As of Unicode version 5.1, 152 MUFI characters have been incorporated into the official Unicode encoding.{{update inline|date=November 2021}}
Some agreed-upon PUA character collections exist in part or whole because the Unicode Consortium is in no hurry to encode them. Some, such as unrepresented languages, are likely to end up encoded in the future. Some unusual cases such as fictional languages are outside the usual scope of Unicode but not explicitly ruled out by the principles of Unicode, and may show up eventually (such as the Star Trek and Tolkien writing systems). In other cases, the proposed encoding violates one or more Unicode principles and hence is unlikely to ever be officially recognized by Unicode—mostly where users want to directly encode alternate forms, ligatures, or base-character-plus-diacritic combinations (such as the TUNE scheme).
{| class=wikitable
! Publishing organization !! Topic !! PUA area used !! Font
|-
| CSUR || Artificial and some ancient/medieval scripts || PUA (BMP) and Plane 15 || Code2000
|-
| MUFI || Medieval scripts || PUA (BMP) || several
|-
| SIL || Phonetics and languages || PUA (BMP) || {{nowrap|Charis SIL}}
|-
| TITUS || Ancient and medieval scripts || PUA (BMP) || TITUS Cyberbit Basic
|}
- Emoji were originally defined in unused spaces in Shift JIS mobile encodings, with different carriers supporting different emoji characters. Before emoji were added to the Unicode Standard in Unicode 6.0, Google and major Japanese phone carriers each defined their own Private Use Area mappings for emoji. The Japanese carriers defined their encoding schemes in the Basic Multilingual Plane's Private Use Area, whereas Google defined theirs in Supplementary Private Use Area-A.{{cite web |last1=Scherer |first1=Markus |last2=Davis |first2=Mark |last3=Momoi |first3=Kat |last4=Tong |first4=Darick |last5=Kida |first5=Yasuo |last6=Edberg |first6=Peter |title=Emoji Symbols: Background Data |url=https://www.unicode.org/L2/L2010/10132-emojidata.pdf |publisher=Unicode Consortium |id=L2/10-132 |access-date=24 April 2025 |date=27 April 2010}}
- GB/T 20542-2006 ("Tibetan Coded Character Set Extension A") and GB/T 22238-2008 ("Tibetan Coded Character Set Extension B") are Chinese national standards that use the PUA to encode precomposed Tibetan ligatures.
- GBK and earlier versions of GB 18030 used the PUA to provisionally encode characters not found in Unicode standards at the time of publication. In the 2022 version of the standard (GB 18030-2022), characters are instead mapped to their standard Unicode codepoints.{{cite web |last1=Lunde |first1=Ken |title=The GB 18030-2022 Standard |url=https://ken-lunde.medium.com/the-gb-18030-2022-standard-3d0ebaeb4132 |website=Medium |access-date=7 August 2022 |language=en |date=4 August 2022}}
- The Institute of the Estonian Language uses the PUA to encode Latin and Cyrillic precomposed characters{{cite web |url=http://www.eki.ee/letter/chardata.cgi?ucode=e000-f8ff |title=Letter Database |publisher=Eki.ee |access-date=2013-04-11 |archive-date=2018-05-21 |archive-url=https://web.archive.org/web/20180521182103/http://www.eki.ee/letter/chardata.cgi?ucode=e000-f8ff |url-status=live }} that have no Unicode encoding.
- The [http://freetengwar.sourceforge.net/ Free Tengwar Font Project] uses a mapping that differs from that of the ConScript Unicode Registry; it largely follows Michael Everson's 2001-03-07 Tengwar discussion paper, but diverges in some details.
- The MARC 21 standard uses the PUA to encode East Asian characters present in MARC-8{{cite web |url=https://www.loc.gov/marc/specifications/specchar.pua.html |title=Character Sets: East Asian Characters: Alternative Unicode Mappings for MARC 21 Characters Assigned to the Private Use Area (PUA): MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media |publisher=Library of Congress |date=2004-09-02 |access-date=2013-04-11 |archive-date=2013-08-19 |archive-url=https://web.archive.org/web/20130819180025/http://www.loc.gov/marc/specifications/specchar.pua.html |url-status=live }} that have no Unicode encoding.
- The SIL Corporate PUA uses the PUA to encode characters used in minority languages that have not yet been accepted into Unicode.
- The STIX Fonts project uses the PUA to provide a comprehensive font set of mathematical symbols and alphabets, many of which are also available in the SMP now, e.g. in the Mathematical Alphanumeric Symbols block.
- The SMuFL uses the PUA to encode new music notation symbols, extending the Musical Symbols Unicode block.
- The Tamil Unicode New Encoding (TUNE){{cite web |url=http://www.tunerfc.tn.nic.in |title=tunerfc.tn.nic.in |publisher=tunerfc.tn.nic.in |access-date=2013-04-11 |archive-url=https://web.archive.org/web/20100729194712/http://www.tunerfc.tn.nic.in/ |archive-date=2010-07-29 |url-status=dead }} is a proposed scheme for encoding Tamil that overcomes perceived deficiencies in the current Unicode encoding.
=Vendor use=
{{redirect|U+F8FF|the company|Apple Inc.|and|Apple logo}}
Informally, the range U+F000 through U+F8FF is known as the Corporate Use Area. This originates from early versions of Unicode, which defined an "End User Zone" extending from U+E000 upward and a "Corporate Use Zone" extending from U+F8FF downward, with the boundary between the two left undefined.
- The Adobe Glyph List used to use the PUA for some of its glyphs.{{cite web |url=https://partners.adobe.com/asn/tech/type/corporateuse.txt |title=Unicode Corporate Use Subarea as used by Adobe Systems |date=October 22, 1998 |archive-url=https://web.archive.org/web/20021009225850/http://partners.adobe.com/asn/developer/type/corporateuse.txt |archive-date=October 9, 2002 |access-date=May 12, 2021 |url-status=dead}}
- Apple lists a range of 1,280 characters in its developer documentation{{cite web |url=https://developer.apple.com/documentation/foundation/nsopenstepunicodereservedbase |title=NSOpenStepUnicodeReservedBase - Apple Developer Documentation |publisher=Apple Inc. |access-date=2020-10-16 |archive-date=2020-11-06 |archive-url=https://web.archive.org/web/20201106115702/https://developer.apple.com/documentation/foundation/nsopenstepunicodereservedbase |url-status=live }} from U+F400–U+F8FF within the PUA for Apple's use. Of those, only 311 are used, in the range U+F700–U+F8FF (NeXT (NeXTSTEP and OPENSTEP) and Apple (macOS AppKit)).{{cite web |title=CORPCHAR.TXT - Registry (external version) of Apple use of Unicode corporate-zone characters |author=Apple Computer, Inc. |date=2005 |orig-year=1994 |publisher=Unicode Consortium |version=c03 |url=https://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CORPCHAR.TXT |access-date=2020-10-16 |archive-date=2020-10-30 |archive-url=https://web.archive.org/web/20201030195128/https://unicode.org/Public/MAPPINGS/VENDORS/APPLE/CORPCHAR.TXT |url-status=live }}
- {{anchor|U+F8FF}} One of these is U+F8FF, the Apple logo, generally supported by Apple's 8-bit sets.
- WGL4 uses the PUA (U+F001 and U+F002) to encode duplicates of the ligatures {{not a typo|fi}} (U+FB01) and {{not a typo|fl}} (U+FB02).{{cite web|url=http://www.microsoft.com/typography/otspec/WGL4E.HTM|title=WGL4 Unicode Range U+2013 through U+FB02|website=Microsoft |archive-url=https://web.archive.org/web/20140717022830/http://www.microsoft.com/typography/otspec/wgl4e.htm|archive-date=2014-07-17|url-status=dead}}
- Microsoft's defunct Services For Macintosh feature used U+F001 through U+F029 as replacements for special characters allowed in HFS but forbidden in NTFS, and U+F02A for the Apple logo.{{cite web |url=https://support.microsoft.com/en-us/kb/117258 |title=SFM Converts Macintosh HFS Filenames to NTFS Unicode |date=February 24, 2014 |website=Microsoft Support |archive-url=https://web.archive.org/web/20160527200113/https://support.microsoft.com/en-us/kb/117258 |archive-date=May 27, 2016 |url-status=dead}}{{cite web |url=https://opensource.apple.com/source/ntfs/ntfs-91.50.2/util/ntfs.util.c.auto.html |title=ntfs.util.c |year=2008 |quote=Invalid NTFS filename characters are encodeded{{sic}} using the SFM (Services for Macintosh) private use Unicode characters. |access-date=2018-08-07 |archive-date=2018-08-07 |archive-url=https://web.archive.org/web/20180807190401/https://opensource.apple.com/source/ntfs/ntfs-91.50.2/util/ntfs.util.c.auto.html |url-status=dead}}
- In old versions of its RichEdit component, Microsoft mapped U+F020–U+F0FF within the PUA to symbol fonts. For any character in this range, RichEdit would show a character from a symbol font instead of the end-user-defined character (EUDC).{{cite web|website=Microsoft Knowledge Base|url=http://support.microsoft.com/kb/897872|title=The range of characters between U+F020 and U+F0FF in the Private Use Area of Unicode is mapped to symbol fonts in Richedit 4.1|archive-url=https://web.archive.org/web/20121022095705/http://support.microsoft.com/kb/897872|archive-date=2012-10-22|url-status=dead}}{{cite web |url=http://scripts.sil.org/cms/SCRIPTs/page.php?site%5Fid=nrsi&item%5Fid=PUACharsInMSSotware |title=Handling of PUA Characters in Microsoft Software |date=2003-04-25 |website=SIL International |archive-url=https://web.archive.org/web/20150511005915/http://scripts.sil.org/cms/scripts/page.php?site%5Fid=nrsi&item%5Fid=PUACharsInMSSotware |archive-date=2015-05-11 |url-status=dead |access-date=2014-03-04 }}
- {{Clarify|date=June 2020|reason=which version? 2007 uses U+2205, U+00B1, U+00B0|text=AutoCAD}} uses U+F8FC–U+F8FE for ⌀ (diameter sign), ± (plus–minus sign) and ° (degree sign) respectively.{{citation needed|date=March 2025}}
- Some fonts place the Windows logo at U+F000.{{citation needed|date=March 2025}}
- The code point U+F000 is a numeral succession starting at 13 or 18 in some video games like Agar.io.{{citation needed|date=March 2025}}
- On Ubuntu, U+E0FF is displayed as the "Circle Of Friends" logo{{Cite web|title=Comment #8 : Bug #651606 (circle-of-friends) : Bugs : Ubuntu Font Family|url=https://bugs.launchpad.net/ubuntu-font-family/+bug/651606/comments/8/+index|access-date=2020-10-17|website=Launchpad|date=5 October 2010 |language=en|archive-date=2020-10-17|archive-url=https://web.archive.org/web/20201017022723/https://bugs.launchpad.net/ubuntu-font-family/+bug/651606/comments/8/+index|url-status=live}} and U+F200 is "ubuntu" in the Ubuntu typeface with a superscripted "Circle Of Friends" (this itself is U+F0FF).{{Cite web|title=Comment #2 : Bug #853855 : Bugs : Ubuntu Font Family|url=https://bugs.launchpad.net/ubuntu-font-family/+bug/853855/comments/2/+index|access-date=2020-10-17|website=Launchpad|date=26 September 2011 |language=en|archive-date=2020-10-17|archive-url=https://web.archive.org/web/20201017213250/https://bugs.launchpad.net/ubuntu-font-family/+bug/853855/comments/2/+index|url-status=live}}
- The [https://github.com/rbanffy/3270font 3270] font includes the Debian logo at U+F100.{{citation needed|date=March 2025}}
- In the Linux Libertine font, U+E000 displays Tux, the mascot of Linux.{{citation needed|date=March 2025}}
- The Font Awesome icon font uses the PUA to display various glyphs.{{citation needed|date=March 2025}}
- Powerline, a status line plugin for Vim, uses U+E0A0–U+E0A2 and U+E0B0–U+E0B3 for extra box-drawing characters.{{cite web |last1=Li |first1=Renzhi |title=Proposal to add additional characters into the Graphics for Legacy Computing block of the UCS |website=Unicode |url=https://www.unicode.org/L2/L2019/19068r2-powerline-syms.pdf |access-date=2023-07-31 |date=2019-08-23}}{{cite web |title=Installation |url=https://powerline.readthedocs.io/en/latest/installation.html#fonts-installation |work=Powerline beta documentation |publisher=Powerline |at=Fonts installation |access-date=21 April 2025 |quote=The used application (e.g. terminal emulator) must also either be configured to use patched fonts (in some cases even support it because custom glyphs live in private use area which some applications reserve for themselves) or support fontconfig for powerline to work properly with powerline-specific glyphs.}}
- In the Fira Sans typeface used in Firefox OS, U+E003 is displayed as the Mozilla logo (the dinosaur head).{{citation needed|date=March 2025}}
- Lotus Multi-Byte Character Set (LMBCS), the encoding and character set internally used by Lotus/IBM Lotus 1-2-3, Symphony, SmartSuite, Notes, Domino as well as a number of third-party products such as Microsoft Works, uses some characters (U+F862–U+F89F and U+F8FB–U+F8FE) in the Private Use Area for symbols not defined in Unicode. Of these, U+F8FB is known to be reserved for a crown currency symbol ("Kr"), and U+F8FC and U+F8FD were later mapped to U+FB02 ({{not a typo|fl}}) and U+FB01 ({{not a typo|fi}}) respectively. Additionally, when UTF-16 codes are embedded in LMBCS, the UTF-16 codes corresponding to U+F601 through U+F6FF are substituted for UTF-16 codes which would contain null bytes, since LMBCS is designed to not contain embedded null bytes.{{cite web |title=lmb-excp.ucm |website=GitHub |publisher=Unicode, Inc. |date=2000-02-10 |url=https://github.com/unicode-org/icu/blob/master/icu4c/source/data/mappings/lmb-excp.ucm |access-date=2020-04-23 |archive-date=2022-01-25 |archive-url=https://web.archive.org/web/20220125034059/https://github.com/unicode-org/icu/blob/main/icu4c/source/data/mappings/lmb-excp.ucm |url-status=live }}{{cite book |title=Lotus 1-2-3 Version 3.1 Referenzhandbuch |language=de |trans-title=Lotus 1-2-3 Version 3.1 Reference Manual |edition=1 |chapter=Anhang 2. Der Lotus Multibyte Zeichensatz (LMBCS) |trans-chapter=Appendix 2. The Lotus Multibyte Character Set (LMBCS) |pages=A2–1 – A2–13 |date=1989 |publisher=Lotus Development Corporation |location=Cambridge, Massachusetts, US |id=302168}}
- IBM reserved several code page IDs for PUA code pages: code page 1446 for the generic plane 15, code page 1447 for the generic plane 16, code page 1448 for the generic BMP PUA, code page 1445 (IBM AFP PUA No. 1) for plane 15 with IBM allocations in U+FFF00–U+FFFFD,{{cite web |url=https://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP01445.pdf |title=CPGID 01445 (chart) |work=REGISTRY: Graphic Character Sets and Code Pages |id=C-H 3-3220-050 |date=2012 |orig-year=2011 |quote=The area shown in the chart above represents only 254 bytes of row FF in plane 0F.}}{{cite web |url=https://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP01445.txt |title=CPGID 01445: IBM AFP PUA No. 1 |work=REGISTRY: Graphic Character Sets and Code Pages |id=C-H 3-3220-050 |date=2012 |orig-year=2011 |quote=The area shown in the chart above represents only 254 bytes of row FF in plane 0F.}} and code page 1449 (IBM default PUA) for the BMP PUA with IBM allocations in U+F83D–U+F8FF.{{cite web |archive-url=https://web.archive.org/web/20150916190822/http://www-01.ibm.com/software/globalization/cp/cp01449.html |archive-date=2015-09-16 |url=http://www-01.ibm.com/software/globalization/cp/cp01449.html |url-status=dead |title=CPGID 01449: IBM default PUA |work=IBM Globalization: Code page identifiers |publisher=IBM |quotation=IBM has designated 195 positions from U+F83D to U+F8FF for use as IBM Corporate-zone and intends to use them consistently within IBM whenever there is a need to maintain the round-trip integrity of IBM characters.}}{{citation|mode=cs1 |title=unicode.nam: Allow the Unicode characters to be specified using either the IBM or PostScript like names. |author=IBM |author-link=IBM |date=1997}} (Included with {{citation|mode=cs2 |title=OS/2 Codepage and Keyboard Display Tools |last=Borgendale |first=Ken |url=http://www.borgendale.com/tools/tools.htm}})
- The file system found in Windows uses the U+F000 to U+F0FF block to escape special characters.{{citation needed|date=March 2025}}
- NetApp translates characters in filenames that are allowed on Unix but invalid for SMB clients to PUA characters.{{Cite web|url=https://docs.netapp.com/us-en/ontap/smb-admin/configure-character-mappings-file-name-translation-task.html|title=Configure character mapping for SMB file name translation on volumes|date=9 December 2021 |access-date=2022-10-14}}
- Twitter's Chirp font provides some additional icons, like U+E000 which corresponds to a left down arrow, U+EA00 which corresponds to the Twitter bird, and U+F8FF which corresponds to an Apple logo, possibly for compatibility with Apple fonts.{{Cite web|url=https://c.r74n.com/twitter/chirp|title=Twitter Chirp Font|website=Copy Paste Dump|access-date=2022-02-08}}
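Because several of the assignments listed above overlap, software that converts private-use characters to standard code points needs to know which agreement a text follows. The following Python sketch illustrates this with per-agreement translation tables built only from assignments stated in this list (including the AutoCAD entry, which is marked as needing a citation). The table names and the <code>to_standard</code> helper are hypothetical, and the diameter sign is assumed to be U+2300.

```python
# Per-agreement tables mapping private-use code points to standard ones.
# All names are illustrative; the assignments are taken from the list above.
WGL4_TABLE = {0xF001: 0xFB01, 0xF002: 0xFB02}       # duplicates of fi, fl
LMBCS_TABLE = {0xF8FC: 0xFB02, 0xF8FD: 0xFB01}      # fl, fi
AUTOCAD_TABLE = {
    0xF8FC: 0x2300,  # diameter sign (assumed to be U+2300)
    0xF8FD: 0x00B1,  # plus-minus sign
    0xF8FE: 0x00B0,  # degree sign
}


def to_standard(text: str, table: dict) -> str:
    """Replace private-use characters using one agreement's table.

    Code points missing from the table are left unchanged.
    """
    return text.translate(table)


# The same private-use code point resolves differently per agreement.
sample = "\uf8fc"
as_lmbcs = to_standard(sample, LMBCS_TABLE)      # "\ufb02" (fl ligature)
as_autocad = to_standard(sample, AUTOCAD_TABLE)  # "\u2300" (diameter sign)
```

Keeping one table per agreement, rather than a single merged table, avoids silently choosing one interpretation where two agreements assign the same code point.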
Private-use characters in other character sets
The concept of reserving specific code points for private use is based on similar earlier usage in other character sets. In particular, many otherwise obsolete characters in East Asian scripts continue to be used in specific names or other situations, and so some character sets for those scripts made allowance for private-use characters (such as the user-defined planes of CNS 11643, or gaiji in certain Japanese encodings). The Unicode standard references these uses under the name "End User Character Definition" (EUCD).
Additionally, the C1 control block contains two codes intended for private use "control functions" by ECMA-48: 0x91 {{smallcaps|private use one}} (PU1) and 0x92 {{smallcaps|private use two}} (PU2).{{cite web|url=https://www.ecma-international.org/wp-content/uploads/ECMA-48_5th_edition_june_1991.pdf|title=Standard ECMA-48, Fifth Edition - June 1991|at=§8.2.14 Miscellaneous control functions, §8.3.100, §8.3.101}}{{Cite iso-ir |number=77 |title=C1 Control Character Set of ISO 6429 |sponsor=((ISO/TC97/SC2)) |sponsor-link=ISO/IEC JTC 1/SC 2#History |date=1983-10-01 |access-date=2022-03-03}} Unicode includes these at {{unichar|0091|PRIVATE USE ONE}} and {{unichar|0092|PRIVATE USE TWO}} but defines them as control characters (category Cc), not private-use characters (category Co).{{cite book |chapter-url=https://www.unicode.org/versions/Unicode14.0.0/ch04.pdf |chapter=Chapter 4 Character Properties |title=The Unicode Standard Version 14.0 - Core Specification |isbn=978-1-936213-29-0 |date=September 2021 |publisher=Unicode Consortium |at=Table 4-4}}
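The distinction between the two categories can be observed directly. This is a small sketch using Python's standard <code>unicodedata</code> module; the constant names <code>PU1</code> and <code>PU2</code> and the helper <code>general_category</code> are hypothetical labels for this example.

```python
import unicodedata

# C1 control codes named "private use" by ECMA-48 (labels are illustrative).
PU1 = 0x0091
PU2 = 0x0092


def general_category(cp: int) -> str:
    """Return the two-letter Unicode general category of a code point."""
    return unicodedata.category(chr(cp))


# Despite their names, PU1 and PU2 are control characters (Cc), whereas a
# code point in the BMP Private Use Area such as U+E000 is private-use (Co).
c1_categories = [general_category(PU1), general_category(PU2)]
pua_category = general_category(0xE000)
```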
Encodings that do not have private use areas but have more or less unused areas, such as ISO/IEC 8859 and Shift JIS, have seen uncontrolled variants of these encodings evolve.{{Cite web |url=http://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT |title=Map (external version) from Mac OS Japanese encoding to Unicode 2.1 and later. |publisher=Unicode Consortium |access-date=2021-10-08 |archive-date=2021-08-31 |archive-url=https://web.archive.org/web/20210831135118/http://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT |url-status=live }} For Unicode, software companies can use the Private Use Areas for their desired additions.
Notes
{{Reflist|group=note}}
References
{{Reflist}}
{{Unicode navigation}}
{{DEFAULTSORT:Private Use (Unicode)}}
Category:Articles with unsupported Private Use Area characters