U+2014 Em Dash
U+2014 was added in Unicode version 1.1 in 1993. It belongs to the block General Punctuation.
This character is a Dash Punctuation and has the script value Common, that is, it is not tied to any specific script.
The glyph is not a composition. Its width in East Asian texts is determined by its context. It can be displayed wide or narrow. In bidirectional text it acts as Other Neutral. When changing direction it is not mirrored. It will not end a sentence. U+2014 offers a line break opportunity at its position. The glyph can be confused with one other glyph.
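Several of the properties described above can be checked with Python's standard `unicodedata` module. This is a minimal sketch, not a full property dump: the helper name `describe` is made up for illustration, and the values returned depend on the Unicode database bundled with the interpreter.

```python
import unicodedata

EM_DASH = "\u2014"


def describe(ch: str) -> dict:
    """Collect a few Unicode properties of a single character.

    `describe` is a hypothetical helper name; every call below is part
    of the standard `unicodedata` module.
    """
    return {
        "name": unicodedata.name(ch),
        # Two-letter general category; "Pd" stands for Dash Punctuation.
        "category": unicodedata.category(ch),
        # Bidirectional class; "ON" stands for Other Neutral.
        "bidi_class": unicodedata.bidirectional(ch),
        # mirrored() returns 0 or 1, so convert it to a bool.
        "mirrored": bool(unicodedata.mirrored(ch)),
        # East Asian width; "A" stands for ambiguous (wide or narrow by context).
        "east_asian_width": unicodedata.east_asian_width(ch),
        # An empty string means the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
    }


if __name__ == "__main__":
    for key, value in describe(EM_DASH).items():
        print(f"{key}: {value!r}")
```

Properties such as the line break class or the sentence break class are not exposed by `unicodedata`, so they are left out of this sketch.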
The CLDR project calls this character “em dash” for use in screen reading software. It assigns these additional labels, e.g. for search in emoji pickers: dash, em.
Wikipedia has the following information about this codepoint:
The dash is a punctuation mark consisting of a long horizontal line. It is similar in appearance to the hyphen but is longer and sometimes higher from the baseline. The most common versions are the en dash –, generally longer than the hyphen but shorter than the minus sign; the em dash —, longer than either the en dash or the minus sign; and the horizontal bar ―, whose length varies across typefaces but tends to be between those of the en and em dashes.
Typical uses of dashes are to mark a break in a sentence, or to set off an explanatory remark (similar to parenthesis), or to show spans of time or ranges of values.
The em dash is sometimes used as a leading character to identify the source of a quoted text.
Representations
System | Representation |
---|---|
Nº | 8212 |
UTF-8 | E2 80 94 |
UTF-16 | 20 14 |
UTF-32 | 00 00 20 14 |
URL-Quoted | %E2%80%94 |
HTML hex reference | &#x2014; |
Wrong windows-1252 Mojibake | â€” |
HTML named entity | &mdash; |
Encoding: MACINTOSH (hex bytes) | D1 |
Encoding: WINDOWS-1250 (hex bytes) | 97 |
Encoding: WINDOWS-1251 (hex bytes) | 97 |
Encoding: WINDOWS-1252 (hex bytes) | 97 |
Encoding: WINDOWS-1253 (hex bytes) | 97 |
Encoding: WINDOWS-1254 (hex bytes) | 97 |
Encoding: WINDOWS-1255 (hex bytes) | 97 |
Encoding: WINDOWS-1256 (hex bytes) | 97 |
Encoding: WINDOWS-1257 (hex bytes) | 97 |
Encoding: WINDOWS-1258 (hex bytes) | 97 |
Encoding: WINDOWS-874 (hex bytes) | 97 |
Encoding: X-MAC-CYRILLIC (hex bytes) | D1 |
LATEX | \textemdash |
AGL: Latin-1 | emdash |
AGL: Latin-2 | emdash |
AGL: Latin-3 | emdash |
AGL: Latin-4 | emdash |
AGL: Latin-5 | emdash |
Adobe Glyph List | emdash |
digraph | -M |
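Most rows of the table above can be reproduced with the Python standard library. The following is a rough sketch under a few assumptions: the function names `hex_bytes` and `representations` are invented for this example, and the codec names `cp1252` and `mac_roman` are Python's aliases for WINDOWS-1252 and MACINTOSH.

```python
import html
from urllib.parse import quote

EM_DASH = "\u2014"


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def representations(ch: str) -> dict:
    """Compute several encodings of one character (hypothetical helper)."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    return {
        "decimal": cp,
        "utf8": hex_bytes(utf8),
        # The "-be" codecs emit big-endian bytes without a byte order mark.
        "utf16": hex_bytes(ch.encode("utf-16-be")),
        "utf32": hex_bytes(ch.encode("utf-32-be")),
        "url_quoted": quote(ch),
        "html_hex": f"&#x{cp:X};",
        "cp1252": hex_bytes(ch.encode("cp1252")),
        "mac_roman": hex_bytes(ch.encode("mac_roman")),
        # Mojibake: UTF-8 bytes wrongly decoded as windows-1252.
        "mojibake": utf8.decode("cp1252"),
    }


if __name__ == "__main__":
    for system, value in representations(EM_DASH).items():
        print(f"{system}: {value}")
    # Both HTML references decode back to the same character.
    print(html.unescape("&mdash;") == EM_DASH)
    print(html.unescape("&#x2014;") == EM_DASH)
```

The mojibake entry shows why the wrong decoding yields three characters: each of the three UTF-8 bytes is read as a separate windows-1252 character.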
Complete Record
Property | Value |
---|---|
Age | 1.1 (1993) |
Unicode Name | EM DASH |
Block | General Punctuation |
General Category | Dash Punctuation |
Script | Common |
Bidirectional Category | Other Neutral |
Combining Class | Not Reordered |
Decomposition Type | none |
Bidi Mirrored | ✘ |
Dash | ✔ |
Grapheme Base | ✔ |
Pattern Syntax | ✔ |
Indic Positional Category | NA |
Indic Syllabic Category | Consonant_Placeholder |
NFC Quick Check | Yes |
NFD Quick Check | Yes |
NFKC Quick Check | Yes |
NFKD Quick Check | Yes |
Sentence Break | Sentence Continue |
Word Break | Other |
East Asian Width | ambiguous |
Hangul Syllable Type | Not Applicable |
Joining Group | No_Joining_Group |
Joining Type | Non Joining |
Line Break | Break Opportunity Before and After |
Numeric Type | none |
Numeric Value | not a number |
Vertical Orientation | R |