HTML 2.0 - Escaped entities


The Internet standards document RFC 1866, adopted as an elective standard by the IETF, defines Hypertext Markup Language version 2.0. Among many other things, that document defines what is to be displayed when HTML contains an "escaped entity". For example, RFC 1866 says that when HTML contains the six-byte sequence &#165; a yen sign is to be displayed. (In addition to the numeric codes, which correspond to ISO 8859-1, RFC 1866 specifies alphabetic mnemonics.)
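
As an illustration of the numeric form, here is a minimal Python sketch that turns decimal references into the ISO 8859-1 character with the same code. The function name decode_numeric_refs is made up for this example, and the sketch assumes only decimal references with codes 0 through 255 matter; it is not how any particular browser does it.

```python
import re

def decode_numeric_refs(text: str) -> str:
    """Replace decimal references such as &#165; with their ISO 8859-1 character."""
    def replace(match):
        code = int(match.group(1))
        # ISO 8859-1 only covers codes 0-255; leave anything larger untouched.
        if code > 255:
            return match.group(0)
        return bytes([code]).decode("latin-1")

    return re.sub(r"&#(\d+);", replace, text)

# The six-byte sequence from the text above becomes a single yen sign.
print(decode_numeric_refs("Price: &#165;100"))
```

Python's standard html.unescape does a similar job, but it follows the much later HTML5 rules rather than RFC 1866, so the hand-written version keeps the mapping explicit.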

A document must use escaped entities for Less than (<), Greater than (>), and Ampersand (&). In addition, within HREF or IMG elements, a document must use escaped entities for equal sign (=), quote ("), and space ( ). A document need not use escaped entities for any other characters coded 126 or below. A document need not use escaped entities for characters coded 128 through 255, provided that the codes are in conformance with ISO 8859-1 ("Latin-1"); however, use of the escaped entities is recommended in this case.
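
The rules in the paragraph above can be sketched as a small Python function. The name escape_html2 and the in_attribute flag are hypothetical, chosen for this example. The sketch takes the stricter, recommended route of escaping codes 128 through 255 as well, and it passes characters above 255 through unchanged because they fall outside ISO 8859-1.

```python
def escape_html2(text: str, in_attribute: bool = False) -> str:
    """Escape text with numeric references, following the rules described above."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "<>&":
            # Always required: less than, greater than, ampersand.
            out.append(f"&#{code};")
        elif in_attribute and ch in '=" ':
            # Additionally required inside attribute values such as HREF.
            out.append(f"&#{code};")
        elif 128 <= code <= 255:
            # Optional but recommended for the upper half of Latin-1.
            out.append(f"&#{code};")
        else:
            out.append(ch)
    return "".join(out)

print(escape_html2("a < b & c"))
print(escape_html2("x=1 y", in_attribute=True))
```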

For further information, see ISO Latin 1 Character Entity Set for HTML 3.2 at W3O.

The following table is taken from RFC 1866. Each HTML escaped entity in the left column is followed by a display, produced by your browser, of that entity in parentheses in the middle column. If the glyph is not accurately described by the name in the right column, your browser fails HTML 2.0 conformance.

REFERENCE DISP  DESCRIPTION
--------- ----  -----------
 &#32;    ( )   Space
 &#33;    (!)   Exclamation mark
 &#34;    (")   Quotation mark
 &#35;    (#)   Number sign
 &#36;    ($)   Dollar sign
 &#37;    (%)   Percent sign
 &#38;    (&)   Ampersand
 &#39;    (')   Apostrophe
 &#40;    (()   Left parenthesis
 &#41;    ())   Right parenthesis
 &#42;    (*)   Asterisk
 &#43;    (+)   Plus sign
 &#44;    (,)   Comma
 &#45;    (-)   Hyphen
 &#46;    (.)   Period (fullstop)
 &#47;    (/)   Solidus (slash)
 &#48; (0) - &#57; (9)   Digits 0-9
 &#58;    (:)   Colon
 &#59;    (;)   Semi-colon
 &#60;    (<)   Less than
 &#61;    (=)   Equals sign
 &#62;    (>)   Greater than
 &#63;    (?)   Question mark
 &#64;    (@)   Commercial at
 &#65; (A) - &#90; (Z)   Letters A-Z
 &#91;    ([)   Left square bracket
 &#92;    (\)   Reverse solidus (backslash)
 &#93;    (])   Right square bracket
 &#94;    (^)   Caret
 &#95;    (_)   Horizontal bar (underscore)
 &#96;    (`)   Grave accent
 &#97; (a) - &#122; (z)  Letters a-z
&#123;    ({)   Left curly brace
&#124;    (|)   Vertical bar
&#125;    (})   Right curly brace
&#126;    (~)   Tilde
&#160;    ( )   Non-breaking Space
&#161;    (¡)   Inverted exclamation
&#162;    (¢)   Cent sign
&#163;    (£)   Pound sterling
&#164;    (¤)   General currency sign
&#165;    (¥)   Yen sign
&#166;    (¦)   Broken vertical bar
&#167;    (§)   Section sign
&#168;    (¨)   Umlaut (dieresis)
&#169;    (©)   Copyright
&#170;    (ª)   Feminine ordinal
&#171;    («)   Left angle quote, guillemotleft
&#172;    (¬)   Not sign
&#173;    (­)   Soft hyphen
&#174;    (®)   Registered trademark
&#175;    (¯)   Macron accent
&#176;    (°)   Degree sign
&#177;    (±)   Plus or minus
&#178;    (²)   Superscript two
&#179;    (³)   Superscript three
&#180;    (´)   Acute accent
&#181;    (µ)   Micro sign
&#182;    (¶)   Paragraph sign
&#183;    (·)   Middle dot
&#184;    (¸)   Cedilla
&#185;    (¹)   Superscript one
&#186;    (º)   Masculine ordinal
&#187;    (»)   Right angle quote, guillemotright
&#188;    (¼)   Fraction one-fourth
&#189;    (½)   Fraction one-half
&#190;    (¾)   Fraction three-fourths
&#191;    (¿)   Inverted question mark
&#192;    (À)   Capital A, grave accent
&#193;    (Á)   Capital A, acute accent
&#194;    (Â)   Capital A, circumflex accent
&#195;    (Ã)   Capital A, tilde
&#196;    (Ä)   Capital A, dieresis or umlaut mark
&#197;    (Å)   Capital A, ring
&#198;    (Æ)   Capital AE diphthong (ligature)
&#199;    (Ç)   Capital C, cedilla
&#200;    (È)   Capital E, grave accent
&#201;    (É)   Capital E, acute accent
&#202;    (Ê)   Capital E, circumflex accent
&#203;    (Ë)   Capital E, dieresis or umlaut mark
&#204;    (Ì)   Capital I, grave accent
&#205;    (Í)   Capital I, acute accent
&#206;    (Î)   Capital I, circumflex accent
&#207;    (Ï)   Capital I, dieresis or umlaut mark
&#208;    (Ð)   Capital Eth, Icelandic
&#209;    (Ñ)   Capital N, tilde
&#210;    (Ò)   Capital O, grave accent
&#211;    (Ó)   Capital O, acute accent
&#212;    (Ô)   Capital O, circumflex accent
&#213;    (Õ)   Capital O, tilde
&#214;    (Ö)   Capital O, dieresis or umlaut mark
&#215;    (×)   Multiply sign
&#216;    (Ø)   Capital O, slash
&#217;    (Ù)   Capital U, grave accent
&#218;    (Ú)   Capital U, acute accent
&#219;    (Û)   Capital U, circumflex accent
&#220;    (Ü)   Capital U, dieresis or umlaut mark
&#221;    (Ý)   Capital Y, acute accent
&#222;    (Þ)   Capital THORN, Icelandic
&#223;    (ß)   Small sharp s, German (sz ligature)
&#224;    (à)   Small a, grave accent
&#225;    (á)   Small a, acute accent
&#226;    (â)   Small a, circumflex accent
&#227;    (ã)   Small a, tilde
&#228;    (ä)   Small a, dieresis or umlaut mark
&#229;    (å)   Small a, ring
&#230;    (æ)   Small ae diphthong (ligature)
&#231;    (ç)   Small c, cedilla
&#232;    (è)   Small e, grave accent
&#233;    (é)   Small e, acute accent
&#234;    (ê)   Small e, circumflex accent
&#235;    (ë)   Small e, dieresis or umlaut mark
&#236;    (ì)   Small i, grave accent
&#237;    (í)   Small i, acute accent
&#238;    (î)   Small i, circumflex accent
&#239;    (ï)   Small i, dieresis or umlaut mark
&#240;    (ð)   Small eth, Icelandic
&#241;    (ñ)   Small n, tilde
&#242;    (ò)   Small o, grave accent
&#243;    (ó)   Small o, acute accent
&#244;    (ô)   Small o, circumflex accent
&#245;    (õ)   Small o, tilde
&#246;    (ö)   Small o, dieresis or umlaut mark
&#247;    (÷)   Division sign
&#248;    (ø)   Small o, slash
&#249;    (ù)   Small u, grave accent
&#250;    (ú)   Small u, acute accent
&#251;    (û)   Small u, circumflex accent
&#252;    (ü)   Small u, dieresis or umlaut mark
&#253;    (ý)   Small y, acute accent
&#254;    (þ)   Small thorn, Icelandic
&#255;    (ÿ)   Small y, dieresis or umlaut mark
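
A test page in the style of this table could be generated with a short script. The following Python sketch is one possible way to do it: the function name reference_rows and the column width are assumptions made for this example, and the descriptions from RFC 1866 are left out. Each row shows the reference literally (by escaping its ampersand as &#38;) and then lets the browser render it in parentheses.

```python
from typing import List

def reference_rows(first: int = 160, last: int = 255) -> List[str]:
    """Build HTML source rows: the literal reference, then the rendered glyph."""
    rows = []
    for code in range(first, last + 1):
        ref = f"&#{code};"
        # Escaping the ampersand makes the browser show the reference itself
        # in the first column instead of the character it stands for.
        literal = ref.replace("&", "&#38;")
        rows.append(f"{literal:<13} ({ref})")
    return rows

# Print the upper half of Latin-1, ready to be wrapped in a PRE element.
for row in reference_rows():
    print(row)
```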

Many versions of Netscape for the Mac, including Navigator 3.0.2 and Communicator 4.0.1, fail to properly display about a dozen codes. Communicator 4.0.1 is more divergent from the standard than Navigator 3.0.2.

Older applications on the Mac will have problems with the eight ISO Latin-1 characters that are not fully implemented by Mac OS:

&#175;   (¯)   Macron accent
&#178;   (²)   Superscript two
&#179;   (³)   Superscript three
&#185;   (¹)   Superscript one
&#188;   (¼)   Fraction one-fourth
&#189;   (½)   Fraction one-half
&#190;   (¾)   Fraction three-fourths
&#215;   (×)   Multiply sign

However, there is little excuse for the other nonconformant behavior - Adobe PageMill 2.0 on the Mac gets everything right, except those eight.

Charles
1997-07-13(a)