XML Character Entities

Working Draft 0.3, 13 June 2002

Document identifier:

wd-docbook-xmlcharent-0.3 (XML, HTML, PDF)

Location:

http://www.oasis-open.org/docbook/specs

Editor:

Norman Walsh, Sun Microsystems, Inc. <Norman.Walsh@Sun.COM>

Abstract:

This Standard defines XML encodings of the 19 standard character entity sets defined in Non-normative Annex D of [SGML].

Status:

This is a working draft constructed by the editor. It is not an official committee work product and may not reflect the consensus opinion of the committee.

Please send comments on this specification to the <docbook@lists.oasis-open.org> list. To subscribe, send an email message to <docbook-request@lists.oasis-open.org> with the word "subscribe" as the body of the message.


Table of Contents

1. XML Character Entity Sets
1.1. Multi-Character Replacements
1.2. Duplicate Entities
1.3. Entities with no Mapping
1.4. Entities with Substituted Mappings
2. XML Character Elements

Appendixes

A. Added Latin 1
B. Added Latin 2
C. Greek Letters
D. Monotoniko Greek
E. Russian Cyrillic
F. Non-Russian Cyrillic
G. Numeric and Special Graphic
H. Diacritical Marks
I. Publishing
J. Box and Line Drawing
K. General Technical
L. Greek Symbols
M. Alternative Greek Symbols
N. Added Math Symbols: Ordinary
O. Added Math Symbols: Binary Operators
P. Added Math Symbols: Relations
Q. Added Math Symbols: Negated Relations
R. Added Math Symbols: Arrow Relations
S. Added Math Symbols: Delimiters
T. Unicode Glyphs
U. OASIS DocBook Technical Committee (Non-Normative)
V. Notices
W. Intellectual Property Rights
X. Revision History
References

This Standard defines XML encodings of the standard SGML character entity sets.

Non-normative Annex D of [SGML] defines 19 standard SGML character entity sets: Added Latin 1, Added Latin 2, Greek Letters, Monotoniko Greek, Russian Cyrillic, Non-Russian Cyrillic, Numeric and Special Graphic, Diacritical Marks, Publishing, Box and Line Drawing, General Technical, Greek Symbols, Alternative Greek Symbols, Added Math Symbols: Ordinary, Added Math Symbols: Binary Operators, Added Math Symbols: Relations, Added Math Symbols: Negated Relations, Added Math Symbols: Arrow Relations, and Added Math Symbols: Delimiters. The SGML declarations for these entities use the specific character data (SDATA) entity type, which is not supported in XML, so alternative XML declarations are necessary.

In XML, the specific character data of most entities can be expressed as a [Unicode] character.
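As an illustration only, the following Python sketch shows how an XML declaration might be produced from an entity name and its Unicode code point. The helper name `declare` and the exact formatting of the declaration are assumptions made for this example; they are not taken from the published entity files.

```python
import xml.etree.ElementTree as ET


def declare(name: str, *code_points: int) -> str:
    """Build an XML general entity declaration (hypothetical helper).

    The replacement text is written as hexadecimal numeric character
    references, one per code point.
    """
    refs = "".join(f"&#x{cp:04X};" for cp in code_points)
    return f'<!ENTITY {name} "{refs}">'


# eacute maps to U+00E9 in the Added Latin 1 set.
decl = declare("eacute", 0x00E9)

# Round trip: declare the entity in an internal subset and reference it.
doc = ET.fromstring(f"<!DOCTYPE doc [{decl}]><doc>&eacute;</doc>")
expanded = doc.text
```

Parsing the small test document confirms that the reference expands to the single character U+00E9.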

1. XML Character Entity Sets

The character entity sets defined by this Standard are summarized in Appendix A through Appendix S.

These entities must be declared before they can be used in a document. Entities can be declared in the external subset or the internal subset, as described in [XML]. An example document, with the declaration in the internal subset, is shown in Example 1.

Example 1. Declaring and Using the ISO Latin 1 Character Entity Set

<!DOCTYPE doc [
<!ENTITY % iso-lat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML"
                    "http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-lat1.ent">
%iso-lat1;
]>
<doc>
<p>This document declares the ISO Latin 1 Character Entity Set, providing
access to the ISO Latin 1 entities, such as "&eacute;" and "&ccedil;".</p>
</doc>

Note

Non-validating XML parsers may choose not to process externally declared entities. This Standard does not alter the semantics of XML processors. If a processor does not see the declaration for an entity, it will not be able to report the correct replacement text for that entity.
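The effect of a missing declaration can be observed with almost any parser. The sketch below uses Python's standard `xml.etree.ElementTree` module and, to stay self-contained, declares a single entity directly in the internal subset instead of loading an external entity set. The document strings are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal subset, so the parser sees it.
DECLARED = '<!DOCTYPE doc [<!ENTITY eacute "&#xE9;">]><doc>caf&eacute;</doc>'

# The same content with no declaration at all.
UNDECLARED = "<doc>caf&eacute;</doc>"

declared_text = ET.fromstring(DECLARED).text

try:
    ET.fromstring(UNDECLARED)
    undeclared_ok = True
except ET.ParseError:
    # The parser cannot supply replacement text for an unknown entity.
    undeclared_ok = False
```

With this parser the undeclared reference is a fatal error. Other processors may behave differently, for example by reporting a skipped entity.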

1.1. Multi-Character Replacements

The replacement text of some entities consists of more than one Unicode character. Some characters are composed with the "combining reverse solidus overlay" (U+20E5) and some are composed with a variation selector (U+FE00, U+FE01, …).
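A short Python sketch of what such replacement text looks like to an application follows. The two base characters chosen here are hypothetical examples, not mappings taken from the entity sets.

```python
import unicodedata

OVERLAY = "\u20E5"  # combining reverse solidus overlay
VS1 = "\uFE00"      # first variation selector


def describe(replacement: str) -> list:
    """List (code point, Unicode name) for each character in the text."""
    return [(f"{ord(ch):04X}", unicodedata.name(ch)) for ch in replacement]


# Hypothetical replacement texts: a base character followed by a modifier.
negated = "=" + OVERLAY
variant = "\u2268" + VS1

negated_parts = describe(negated)
variant_parts = describe(variant)
```

Applications that assume one entity reference yields exactly one character (for example when counting characters or truncating strings) need to allow for these two-character sequences.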

1.2. Duplicate Entities

For historical reasons, the inodot entity is defined in both iso-lat2.ent and iso-amso.ent. If both entity sets are included, some parsers will warn that the entity is being redefined. The warning can be ignored.

1.3. Entities with no Mapping

A small number of entities have no [Unicode] representation. These entities are all mapped to the Unicode "replacement character", U+FFFD.

Entity Name  Entity Set     Description
fjlig        iso-pub.ent    Small fj ligature
gnap         iso-amsn.ent   Greater, not approximate
jnodot       iso-amso.ent   Small j, no dot
lnap         iso-amsn.ent   Less, not approximate
lpargt       iso-amsc.ent   Greater than, left arc
nsmid        iso-amsn.ent   Negated short mid
prnE         iso-amsn.ent   Precedes, not double equals
rpargt       iso-amsc.ent   Right paren, greater than
scnE         iso-amsn.ent   Succeeds, not double equals
smid         iso-amsr.ent   shortmid r
vsubnE       iso-amsn.ent   Subset not double equals, variant

Users needing these characters will have to rely on the private use area or other non-portable mechanisms to access them.
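Because all of these entities expand to the same character, an application can at least detect where information was lost. The following Python sketch declares the unmapped entities in an internal subset and reports the positions of the replacement character. The function name `lost_positions` and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

REPLACEMENT = "\uFFFD"

# Entity names listed in the table above.
UNMAPPED = ("fjlig", "gnap", "jnodot", "lnap", "lpargt", "nsmid",
            "prnE", "rpargt", "scnE", "smid", "vsubnE")


def lost_positions(text: str) -> list:
    """Return the indexes at which the replacement character occurs."""
    return [i for i, ch in enumerate(text) if ch == REPLACEMENT]


# Declare every unmapped entity as U+FFFD, then parse a sample document.
subset = "".join(f'<!ENTITY {name} "&#xFFFD;">' for name in UNMAPPED)
doc = ET.fromstring(f"<!DOCTYPE doc [{subset}]><doc>a&fjlig;b&smid;</doc>")
positions = lost_positions(doc.text)
```

Note that the check cannot tell which entity produced a given replacement character; that information is gone once the reference has been expanded.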

1.4. Entities with Substituted Mappings

A few more entities have no specific [Unicode] representation, but a reasonable substitution has been used for each:

Entity Name  Entity Set     Substitution  Description
bepsi        iso-amsr.ent   220D          Back epsilon: such that
ges          iso-amsr.ent   2265          Greater-or-equal, slanted
gvnE         iso-amsn.ent   2269          Gt, vert, not double equals
iff          iso-tech.ent   21D4          If and only if
les          iso-amsr.ent   2264          Less-than-or-equal, slanted
lozf         iso-pub.ent    2726          Lozenge, filled
lvnE         iso-amsn.ent   2268          Less, vert, not double equals
nge          iso-amsn.ent   2271          Neither greater-than nor equal to
nle          iso-amsn.ent   2270          Not less-than-or-equal
npre         iso-amsn.ent   22E0          Not precedes, equals
nsce         iso-amsn.ent   22E1          Not succeeds, equals
nspar        iso-amsn.ent   2226          Not short parallel
pre          iso-amsr.ent   227C          Precedes, equals
spar         iso-amsr.ent   2225          Short parallel
ssetmn       iso-amsb.ent   2216          Small set minus (reverse solidus)
star         iso-pub.ent    22C6          Star operator
starf        iso-pub.ent    2605          Black star
thkap        iso-amsr.ent   2248          Thick approximate
thksim       iso-amsr.ent   223C          Thick similar
vsubne       iso-amsn.ent   228A          Subset, not equals, variant
vsupnE       iso-amsn.ent   228B          Superset not double equals, variant
vsupne       iso-amsn.ent   228B          Superset, not equals, variant
xhArr        iso-amsa.ent   2194          Long left and right double arr
xharr        iso-amsa.ent   2194          Long left and right arr
xlArr        iso-amsa.ent   21D0          Long left double arrow
xrArr        iso-amsa.ent   21D2          Long right double arr
ssmile       iso-amsr.ent   2323          Small smile
sfrown       iso-amsr.ent   2322          Small frown

Users needing alternate glyphs for these characters will have to redefine the entities to use the private use area, or rely on other non-portable mechanisms.
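One way to redefine an entity relies on the [XML] rule that the first declaration of an entity is binding: a declaration placed ahead of the standard one takes precedence. The Python sketch below illustrates this with both declarations written inline. The private use code point U+E000 is an arbitrary assumption; a real deployment would pick a code point supported by its own fonts.

```python
import xml.etree.ElementTree as ET

# Assumption: U+E000 is where a local font places the slanted glyph.
OVERRIDE = '<!ENTITY ges "&#xE000;">'

# The substitution listed in the table above.
STANDARD = '<!ENTITY ges "&#x2265;">'

# The override comes first, so it is the binding declaration.
doc = ET.fromstring(f"<!DOCTYPE doc [{OVERRIDE}{STANDARD}]><doc>&ges;</doc>")
overridden = doc.text

# Without the override, the standard substitution is used.
plain = ET.fromstring(f"<!DOCTYPE doc [{STANDARD}]><doc>&ges;</doc>").text
```

Documents that depend on such an override are not portable, since the meaning of a private use code point is known only to the parties that agreed on it.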

2. XML Character Elements

Named XML entities (except for the five predefined entities) cannot be used if they are not declared. Entity declaration requires either an external or an internal subset. Some classes of applications forbid the occurrence of markup declarations in documents. For these documents, named character entities are inaccessible.

In this section, we introduce an XML vocabulary with the semantics of character entity references. This Standard defines the semantics of elements and attributes declared in the "http://www.oasis-open.org/docbook/xmlcharent/names" namespace.

This namespace contains exactly one element, char. The char element has two attributes, entity and name. They are mutually exclusive.

The entity attribute identifies characters by their character entity names. (The set of valid names is the closed set of names associated with character entity sets defined by this Standard.) Case is significant in entity names.

The name attribute identifies characters by their Unicode character names. (The set of valid names is the set of character names published in the [Unicode] specification, or any later version of that specification.) Case is insignificant in character names.
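The different case rules for the two attributes can be sketched in Python as follows. The two-entry table stands in for the full set of entity names, and the function names are assumptions made for this example.

```python
import unicodedata

# Illustrative subset of the Added Latin 1 set; keys are case-sensitive.
ENTITY_NAMES = {"eacute": "\u00E9", "Eacute": "\u00C9"}


def by_entity(name: str) -> str:
    """Resolve an entity name exactly as written."""
    return ENTITY_NAMES[name]


def by_unicode_name(name: str) -> str:
    """Resolve a Unicode character name, ignoring case."""
    # Unicode character names are uppercase, so fold before the lookup.
    return unicodedata.lookup(name.upper())


lower = by_entity("eacute")
upper = by_entity("Eacute")
sign_a = by_unicode_name("COPYRIGHT SIGN")
sign_b = by_unicode_name("copyright sign")
```

Here "eacute" and "Eacute" name two different characters, while both spellings of the Unicode name resolve to the same one.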

The [RELAX NG] definition of this namespace is shown in Figure 1.

Figure 1. The RELAX NG Definition of the http://www.oasis-open.org/docbook/xmlcharent/names Namespace

<?xml version="1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/0.9"
         ns="http://www.oasis-open.org/docbook/xmlcharent/names">

<start>
  <element name="char">
    <choice>
      <attribute name="entity">
        <ref name="EntityNames"/>
      </attribute>
      <attribute name="name">
        <ref name="UnicodeNames"/>
      </attribute>
    </choice>
  </element>
</start>

<define name="EntityNames">
  <!-- Logically, this is the list of ISO 9573 Character Entity Names -->
  <!-- For now, just text. -->
  <text/>
</define>

<define name="UnicodeNames">
  <!-- Logically, this is the list of Unicode Character Names -->
  <!-- For now, just text. -->
  <text/>
</define>

</grammar>

Example 2 shows a sample document using this mechanism.

Example 2. Using the Character Names Element

<doc xmlns:e="http://www.oasis-open.org/docbook/xmlcharent/names">
<p>This document uses the character names element to access
character entities, such as "<e:char entity="eacute"/>" and
"<e:char name="COPYRIGHT SIGN"/>".</p>
</doc>

The character names element is limited to contexts where elements may occur. In particular, elements may not occur in XML attribute values. Note, however, that internationalization requirements such as bidirectional language support and Ruby already require structure in arbitrary contexts. It is probably an error to use attributes for human-readable content.
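A processor for this vocabulary might replace each char element with the character it names. The Python sketch below is one possible approach and not a normative algorithm. The one-entry entity table and the function names `resolve` and `expand` are assumptions made for the example.

```python
import unicodedata
import xml.etree.ElementTree as ET

NS = "http://www.oasis-open.org/docbook/xmlcharent/names"
CHAR = f"{{{NS}}}char"

# Illustrative subset; a real processor would load every entity set.
ENTITIES = {"eacute": "\u00E9"}


def resolve(elem) -> str:
    """Return the character identified by a char element."""
    if "entity" in elem.attrib:
        return ENTITIES[elem.attrib["entity"]]
    return unicodedata.lookup(elem.attrib["name"].upper())


def expand(parent) -> None:
    """Replace char elements below parent with literal characters."""
    prev = None  # last sibling kept so far; its tail receives new text
    for child in list(parent):
        if child.tag != CHAR:
            expand(child)
            prev = child
            continue
        text = resolve(child) + (child.tail or "")
        if prev is None:
            parent.text = (parent.text or "") + text
        else:
            prev.tail = (prev.tail or "") + text
        parent.remove(child)


doc = ET.fromstring(
    f'<doc xmlns:e="{NS}"><p>"<e:char entity="eacute"/>" and '
    '"<e:char name="COPYRIGHT SIGN"/>".</p></doc>'
)
expand(doc)
result = doc.find("p").text
```

After expansion the paragraph contains only text, so downstream tools need no knowledge of the namespace.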

A. Added Latin 1

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Added Latin 1//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-lat1.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
aacute       00E1                small a, acute accent
Aacute       00C1                capital A, acute accent
acirc        00E2                small a, circumflex accent
Acirc        00C2                capital A, circumflex accent
agrave       00E0                small a, grave accent
Agrave       00C0                capital A, grave accent
aring        00E5                small a, ring
Aring        00C5                capital A, ring
atilde       00E3                small a, tilde
Atilde       00C3                capital A, tilde
auml         00E4                small a, dieresis or umlaut mark
Auml         00C4                capital A, dieresis or umlaut mark
aelig        00E6                small ae diphthong (ligature)
AElig        00C6                capital AE diphthong (ligature)
ccedil       00E7                small c, cedilla
Ccedil       00C7                capital C, cedilla
eth          00F0                small eth, Icelandic
ETH          00D0                capital Eth, Icelandic
eacute       00E9                small e, acute accent
Eacute       00C9                capital E, acute accent
ecirc        00EA                small e, circumflex accent
Ecirc        00CA                capital E, circumflex accent
egrave       00E8                small e, grave accent
Egrave       00C8                capital E, grave accent
euml         00EB                small e, dieresis or umlaut mark
Euml         00CB                capital E, dieresis or umlaut mark
iacute       00ED                small i, acute accent
Iacute       00CD                capital I, acute accent
icirc        00EE                small i, circumflex accent
Icirc        00CE                capital I, circumflex accent
igrave       00EC                small i, grave accent
Igrave       00CC                capital I, grave accent
iuml         00EF                small i, dieresis or umlaut mark
Iuml         00CF                capital I, dieresis or umlaut mark
ntilde       00F1                small n, tilde
Ntilde       00D1                capital N, tilde
oacute       00F3                small o, acute accent
Oacute       00D3                capital O, acute accent
ocirc        00F4                small o, circumflex accent
Ocirc        00D4                capital O, circumflex accent
ograve       00F2                small o, grave accent
Ograve       00D2                capital O, grave accent
oslash       00F8                small o, slash
Oslash       00D8                capital O, slash
otilde       00F5                small o, tilde
Otilde       00D5                capital O, tilde
ouml         00F6                small o, dieresis or umlaut mark
Ouml         00D6                capital O, dieresis or umlaut mark
szlig        00DF                small sharp s, German (sz ligature)
thorn        00FE                small thorn, Icelandic
THORN        00DE                capital THORN, Icelandic
uacute       00FA                small u, acute accent
Uacute       00DA                capital U, acute accent
ucirc        00FB                small u, circumflex accent
Ucirc        00DB                capital U, circumflex accent
ugrave       00F9                small u, grave accent
Ugrave       00D9                capital U, grave accent
uuml         00FC                small u, dieresis or umlaut mark
Uuml         00DC                capital U, dieresis or umlaut mark
yacute       00FD                small y, acute accent
Yacute       00DD                capital Y, acute accent
yuml         00FF                small y, dieresis or umlaut mark
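
The code points in these tables are hexadecimal. As a small illustration, the following Python sketch converts a few rows of the Added Latin 1 table into characters and looks up their Unicode names. The selection of rows and the helper name `to_char` are assumptions made for the example.

```python
import unicodedata

# A few (entity name, code point) rows copied from the table above.
ROWS = [("aacute", "00E1"), ("Aacute", "00C1"),
        ("szlig", "00DF"), ("yuml", "00FF")]


def to_char(code_point_hex: str) -> str:
    """Convert a hexadecimal code point such as '00E1' to a character."""
    return chr(int(code_point_hex, 16))


decoded = {name: to_char(cp) for name, cp in ROWS}
names = {name: unicodedata.name(ch) for name, ch in decoded.items()}
```

The Unicode names obtained this way can be compared with the table descriptions as a rough consistency check.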

B. Added Latin 2

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Added Latin 2//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-lat2.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
abreve       0103                small a, breve
Abreve       0102                capital A, breve
amacr        0101                small a, macron
Amacr        0100                capital A, macron
aogon        0105                small a, ogonek
Aogon        0104                capital A, ogonek
cacute       0107                small c, acute accent
Cacute       0106                capital C, acute accent
ccaron       010D                small c, caron
Ccaron       010C                capital C, caron
ccirc        0109                small c, circumflex accent
Ccirc        0108                capital C, circumflex accent
cdot         010B                small c, dot above
Cdot         010A                capital C, dot above
dcaron       010F                small d, caron
Dcaron       010E                capital D, caron
dstrok       0111                small d, stroke
Dstrok       0110                capital D, stroke
ecaron       011B                small e, caron
Ecaron       011A                capital E, caron
edot         0117                small e, dot above
Edot         0116                capital E, dot above
emacr        0113                small e, macron
Emacr        0112                capital E, macron
eogon        0119                small e, ogonek
Eogon        0118                capital E, ogonek
gacute       01F5                small g, acute accent
gbreve       011F                small g, breve
Gbreve       011E                capital G, breve
Gcedil       0122                capital G, cedilla
gcirc        011D                small g, circumflex accent
Gcirc        011C                capital G, circumflex accent
gdot         0121                small g, dot above
Gdot         0120                capital G, dot above
hcirc        0125                small h, circumflex accent
Hcirc        0124                capital H, circumflex accent
hstrok       0127                small h, stroke
Hstrok       0126                capital H, stroke
Idot         0130                capital I, dot above
Imacr        012A                capital I, macron
imacr        012B                small i, macron
ijlig        0133                small ij ligature
IJlig        0132                capital IJ ligature
inodot       0131                small i, no dot
iogon        012F                small i, ogonek
Iogon        012E                capital I, ogonek
itilde       0129                small i, tilde
Itilde       0128                capital I, tilde
jcirc        0135                small j, circumflex accent
Jcirc        0134                capital J, circumflex accent
kcedil       0137                small k, cedilla
Kcedil       0136                capital K, cedilla
kgreen       0138                small k, Greenlandic
lacute       013A                small l, acute accent
Lacute       0139                capital L, acute accent
lcaron       013E                small l, caron
Lcaron       013D                capital L, caron
lcedil       013C                small l, cedilla
Lcedil       013B                capital L, cedilla
lmidot       0140                small l, middle dot
Lmidot       013F                capital L, middle dot
lstrok       0142                small l, stroke
Lstrok       0141                capital L, stroke
nacute       0144                small n, acute accent
Nacute       0143                capital N, acute accent
eng          014B                small eng, Lapp
ENG          014A                capital ENG, Lapp
napos        0149                small n, apostrophe
ncaron       0148                small n, caron
Ncaron       0147                capital N, caron
ncedil       0146                small n, cedilla
Ncedil       0145                capital N, cedilla
odblac       0151                small o, double acute accent
Odblac       0150                capital O, double acute accent
Omacr        014C                capital O, macron
omacr        014D                small o, macron
oelig        0153                small oe ligature
OElig        0152                capital OE ligature
racute       0155                small r, acute accent
Racute       0154                capital R, acute accent
rcaron       0159                small r, caron
Rcaron       0158                capital R, caron
rcedil       0157                small r, cedilla
Rcedil       0156                capital R, cedilla
sacute       015B                small s, acute accent
Sacute       015A                capital S, acute accent
scaron       0161                small s, caron
Scaron       0160                capital S, caron
scedil       015F                small s, cedilla
Scedil       015E                capital S, cedilla
scirc        015D                small s, circumflex accent
Scirc        015C                capital S, circumflex accent
tcaron       0165                small t, caron
Tcaron       0164                capital T, caron
tcedil       0163                small t, cedilla
Tcedil       0162                capital T, cedilla
tstrok       0167                small t, stroke
Tstrok       0166                capital T, stroke
ubreve       016D                small u, breve
Ubreve       016C                capital U, breve
udblac       0171                small u, double acute accent
Udblac       0170                capital U, double acute accent
umacr        016B                small u, macron
Umacr        016A                capital U, macron
uogon        0173                small u, ogonek
Uogon        0172                capital U, ogonek
uring        016F                small u, ring
Uring        016E                capital U, ring
utilde       0169                small u, tilde
Utilde       0168                capital U, tilde
wcirc        0175                small w, circumflex accent
Wcirc        0174                capital W, circumflex accent
ycirc        0177                small y, circumflex accent
Ycirc        0176                capital Y, circumflex accent
Yuml         0178                capital Y, dieresis or umlaut mark
zacute       017A                small z, acute accent
Zacute       0179                capital Z, acute accent
zcaron       017E                small z, caron
Zcaron       017D                capital Z, caron
zdot         017C                small z, dot above
Zdot         017B                capital Z, dot above

C. Greek Letters

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Greek Letters//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-grk1.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
agr          03B1                small alpha, Greek
Agr          0391                capital Alpha, Greek
bgr          03B2                small beta, Greek
Bgr          0392                capital Beta, Greek
ggr          03B3                small gamma, Greek
Ggr          0393                capital Gamma, Greek
dgr          03B4                small delta, Greek
Dgr          0394                capital Delta, Greek
egr          03B5                small epsilon, Greek
Egr          0395                capital Epsilon, Greek
zgr          03B6                small zeta, Greek
Zgr          0396                capital Zeta, Greek
eegr         03B7                small eta, Greek
EEgr         0397                capital Eta, Greek
thgr         03B8                small theta, Greek
THgr         0398                capital Theta, Greek
igr          03B9                small iota, Greek
Igr          0399                capital Iota, Greek
kgr          03BA                small kappa, Greek
Kgr          039A                capital Kappa, Greek
lgr          03BB                small lambda, Greek
Lgr          039B                capital Lambda, Greek
mgr          03BC                small mu, Greek
Mgr          039C                capital Mu, Greek
ngr          03BD                small nu, Greek
Ngr          039D                capital Nu, Greek
xgr          03BE                small xi, Greek
Xgr          039E                capital Xi, Greek
ogr          03BF                small omicron, Greek
Ogr          039F                capital Omicron, Greek
pgr          03C0                small pi, Greek
Pgr          03A0                capital Pi, Greek
rgr          03C1                small rho, Greek
Rgr          03A1                capital Rho, Greek
sgr          03C3                small sigma, Greek
Sgr          03A3                capital Sigma, Greek
sfgr         03C2                final small sigma, Greek
tgr          03C4                small tau, Greek
Tgr          03A4                capital Tau, Greek
ugr          03C5                small upsilon, Greek
Ugr          03A5                capital Upsilon, Greek
phgr         03C6                small phi, Greek
PHgr         03A6                capital Phi, Greek
khgr         03C7                small chi, Greek
KHgr         03A7                capital Chi, Greek
psgr         03C8                small psi, Greek
PSgr         03A8                capital Psi, Greek
ohgr         03C9                small omega, Greek
OHgr         03A9                capital Omega, Greek

D. Monotoniko Greek

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Monotoniko Greek//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-grk2.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
aacgr        03AC                small alpha, accent, Greek
Aacgr        0386                capital Alpha, accent, Greek
eacgr        03AD                small epsilon, accent, Greek
Eacgr        0388                capital Epsilon, accent, Greek
eeacgr       03AE                small eta, accent, Greek
EEacgr       0389                capital Eta, accent, Greek
idigr        03CA                small iota, dieresis, Greek
Idigr        03AA                capital Iota, dieresis, Greek
iacgr        03AF                small iota, accent, Greek
Iacgr        038A                capital Iota, accent, Greek
idiagr       0390                small iota, dieresis, accent, Greek
oacgr        03CC                small omicron, accent, Greek
Oacgr        038C                capital Omicron, accent, Greek
udigr        03CB                small upsilon, dieresis, Greek
Udigr        03AB                capital Upsilon, dieresis, Greek
uacgr        03CD                small upsilon, accent, Greek
Uacgr        038E                capital Upsilon, accent, Greek
udiagr       03B0                small upsilon, dieresis, accent, Greek
ohacgr       03CE                small omega, accent, Greek
OHacgr       038F                capital Omega, accent, Greek

E. Russian Cyrillic

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Russian Cyrillic//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-cyr1.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
acy          0430                small a, Cyrillic
Acy          0410                capital A, Cyrillic
bcy          0431                small be, Cyrillic
Bcy          0411                capital BE, Cyrillic
vcy          0432                small ve, Cyrillic
Vcy          0412                capital VE, Cyrillic
gcy          0433                small ghe, Cyrillic
Gcy          0413                capital GHE, Cyrillic
dcy          0434                small de, Cyrillic
Dcy          0414                capital DE, Cyrillic
iecy         0435                small ie, Cyrillic
IEcy         0415                capital IE, Cyrillic
iocy         0451                small io, Russian
IOcy         0401                capital IO, Russian
zhcy         0436                small zhe, Cyrillic
ZHcy         0416                capital ZHE, Cyrillic
zcy          0437                small ze, Cyrillic
Zcy          0417                capital ZE, Cyrillic
icy          0438                small i, Cyrillic
Icy          0418                capital I, Cyrillic
jcy          0439                small short i, Cyrillic
Jcy          0419                capital short I, Cyrillic
kcy          043A                small ka, Cyrillic
Kcy          041A                capital KA, Cyrillic
lcy          043B                small el, Cyrillic
Lcy          041B                capital EL, Cyrillic
mcy          043C                small em, Cyrillic
Mcy          041C                capital EM, Cyrillic
ncy          043D                small en, Cyrillic
Ncy          041D                capital EN, Cyrillic
ocy          043E                small o, Cyrillic
Ocy          041E                capital O, Cyrillic
pcy          043F                small pe, Cyrillic
Pcy          041F                capital PE, Cyrillic
rcy          0440                small er, Cyrillic
Rcy          0420                capital ER, Cyrillic
scy          0441                small es, Cyrillic
Scy          0421                capital ES, Cyrillic
tcy          0442                small te, Cyrillic
Tcy          0422                capital TE, Cyrillic
ucy          0443                small u, Cyrillic
Ucy          0423                capital U, Cyrillic
fcy          0444                small ef, Cyrillic
Fcy          0424                capital EF, Cyrillic
khcy         0445                small ha, Cyrillic
KHcy         0425                capital HA, Cyrillic
tscy         0446                small tse, Cyrillic
TScy         0426                capital TSE, Cyrillic
chcy         0447                small che, Cyrillic
CHcy         0427                capital CHE, Cyrillic
shcy         0448                small sha, Cyrillic
SHcy         0428                capital SHA, Cyrillic
shchcy       0449                small shcha, Cyrillic
SHCHcy       0429                capital SHCHA, Cyrillic
hardcy       044A                small hard sign, Cyrillic
HARDcy       042A                capital HARD sign, Cyrillic
ycy          044B                small yeru, Cyrillic
Ycy          042B                capital YERU, Cyrillic
softcy       044C                small soft sign, Cyrillic
SOFTcy       042C                capital SOFT sign, Cyrillic
ecy          044D                small e, Cyrillic
Ecy          042D                capital E, Cyrillic
yucy         044E                small yu, Cyrillic
YUcy         042E                capital YU, Cyrillic
yacy         044F                small ya, Cyrillic
YAcy         042F                capital YA, Cyrillic
numero       2116                numero sign

F. Non-Russian Cyrillic

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Non-Russian Cyrillic//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-cyr2.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
djcy         0452                small dje, Serbian
DJcy         0402                capital DJE, Serbian
gjcy         0453                small gje, Macedonian
GJcy         0403                capital GJE Macedonian
jukcy        0454                small je, Ukrainian
Jukcy        0404                capital JE, Ukrainian
dscy         0455                small dse, Macedonian
DScy         0405                capital DSE, Macedonian
iukcy        0456                small i, Ukrainian
Iukcy        0406                capital I, Ukrainian
yicy         0457                small yi, Ukrainian
YIcy         0407                capital YI, Ukrainian
jsercy       0458                small je, Serbian
Jsercy       0408                capital JE, Serbian
ljcy         0459                small lje, Serbian
LJcy         0409                capital LJE, Serbian
njcy         045A                small nje, Serbian
NJcy         040A                capital NJE, Serbian
tshcy        045B                small tshe, Serbian
TSHcy        040B                capital TSHE, Serbian
kjcy         045C                small kje Macedonian
KJcy         040C                capital KJE, Macedonian
ubrcy        045E                small u, Byelorussian
Ubrcy        040E                capital U, Byelorussian
dzcy         045F                small dze, Serbian
DZcy         040F                capital dze, Serbian

G. Numeric and Special Graphic

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-num.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
half         00BD                fraction one-half
frac12       00BD                fraction one-half
frac14       00BC                fraction one-quarter
frac34       00BE                fraction three-quarters
frac18       215B                fraction one-eighth
frac38       215C                fraction three-eighths
frac58       215D                fraction five-eighths
frac78       215E                fraction seven-eighths
sup1         00B9                superscript one
sup2         00B2                superscript two
sup3         00B3                superscript three
plus         002B                plus sign [Binary operator]
plusmn       00B1                plus-or-minus sign [Binary operator]
lt           003C                less-than sign [Relation]
equals       003D                equals sign [Relation]
gt           003E                greater-than sign [Relation]
divide       00F7                divide sign [Binary operator]
times        00D7                multiply sign [Binary operator]
curren       00A4                general currency sign
pound        00A3                pound sign
dollar       0024                dollar sign
cent         00A2                cent sign
yen          00A5                yen sign
num          0023                number sign
percnt       0025                percent sign
amp          0026                ampersand
ast          002A                asterisk [Binary operator]
commat       0040                commercial at
lsqb         005B                left square bracket [Opening delimiter]
bsol         005C                reverse solidus
rsqb         005D                right square b