XML Character Entities Version 0.2

OASIS DocBook Technical Committee

Working Draft 19 Mar 2002

This version:
Working Draft: 19 Mar 2002
Previous versions:
Working Draft: 19 Nov 2001
Editor:
Norman Walsh <Norman.Walsh@Sun.COM>

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Abstract

This Standard defines XML encodings of the 19 standard character entity sets defined in Non-normative Annex D of [ISO 8879:1986].

Status of this Document

This Working Draft was approved for publication by the OASIS DocBook Technical Committee. Comments on this document may be sent to docbook@lists.oasis-open.org.

Table of Contents

1. XML Character Entity Sets
1.1. Multi-Character Replacements
1.2. Duplicate Entities
1.3. Entities with no Mapping
1.4. Entities with Substituted Mappings
2. XML Character Elements

Appendixes

A. Added Latin 1
B. Added Latin 2
C. Greek Letters
D. Monotoniko Greek
E. Russian Cyrillic
F. Non-Russian Cyrillic
G. Numeric and Special Graphic
H. Diacritical Marks
I. Publishing
J. Box and Line Drawing
K. General Technical
L. Greek Symbols
M. Alternative Greek Symbols
N. Added Math Symbols: Ordinary
O. Added Math Symbols: Binary Operators
P. Added Math Symbols: Relations
Q. Added Math Symbols: Negated Relations
R. Added Math Symbols: Arrow Relations
S. Added Math Symbols: Delimiters
T. Unicode Glyphs
U. OASIS DocBook Technical Committee (Non-Normative)
References

This Standard defines XML encodings of the standard SGML character entity sets.

Non-normative Annex D of [ISO 8879:1986] defines 19 standard SGML character entity sets: Added Latin 1, Added Latin 2, Greek Letters, Monotoniko Greek, Russian Cyrillic, Non-Russian Cyrillic, Numeric and Special Graphic, Diacritical Marks, Publishing, Box and Line Drawing, General Technical, Greek Symbols, Alternative Greek Symbols, Added Math Symbols: Ordinary, Added Math Symbols: Binary Operators, Added Math Symbols: Relations, Added Math Symbols: Negated Relations, Added Math Symbols: Arrow Relations, Added Math Symbols: Delimiters. The SGML declarations for these entities use the specific character data (SDATA) entity type that is not supported in XML, so alternative XML declarations are necessary.

In XML, the specific character data of most entities can be expressed as a [Unicode] character.

1. XML Character Entity Sets

The character entity sets defined by this Standard are summarized in Appendix A through Appendix S.

In order to use these entities in a document, they must be declared. Entities can be declared in the external subset or the internal subset, as described in [XML 1.0]. An example document, with the declaration in the internal subset, is shown in Example 1.

Example 1. Declaring and Using the ISO Latin 1 Character Entity Set

<!DOCTYPE doc [
<!ENTITY % iso-lat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML"
                    "http://www.oasis-open.org/docbook/xmlcharent/0.2/isolat1.ent">
%iso-lat1;
]>
<doc>
<p>This document declares the ISO Latin 1 Character Entity Set, providing
access to the ISO Latin 1 entities, such as "&eacute;" and "&copy;".</p>
</doc>

Note

Non-validating XML parsers may choose not to process externally declared entities. This Standard does not alter the semantics of XML processors. If a processor does not see the declaration for an entity, it will not be able to report the correct replacement text for that entity.
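
The effect of a missing declaration can be sketched in Python with the standard library's xml.etree.ElementTree. This is an illustration only: rather than loading isolat1.ent, it declares eacute directly in the internal subset (an assumption made to keep the sketch self-contained; the code point 00E9 is the one listed in Appendix A), and the helper name parse_text is hypothetical.

```python
import xml.etree.ElementTree as ET

# eacute is declared inline here rather than loaded from isolat1.ent;
# that is a simplification for this sketch, not how the entity set is shipped.
DECLARED = (
    '<!DOCTYPE doc [\n'
    '<!ENTITY eacute "&#x00E9;">\n'
    ']>\n'
    '<doc><p>caf&eacute;</p></doc>'
)

# The same document without any declaration for eacute.
UNDECLARED = '<doc><p>caf&eacute;</p></doc>'


def parse_text(document):
    """Return the text of the first p element, or None if parsing fails."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # An entity reference with no visible declaration cannot be resolved.
        return None
    return root.find('p').text


declared_text = parse_text(DECLARED)      # the reference is expanded
undeclared_text = parse_text(UNDECLARED)  # the parser rejects the document
```

With this particular parser the undeclared reference is reported as an error; other processors may behave differently when a declaration is merely unread rather than absent.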

1.1. Multi-Character Replacements

The replacement text of some entities includes more than a single Unicode character. Some characters are composed with the "combining reverse solidus overlay" (20E5) and some are composed with a variation selector (FE00, FE01, …).
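
One practical consequence is that software cannot assume one entity yields exactly one code point. The Python sketch below inspects two replacement texts of this kind. The base characters are hypothetical and chosen only for illustration; they are not taken from the entity sets, and the function name describe is likewise an assumption.

```python
import unicodedata

# Hypothetical replacement texts: a base character followed by a combining
# overlay or a variation selector.
overlay = '\u003D\u20E5'   # EQUALS SIGN + COMBINING REVERSE SOLIDUS OVERLAY
variant = '\u2268\uFE00'   # LESS-THAN BUT NOT EQUAL TO + VARIATION SELECTOR-1


def describe(replacement):
    """List (code point, character name) for each character in the text."""
    return [(format(ord(ch), '04X'), unicodedata.name(ch))
            for ch in replacement]


overlay_parts = describe(overlay)
variant_parts = describe(variant)
```

Each replacement text here is two code points long even though it is meant to render as a single glyph.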

1.2. Duplicate Entities

For historical reasons, the inodot entity is defined in both iso-lat2.ent and iso-amso.ent. If both entity sets are included, some parsers will warn about redefinition of this entity. The warning can be ignored.
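
The warning is harmless because, under [XML 1.0], the first declaration of an entity is the binding one. A minimal Python sketch of that rule follows. It assumes two inline declarations of inodot in place of the two entity sets, and the second replacement text is deliberately different (and hypothetical) so that the winning declaration is visible.

```python
import xml.etree.ElementTree as ET

# Two declarations of inodot in one internal subset stand in for including
# both entity sets. The code point 0131 is the one listed in Appendix B.
DOCUMENT = (
    '<!DOCTYPE doc [\n'
    '<!ENTITY inodot "&#x0131;">\n'
    '<!ENTITY inodot "SECOND">\n'
    ']>\n'
    '<doc>&inodot;</doc>'
)

# The parser keeps the first declaration and ignores the later one.
resolved = ET.fromstring(DOCUMENT).text
```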

1.3. Entities with no Mapping

A small number of entities have no [Unicode] representation. These entities are all mapped to the Unicode character "FFFD", the "replacement character".

Entity Name  Entity Set    Description
fjlig        iso-pub.ent   Small fj ligature
gnap         iso-amsn.ent  Greater, not approximate
jnodot       iso-amso.ent  Small j, no dot
lnap         iso-amsn.ent  Less, not approximate
lpargt       iso-amsc.ent  Greater than, left arc
nsmid        iso-amsn.ent  Negated short mid
prnE         iso-amsn.ent  Precedes, not double equals
rpargt       iso-amsc.ent  Right paren, greater than
scnE         iso-amsn.ent  Succeeds, not double equals
smid         iso-amsr.ent  shortmid r
vsubnE       iso-amsn.ent  Subset not double equals, variant

Users needing these characters will have to rely on the private use area or other non-portable mechanisms to access them.
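
A document author may want to know in advance which references will come out as the replacement character. The Python sketch below checks a list of entity names against the names in the table above; the function name unmapped_used is hypothetical.

```python
# Entity names that this Standard maps to U+FFFD (from the table above).
UNMAPPED = frozenset({
    'fjlig', 'gnap', 'jnodot', 'lnap', 'lpargt', 'nsmid',
    'prnE', 'rpargt', 'scnE', 'smid', 'vsubnE',
})


def unmapped_used(entity_names):
    """Return the used entity names that resolve to U+FFFD, sorted."""
    return sorted(set(entity_names) & UNMAPPED)


# Names as they might be collected from the references in one document.
flagged = unmapped_used(['eacute', 'fjlig', 'smid', 'fjlig'])
```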

1.4. Entities with Substituted Mappings

A few more entities have no specific [Unicode] representation, but a reasonable substitution has been used for each:

Entity Name  Entity Set    Substitution  Description
bepsi        iso-amsr.ent  220D          Back epsilon: such that
ges          iso-amsr.ent  2265          Greater-or-equal, slanted
gvnE         iso-amsn.ent  2269          Gt, vert, not double equals
iff          iso-tech.ent  21D4          If and only if
les          iso-amsr.ent  2264          Less-than-or-equal, slanted
lozf         iso-pub.ent   2726          Lozenge, filled
lvnE         iso-amsn.ent  2268          Less, vert, not double equals
nge          iso-amsn.ent  2271          Neither greater-than nor equal to
nle          iso-amsn.ent  2270          Not less-than-or-equal
npre         iso-amsn.ent  22E0          Not precedes, equals
nsce         iso-amsn.ent  22E1          Not succeeds, equals
nspar        iso-amsn.ent  2226          Not short parallel
pre          iso-amsr.ent  227C          Precedes, equals
spar         iso-amsr.ent  2225          Short parallel
ssetmn       iso-amsb.ent  2216          Small set minus (reverse solidus)
star         iso-pub.ent   22C6          Star operator
starf        iso-pub.ent   2605          Black star
thkap        iso-amsr.ent  2248          Thick approximate
thksim       iso-amsr.ent  223C          Thick similar
vsubne       iso-amsn.ent  228A          Subset, not equals, variant
vsupnE       iso-amsn.ent  228B          Subset not double equals, variant
vsupne       iso-amsn.ent  228B          Superset, not equals, variant
xhArr        iso-amsa.ent  2194          Long left and right double arr
xharr        iso-amsa.ent  2194          Long left and right arr
xlArr        iso-amsa.ent  21D0          Long left double arrow
xrArr        iso-amsa.ent  21D2          Long right double arr
ssmile       iso-amsr.ent  2323          Small smile
sfrown       iso-amsr.ent  2322          Small frown

Users needing alternate glyphs for these characters will have to redefine the entities to use the private use area, or rely on other non-portable mechanisms.

2. XML Character Elements

Named XML entities (except for the five predefined entities) cannot be used if they are not declared. Entity declaration requires either an external or an internal subset. Some classes of applications forbid the occurrence of markup declarations in documents. For these documents, named character entities are inaccessible.

In this section, we introduce an XML vocabulary with the semantics of character entity references. This Standard defines the semantics of elements and attributes declared in the "http://www.oasis-open.org/docbook/xmlcharent/names" namespace.

This namespace contains exactly one element, char. The char element has two attributes, entity and name. They are mutually exclusive.

The entity attribute identifies characters by their character entity names. (The set of valid names is the closed set of names associated with character entity sets defined by this Standard.) Case is significant in entity names.

The name attribute identifies characters by their Unicode character names. (The set of valid names is the set of character names published in the [Unicode] specification, or any later version of that specification.) Case is insignificant in character names.

The [RELAX NG] definition of this namespace is shown in Figure 1.

Figure 1. The RELAX NG Definition of the http://www.oasis-open.org/docbook/xmlcharent/names Namespace

<?xml version="1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/0.9"
         ns="http://www.oasis-open.org/docbook/xmlcharent/names">

<start>
  <element name="char">
    <choice>
      <attribute name="entity">
        <ref name="EntityNames"/>
      </attribute>
      <attribute name="name">
        <ref name="UnicodeNames"/>
      </attribute>
    </choice>
  </element>
</start>

<define name="EntityNames">
  <!-- Logically, this is the list of ISO 9573 Character Entity Names -->
  <!-- For now, just text. -->
  <text/>
</define>

<define name="UnicodeNames">
  <!-- Logically, this is the list of Unicode Character Names -->
  <!-- For now, just text. -->
  <text/>
</define>

</grammar>

Example 2 shows a sample document using this mechanism.

Example 2. Using the Character Names Element

<doc xmlns:e="http://www.oasis-open.org/docbook/xmlcharent/names">
<p>This document uses the character names element to access
character entities, such as "<e:char entity="eacute"/>" and
"<e:char name="COPYRIGHT SIGN"/>".</p>
</doc>
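
How an application might act on char elements can be sketched in Python. This is not a reference implementation: the ENTITIES table holds only two names (with code points from Appendix A) as a stand-in for the closed set of entity names, the function names resolve and expand are hypothetical, and Unicode names are looked up with the standard unicodedata module.

```python
import unicodedata
import xml.etree.ElementTree as ET

NS = 'http://www.oasis-open.org/docbook/xmlcharent/names'
CHAR = '{%s}char' % NS

# A tiny stand-in for the closed set of entity names; a real implementation
# would cover every entity set defined by this Standard.
ENTITIES = {'eacute': '\u00E9', 'aacute': '\u00E1'}


def resolve(element):
    """Return the character identified by one char element."""
    entity = element.get('entity')
    name = element.get('name')
    if (entity is None) == (name is None):
        raise ValueError('exactly one of entity and name is required')
    if entity is not None:
        return ENTITIES[entity]              # case is significant
    return unicodedata.lookup(name.upper())  # case is insignificant


def expand(parent):
    """Replace every char descendant of parent with its character."""
    previous = None
    for child in list(parent):
        if child.tag == CHAR:
            # Splice the character and the element's tail into nearby text.
            text = resolve(child) + (child.tail or '')
            if previous is None:
                parent.text = (parent.text or '') + text
            else:
                previous.tail = (previous.tail or '') + text
            parent.remove(child)
        else:
            expand(child)
            previous = child


DOCUMENT = (
    '<doc xmlns:e="%s"><p>"<e:char entity="eacute"/>" and '
    '"<e:char name="copyright sign"/>"</p></doc>' % NS
)
root = ET.fromstring(DOCUMENT)
expand(root)
expanded = root.find('p').text
```

The sketch rejects an element that carries both attributes or neither, reflecting the rule that entity and name are mutually exclusive.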

The character names element is limited to contexts where elements may occur. In particular, elements may not occur in XML attribute values. Note, however, that internationalization requirements such as bidirectional language support and Ruby already require structure in arbitrary contexts. It is probably an error to use attributes for human-readable content.

A. Added Latin 1

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Added Latin 1//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isolat1.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
aacute       00E1                =small a, acute accent
Aacute       00C1                =capital A, acute accent
acirc        00E2                =small a, circumflex accent
Acirc        00C2                =capital A, circumflex accent
agrave       00E0                =small a, grave accent
Agrave       00C0                =capital A, grave accent
aring        00E5                =small a, ring
Aring        00C5                =capital A, ring
atilde       00E3                =small a, tilde
Atilde       00C3                =capital A, tilde
auml         00E4                =small a, dieresis or umlaut mark
Auml         00C4                =capital A, dieresis or umlaut mark
aelig        00E6                =small ae diphthong (ligature)
AElig        00C6                =capital AE diphthong (ligature)
ccedil       00E7                =small c, cedilla
Ccedil       00C7                =capital C, cedilla
eth          00F0                =small eth, Icelandic
ETH          00D0                =capital Eth, Icelandic
eacute       00E9                =small e, acute accent
Eacute       00C9                =capital E, acute accent
ecirc        00EA                =small e, circumflex accent
Ecirc        00CA                =capital E, circumflex accent
egrave       00E8                =small e, grave accent
Egrave       00C8                =capital E, grave accent
euml         00EB                =small e, dieresis or umlaut mark
Euml         00CB                =capital E, dieresis or umlaut mark
iacute       00ED                =small i, acute accent
Iacute       00CD                =capital I, acute accent
icirc        00EE                =small i, circumflex accent
Icirc        00CE                =capital I, circumflex accent
igrave       00EC                =small i, grave accent
Igrave       00CC                =capital I, grave accent
iuml         00EF                =small i, dieresis or umlaut mark
Iuml         00CF                =capital I, dieresis or umlaut mark
ntilde       00F1                =small n, tilde
Ntilde       00D1                =capital N, tilde
oacute       00F3                =small o, acute accent
Oacute       00D3                =capital O, acute accent
ocirc        00F4                =small o, circumflex accent
Ocirc        00D4                =capital O, circumflex accent
ograve       00F2                =small o, grave accent
Ograve       00D2                =capital O, grave accent
oslash       00F8                =small o, slash
Oslash       00D8                =capital O, slash
otilde       00F5                =small o, tilde
Otilde       00D5                =capital O, tilde
ouml         00F6                =small o, dieresis or umlaut mark
Ouml         00D6                =capital O, dieresis or umlaut mark
szlig        00DF                =small sharp s, German (sz ligature)
thorn        00FE                =small thorn, Icelandic
THORN        00DE                =capital THORN, Icelandic
uacute       00FA                =small u, acute accent
Uacute       00DA                =capital U, acute accent
ucirc        00FB                =small u, circumflex accent
Ucirc        00DB                =capital U, circumflex accent
ugrave       00F9                =small u, grave accent
Ugrave       00D9                =capital U, grave accent
uuml         00FC                =small u, dieresis or umlaut mark
Uuml         00DC                =capital U, dieresis or umlaut mark
yacute       00FD                =small y, acute accent
Yacute       00DD                =capital Y, acute accent
yuml         00FF                =small y, dieresis or umlaut mark
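
To illustrate how rows of this kind relate to XML declarations, the Python sketch below turns a few (entity name, code point) pairs from the table above into entity declarations and checks them by parsing a small document. The declaration layout is an assumption made for illustration; the published isolat1.ent file may be formatted differently. The function name declaration is hypothetical.

```python
import xml.etree.ElementTree as ET

# A few (entity name, code point) rows copied from the table above.
ROWS = [('aacute', '00E1'), ('Aacute', '00C1'), ('eacute', '00E9')]


def declaration(name, code_point):
    """Format one internal general entity declaration (assumed layout)."""
    return '<!ENTITY %s "&#x%s;">' % (name, code_point)


declarations = [declaration(name, code_point) for name, code_point in ROWS]

# Place the declarations in an internal subset and reference each entity.
document = (
    '<!DOCTYPE doc [\n%s\n]>\n<doc>&aacute;&Aacute;&eacute;</doc>'
    % '\n'.join(declarations)
)
text = ET.fromstring(document).text
```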

B. Added Latin 2

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Added Latin 2//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isolat2.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
abreve       0103                =small a, breve
Abreve       0102                =capital A, breve
amacr        0101                =small a, macron
Amacr        0100                =capital A, macron
aogon        0105                =small a, ogonek
Aogon        0104                =capital A, ogonek
cacute       0107                =small c, acute accent
Cacute       0106                =capital C, acute accent
ccaron       010D                =small c, caron
Ccaron       010C                =capital C, caron
ccirc        0109                =small c, circumflex accent
Ccirc        0108                =capital C, circumflex accent
cdot         010B                =small c, dot above
Cdot         010A                =capital C, dot above
dcaron       010F                =small d, caron
Dcaron       010E                =capital D, caron
dstrok       0111                =small d, stroke
Dstrok       0110                =capital D, stroke
ecaron       011B                =small e, caron
Ecaron       011A                =capital E, caron
edot         0117                =small e, dot above
Edot         0116                =capital E, dot above
emacr        0113                =small e, macron
Emacr        0112                =capital E, macron
eogon        0119                =small e, ogonek
Eogon        0118                =capital E, ogonek
gacute       01F5                =small g, acute accent
gbreve       011F                =small g, breve
Gbreve       011E                =capital G, breve
Gcedil       0122                =capital G, cedilla
gcirc        011D                =small g, circumflex accent
Gcirc        011C                =capital G, circumflex accent
gdot         0121                =small g, dot above
Gdot         0120                =capital G, dot above
hcirc        0125                =small h, circumflex accent
Hcirc        0124                =capital H, circumflex accent
hstrok       0127                =small h, stroke
Hstrok       0126                =capital H, stroke
Idot         0130                =capital I, dot above
Imacr        012A                =capital I, macron
imacr        012B                =small i, macron
ijlig        0133                =small ij ligature
IJlig        0132                =capital IJ ligature
inodot       0131                /imath =small i, no dot
iogon        012F                =small i, ogonek
Iogon        012E                =capital I, ogonek
itilde       0129                =small i, tilde
Itilde       0128                =capital I, tilde
jcirc        0135                =small j, circumflex accent
Jcirc        0134                =capital J, circumflex accent
kcedil       0137                =small k, cedilla
Kcedil       0136                =capital K, cedilla
kgreen       0138                =small k, Greenlandic
lacute       013A                =small l, acute accent
Lacute       0139                =capital L, acute accent
lcaron       013E                =small l, caron
Lcaron       013D                =capital L, caron
lcedil       013C                =small l, cedilla
Lcedil       013B                =capital L, cedilla
lmidot       0140                =small l, middle dot
Lmidot       013F                =capital L, middle dot
lstrok       0142                =small l, stroke
Lstrok       0141                =capital L, stroke
nacute       0144                =small n, acute accent
Nacute       0143                =capital N, acute accent
eng          014B                =small eng, Lapp
ENG          014A                =capital ENG, Lapp
napos        0149                =small n, apostrophe
ncaron       0148                =small n, caron
Ncaron       0147                =capital N, caron
ncedil       0146                =small n, cedilla
Ncedil       0145                =capital N, cedilla
odblac       0151                =small o, double acute accent
Odblac       0150                =capital O, double acute accent
Omacr        014C                =capital O, macron
omacr        014D                =small o, macron
oelig        0153                =small oe ligature
OElig        0152                =capital OE ligature
racute       0155                =small r, acute accent
Racute       0154                =capital R, acute accent
rcaron       0159                =small r, caron
Rcaron       0158                =capital R, caron
rcedil       0157                =small r, cedilla
Rcedil       0156                =capital R, cedilla
sacute       015B                =small s, acute accent
Sacute       015A                =capital S, acute accent
scaron       0161                =small s, caron
Scaron       0160                =capital S, caron
scedil       015F                =small s, cedilla
Scedil       015E                =capital S, cedilla
scirc        015D                =small s, circumflex accent
Scirc        015C                =capital S, circumflex accent
tcaron       0165                =small t, caron
Tcaron       0164                =capital T, caron
tcedil       0163                =small t, cedilla
Tcedil       0162                =capital T, cedilla
tstrok       0167                =small t, stroke
Tstrok       0166                =capital T, stroke
ubreve       016D                =small u, breve
Ubreve       016C                =capital U, breve
udblac       0171                =small u, double acute accent
Udblac       0170                =capital U, double acute accent
umacr        016B                =small u, macron
Umacr        016A                =capital U, macron
uogon        0173                =small u, ogonek
Uogon        0172                =capital U, ogonek
uring        016F                =small u, ring
Uring        016E                =capital U, ring
utilde       0169                =small u, tilde
Utilde       0168                =capital U, tilde
wcirc        0175                =small w, circumflex accent
Wcirc        0174                =capital W, circumflex accent
ycirc        0177                =small y, circumflex accent
Ycirc        0176                =capital Y, circumflex accent
Yuml         0178                =capital Y, dieresis or umlaut mark
zacute       017A                =small z, acute accent
Zacute       0179                =capital Z, acute accent
zcaron       017E                =small z, caron
Zcaron       017D                =capital Z, caron
zdot         017C                =small z, dot above
Zdot         017B                =capital Z, dot above

C. Greek Letters

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Greek Letters//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isogrk1.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
agr          03B1                =small alpha, Greek
Agr          0391                =capital Alpha, Greek
bgr          03B2                =small beta, Greek
Bgr          0392                =capital Beta, Greek
ggr          03B3                =small gamma, Greek
Ggr          0393                =capital Gamma, Greek
dgr          03B4                =small delta, Greek
Dgr          0394                =capital Delta, Greek
egr          03B5                =small epsilon, Greek
Egr          0395                =capital Epsilon, Greek
zgr          03B6                =small zeta, Greek
Zgr          0396                =capital Zeta, Greek
eegr         03B7                =small eta, Greek
EEgr         0397                =capital Eta, Greek
thgr         03B8                =small theta, Greek
THgr         0398                =capital Theta, Greek
igr          03B9                =small iota, Greek
Igr          0399                =capital Iota, Greek
kgr          03BA                =small kappa, Greek
Kgr          039A                =capital Kappa, Greek
lgr          03BB                =small lambda, Greek
Lgr          039B                =capital Lambda, Greek
mgr          03BC                =small mu, Greek
Mgr          039C                =capital Mu, Greek
ngr          03BD                =small nu, Greek
Ngr          039D                =capital Nu, Greek
xgr          03BE                =small xi, Greek
Xgr          039E                =capital Xi, Greek
ogr          03BF                =small omicron, Greek
Ogr          039F                =capital Omicron, Greek
pgr          03C0                =small pi, Greek
Pgr          03A0                =capital Pi, Greek
rgr          03C1                =small rho, Greek
Rgr          03A1                =capital Rho, Greek
sgr          03C3                =small sigma, Greek
Sgr          03A3                =capital Sigma, Greek
sfgr         03C2                =final small sigma, Greek
tgr          03C4                =small tau, Greek
Tgr          03A4                =capital Tau, Greek
ugr          03C5                =small upsilon, Greek
Ugr          03A5                =capital Upsilon, Greek
phgr         03C6                =small phi, Greek
PHgr         03A6                =capital Phi, Greek
khgr         03C7                =small chi, Greek
KHgr         03A7                =capital Chi, Greek
psgr         03C8                =small psi, Greek
PSgr         03A8                =capital Psi, Greek
ohgr         03C9                =small omega, Greek
OHgr         03A9                =capital Omega, Greek

D. Monotoniko Greek

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Monotoniko Greek//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isogrk2.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
aacgr        03AC                =small alpha, accent, Greek
Aacgr        0386                =capital Alpha, accent, Greek
eacgr        03AD                =small epsilon, accent, Greek
Eacgr        0388                =capital Epsilon, accent, Greek
eeacgr       03AE                =small eta, accent, Greek
EEacgr       0389                =capital Eta, accent, Greek
idigr        03CA                =small iota, dieresis, Greek
Idigr        03AA                =capital Iota, dieresis, Greek
iacgr        03AF                =small iota, accent, Greek
Iacgr        038A                =capital Iota, accent, Greek
idiagr       0390                =small iota, dieresis, accent, Greek
oacgr        03CC                =small omicron, accent, Greek
Oacgr        038C                =capital Omicron, accent, Greek
udigr        03CB                =small upsilon, dieresis, Greek
Udigr        03AB                =capital Upsilon, dieresis, Greek
uacgr        03CD                =small upsilon, accent, Greek
Uacgr        038E                =capital Upsilon, accent, Greek
udiagr       03B0                =small upsilon, dieresis, accent, Greek
ohacgr       03CE                =small omega, accent, Greek
OHacgr       038F                =capital Omega, accent, Greek

E. Russian Cyrillic

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Russian Cyrillic//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isocyr1.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
acy          0430                =small a, Cyrillic
Acy          0410                =capital A, Cyrillic
bcy          0431                =small be, Cyrillic
Bcy          0411                =capital BE, Cyrillic
vcy          0432                =small ve, Cyrillic
Vcy          0412                =capital VE, Cyrillic
gcy          0433                =small ghe, Cyrillic
Gcy          0413                =capital GHE, Cyrillic
dcy          0434                =small de, Cyrillic
Dcy          0414                =capital DE, Cyrillic
iecy         0435                =small ie, Cyrillic
IEcy         0415                =capital IE, Cyrillic
iocy         0451                =small io, Russian
IOcy         0401                =capital IO, Russian
zhcy         0436                =small zhe, Cyrillic
ZHcy         0416                =capital ZHE, Cyrillic
zcy          0437                =small ze, Cyrillic
Zcy          0417                =capital ZE, Cyrillic
icy          0438                =small i, Cyrillic
Icy          0418                =capital I, Cyrillic
jcy          0439                =small short i, Cyrillic
Jcy          0419                =capital short I, Cyrillic
kcy          043A                =small ka, Cyrillic
Kcy          041A                =capital KA, Cyrillic
lcy          043B                =small el, Cyrillic
Lcy          041B                =capital EL, Cyrillic
mcy          043C                =small em, Cyrillic
Mcy          041C                =capital EM, Cyrillic
ncy          043D                =small en, Cyrillic
Ncy          041D                =capital EN, Cyrillic
ocy          043E                =small o, Cyrillic
Ocy          041E                =capital O, Cyrillic
pcy          043F                =small pe, Cyrillic
Pcy          041F                =capital PE, Cyrillic
rcy          0440                =small er, Cyrillic
Rcy          0420                =capital ER, Cyrillic
scy          0441                =small es, Cyrillic
Scy          0421                =capital ES, Cyrillic
tcy          0442                =small te, Cyrillic
Tcy          0422                =capital TE, Cyrillic
ucy          0443                =small u, Cyrillic
Ucy          0423                =capital U, Cyrillic
fcy          0444                =small ef, Cyrillic
Fcy          0424                =capital EF, Cyrillic
khcy         0445                =small ha, Cyrillic
KHcy         0425                =capital HA, Cyrillic
tscy         0446                =small tse, Cyrillic
TScy         0426                =capital TSE, Cyrillic
chcy         0447                =small che, Cyrillic
CHcy         0427                =capital CHE, Cyrillic
shcy         0448                =small sha, Cyrillic
SHcy         0428                =capital SHA, Cyrillic
shchcy       0449                =small shcha, Cyrillic
SHCHcy       0429                =capital SHCHA, Cyrillic
hardcy       044A                =small hard sign, Cyrillic
HARDcy       042A                =capital HARD sign, Cyrillic
ycy          044B                =small yeru, Cyrillic
Ycy          042B                =capital YERU, Cyrillic
softcy       044C                =small soft sign, Cyrillic
SOFTcy       042C                =capital SOFT sign, Cyrillic
ecy          044D                =small e, Cyrillic
Ecy          042D                =capital E, Cyrillic
yucy         044E                =small yu, Cyrillic
YUcy         042E                =capital YU, Cyrillic
yacy         044F                =small ya, Cyrillic
YAcy         042F                =capital YA, Cyrillic
numero       2116                =numero sign

F. Non-Russian Cyrillic

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Non-Russian Cyrillic//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isocyr2.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
djcy         0452                =small dje, Serbian
DJcy         0402                =capital DJE, Serbian
gjcy         0453                =small gje, Macedonian
GJcy         0403                =capital GJE Macedonian
jukcy        0454                =small je, Ukrainian
Jukcy        0404                =capital JE, Ukrainian
dscy         0455                =small dse, Macedonian
DScy         0405                =capital DSE, Macedonian
iukcy        0456                =small i, Ukrainian
Iukcy        0406                =capital I, Ukrainian
yicy         0457                =small yi, Ukrainian
YIcy         0407                =capital YI, Ukrainian
jsercy       0458                =small je, Serbian
Jsercy       0408                =capital JE, Serbian
ljcy         0459                =small lje, Serbian
LJcy         0409                =capital LJE, Serbian
njcy         045A                =small nje, Serbian
NJcy         040A                =capital NJE, Serbian
tshcy        045B                =small tshe, Serbian
TSHcy        040B                =capital TSHE, Serbian
kjcy         045C                =small kje Macedonian
KJcy         040C                =capital KJE, Macedonian
ubrcy        045E                =small u, Byelorussian
Ubrcy        040E                =capital U, Byelorussian
dzcy         045F                =small dze, Serbian
DZcy         040F                =capital dze, Serbian

G. Numeric and Special Graphic

Identifiers for this entity set:

Public identifier: ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML
System identifier: http://www.oasis-open.org/docbook/xmlcharent/0.2/isonum.ent

The following character entities are defined in this entity set:

Entity Name  Unicode Code Point  Description
half         00BD                =fraction one-half
frac12       00BD                =fraction one-half
frac14       00BC                =fraction one-quarter
frac34       00BE                =fraction three-quarters
frac18       215B                =fraction one-eighth
frac38       215C                =fraction three-eighths
frac58       215D                =fraction five-eighths
frac78       215E                =fraction seven-eighths
sup1         00B9                =superscript one
sup2         00B2                =superscript two
sup3         00B3                =superscript three
plus         002B                =plus sign B:
plusmn       00B1                /pm B: =plus-or-minus sign
lt           003C                =less-than sign R:
equals       003D                =equals sign R:
gt           003E                =greater-than sign R:
divide       00F7                /div B: =divide sign
times        00D7                /times B: =multiply sign
curren       00A4                =general currency sign
pound        00A3                =pound sign
dollar       0024                =dollar sign
cent         00A2                =cent sign
yen          00A5                /yen =yen sign
num          0023                =number sign
percnt       0025                =percent sign
amp          0026                =ampersand
ast          002A                /ast B: =asterisk
commat       0040                =commercial at
lsqb         005B                /lbrack O: =left square bracket
bsol         005C                /backslash =reverse solidus
rsqb         005D                /rbrack C: =right square bracket
lcub         007B                /lbrace O: =left curly bracket
horbar       2015                =horizontal bar
verbar       007C                /vert =vertical bar
rcub         007D                /rbrace C: =right curly bracket
micro        00B5                =micro sign
ohm          2126                =ohm sign
deg          00B0                =degree sign
ordm         00BA                =ordinal indicator, masculine
ordf         00AA                =ordinal indicator, feminine
sect         00A7                =section sign
para         00B6                =pilcrow (paragraph sign)
middot       00B7                /centerdot B: =middle dot
larr         2190                /leftarrow /gets A: =leftward arrow
rarr         2192                /rightarrow /to A: =rightward arrow
uarr2191