3.5.2 Character Types
Static Semantics
1
{character type}
An enumeration type is said to be a character type if at least one of
its enumeration literals is a character_literal.
2
{Latin-1} {BMP}
{ISO 10646} {Character}
The predefined type Character is a character type
whose values correspond to the 256 code positions of Row 00 (also known
as Latin-1) of the ISO 10646 Basic Multilingual Plane (BMP). Each of
the graphic characters of Row 00 of the BMP has a corresponding
character_literal in Character. Each of the nongraphic positions of Row 00
(0000-001F and 007F-009F) has a corresponding language-defined name, which
is not usable as an enumeration literal, but which is usable with the
attributes (Wide_)Image and (Wide_)Value; these names are given in the
definition of type Character in A.1, ``The Package Standard'', but are
set in italics. {italics (nongraphic characters)}
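The rule above can be illustrated with a minimal sketch. The procedure name Show_Character_Names and the constant NUL_Char are invented for this illustration; only the attributes Pos, Val, Image, and Value are taken from the language. The sketch assumes that NUL is the language-defined name of position 0, as listed in A.1.

```ada
with Ada.Text_IO;

procedure Show_Character_Names is
   --  Position 0 is nongraphic, so it has no character_literal;
   --  'Val is one way to obtain the value.
   NUL_Char : constant Character := Character'Val (0);
begin
   --  Character has 256 positions, so the last position number is 255.
   Ada.Text_IO.Put_Line (Integer'Image (Character'Pos (Character'Last)));

   --  The image of a graphic character includes the apostrophes.
   Ada.Text_IO.Put_Line (Character'Image ('A'));

   --  The image of a nongraphic character is its language-defined name.
   Ada.Text_IO.Put_Line (Character'Image (NUL_Char));

   --  The same name is accepted by 'Value, although it is not usable
   --  as an enumeration literal.
   if Character'Value ("NUL") = NUL_Char then
      Ada.Text_IO.Put_Line ("NUL is accepted by Value");
   end if;
end Show_Character_Names;
```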
3
{Wide_Character} {BMP} {ISO 10646}
The predefined type Wide_Character
is a character type whose values correspond to the 65536 code positions
of the ISO 10646 Basic Multilingual Plane (BMP). Each of the graphic
characters of the BMP has a corresponding character_literal
in Wide_Character. The first 256 values of Wide_Character have the same
character_literal or language-defined
name as defined for Character. The last 2 values of Wide_Character correspond
to the nongraphic positions FFFE and FFFF of the BMP, and are assigned
the language-defined names FFFE and FFFF. As with the other
language-defined names for nongraphic characters, the names FFFE
and FFFF are usable only with the attributes (Wide_)Image and
(Wide_)Value; they are not usable as enumeration literals. All other
values of Wide_Character are considered graphic characters, and have
a corresponding character_literal.
3.a
Reason: The language-defined
names are not usable as enumeration literals to avoid "polluting"
the name space. Since Wide_Character is defined in Standard, if the names
FFFE and FFFF were usable as enumeration literals, they would hide other
nonoverloadable declarations with the same names in use-d packages.
3.b
ISO 10646 has not defined the
meaning of all of the code positions from 0100 through FFFD, but they
are all considered graphic characters by Ada to simplify the implementation,
and to allow for revisions to ISO 10646. In ISO 10646, FFFE and FFFF
are special, and will never be associated with graphic characters in
any revision.
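As a rough illustration of the relationship between Character and Wide_Character described in paragraph 3, the following sketch checks the size of Wide_Character and the correspondence of one of its first 256 values. The procedure name Show_Wide_Character and the constants C and W are hypothetical. The images of the last two values are deliberately not printed, since they depend on the language version in use.

```ada
with Ada.Text_IO;

procedure Show_Wide_Character is
   C : constant Character := 'A';

   --  The Wide_Character at the same position number as C.
   W : constant Wide_Character := Wide_Character'Val (Character'Pos (C));
begin
   --  65536 code positions, so the last position number is 16#FFFF#.
   if Wide_Character'Pos (Wide_Character'Last) = 16#FFFF# then
      Ada.Text_IO.Put_Line ("Wide_Character has 65536 positions");
   end if;

   --  The first 256 values carry the same literals as Character;
   --  here 'A' resolves to Wide_Character from the type of W.
   if W = 'A' then
      Ada.Text_IO.Put_Line ("same literal at the same position");
   end if;
end Show_Wide_Character;
```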
Implementation Permissions
4
{localization} In
a nonstandard mode, an implementation may provide other interpretations
for the predefined types Character and Wide_Character[, to conform to
local conventions].
Implementation Advice
5
{localization} If
an implementation supports a mode with alternative interpretations for
Character and Wide_Character, the set of graphic characters of Character
should nevertheless remain a proper subset of the set of graphic characters
of Wide_Character. Any character set ``localizations'' should be reflected
in the results of the subprograms defined in the language-defined package
Characters.Handling (see A.3) available in
such a mode. In a mode with an alternative interpretation of Character,
the implementation should also support a corresponding change in what
is a legal identifier_letter.
6
23 The language-defined
library package Characters.Latin_1 (see A.3.3)
includes the declaration of constants denoting control characters, lower
case characters, and special characters of the predefined type Character.
6.a
To be honest: The package
ASCII does the same, but only for the first 128 characters of Character.
Hence, it is an obsolescent package, and we no longer mention it here.
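A short sketch of how the constants mentioned in this note might be used follows. The procedure name Show_Latin_1 and the renaming L1 are invented for the illustration; the constants NUL, DEL, HT, LF, and LC_A are those declared in Characters.Latin_1 (see A.3.3).

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Show_Latin_1 is
   package L1 renames Ada.Characters.Latin_1;
begin
   --  Control characters have no character_literal, so a named
   --  constant is a convenient way to denote them.
   if L1.NUL = Character'Val (0) and then L1.DEL = Character'Val (127) then
      Ada.Text_IO.Put_Line ("control character constants");
   end if;

   --  Lower case letters are declared as LC_A through LC_Z.
   if L1.LC_A = 'a' then
      Ada.Text_IO.Put_Line ("lower case constants");
   end if;

   --  The constants are ordinary Character values and can be
   --  concatenated into strings.
   Ada.Text_IO.Put ("tab" & L1.HT & "separated" & L1.LF);
end Show_Latin_1;
```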
7
24 A conventional character
set such as EBCDIC can be declared as a character type; the internal
codes of the characters can be specified by an enumeration_representation_clause
as explained in clause 13.4.
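The following sketch suggests how such a declaration might look for a small subset of EBCDIC. The type name EBCDIC_Subset and the procedure name Show_Custom_Codes are hypothetical, and only three letters are listed; a complete type would enumerate every character. The internal codes shown are the usual EBCDIC assignments for these letters.

```ada
with Ada.Text_IO;

procedure Show_Custom_Codes is
   --  Hypothetical three-letter subset, for illustration only.
   type EBCDIC_Subset is ('A', 'B', 'C');

   --  Internal codes are given by an enumeration_representation_clause.
   for EBCDIC_Subset use ('A' => 16#C1#, 'B' => 16#C2#, 'C' => 16#C3#);

   Letter : constant EBCDIC_Subset := 'B';
begin
   --  'Pos yields the position number (1 for 'B'), not the
   --  internal code (16#C2#).
   Ada.Text_IO.Put_Line (Integer'Image (EBCDIC_Subset'Pos (Letter)));

   --  'Image yields the literal, including the apostrophes.
   Ada.Text_IO.Put_Line (EBCDIC_Subset'Image (Letter));
end Show_Custom_Codes;
```

The representation clause changes only the internal codes; position numbers and the ordering of the literals are unaffected.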
Examples
8
Example of a character type:
9
type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
Inconsistencies With Ada 83
9.a
{inconsistencies with Ada 83}
The declaration of Wide_Character in package
Standard hides use-visible declarations with the same defining identifier.
In the unlikely event that an Ada 83 program had depended on such a use-visible
declaration, and the program remains legal after the substitution of
Standard.Wide_Character, the meaning of the program will be different.
Incompatibilities With Ada 83
9.b
{incompatibilities with Ada 83}
The presence of Wide_Character in package
Standard means that an expression such as
9.c
'a' = 'b'
9.d
is ambiguous in Ada 95, whereas
in Ada 83 both literals could be resolved to be of type Character.
9.e
The change in visibility rules
(see 4.2) for character literals means that
additional qualification might be necessary to resolve expressions involving
overloaded subprograms and character literals.
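One way to resolve the ambiguity described in 9.b through 9.d is to qualify one of the operands, as in this minimal sketch. The procedure name Show_Qualification is invented for the illustration.

```ada
with Ada.Text_IO;

procedure Show_Qualification is
begin
   --  Qualifying the left operand selects the "=" of Character, which
   --  in turn fixes the type of the right operand.
   if Character'('a') = 'b' then
      Ada.Text_IO.Put_Line ("equal");
   else
      Ada.Text_IO.Put_Line ("not equal");
   end if;
end Show_Qualification;
```

Qualifying either operand is sufficient; qualifying both is harmless but redundant.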
Extensions to Ada 83
9.f
{extensions to Ada 83}
The type Character has been extended to have 256
positions, and the type Wide_Character has been added. Note that this
change was already approved by the ARG for Ada 83 conforming compilers.
9.g
The rules for referencing character
literals are changed (see 4.2), so that the
declaration of the character type need not be directly visible to use
its literals, similar to null and string literals. Context is
used to resolve their type.
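The effect of this change can be sketched as follows. The package name Numerals and the procedure name Show_Literal_Visibility are hypothetical; the type Roman_Digit is the one from the Examples above. The sketch assumes no use clause for Numerals is in effect.

```ada
with Ada.Text_IO;

procedure Show_Literal_Visibility is
   package Numerals is
      type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
   end Numerals;

   --  No use clause for Numerals: the literal 'X' is resolved from the
   --  expected type of the initialization expression.
   Ten : constant Numerals.Roman_Digit := 'X';
begin
   Ada.Text_IO.Put_Line (Numerals.Roman_Digit'Image (Ten));
end Show_Literal_Visibility;
```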