2.3 Identifiers
1
Identifiers are used as names.
Syntax
2/2
{AI95-00285-01} {AI95-00395-01}
identifier ::=
identifier_start {identifier_start | identifier_extend}
3/2
{AI95-00285-01} {AI95-00395-01}
identifier_start ::=
letter_uppercase
| letter_lowercase
| letter_titlecase
| letter_modifier
| letter_other
| number_letter
3.1/2
{AI95-00285-01} {AI95-00395-01}
identifier_extend ::=
mark_non_spacing
| mark_spacing_combining
| number_decimal
| punctuation_connector
| other_format
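The two syntax rules above can be sketched as a character-category check. The following is a minimal Python illustration, not part of the standard. It assumes that the category names correspond to the Unicode general categories Lu, Ll, Lt, Lm, Lo, Nl (identifier_start) and Mn, Mc, Nd, Pc, Cf (identifier_extend), and it relies on the Unicode tables shipped with Python, which are newer than ISO 10646:2003. The names IDENTIFIER_START, IDENTIFIER_EXTEND and matches_identifier_syntax are hypothetical.

```python
import unicodedata

# Unicode general-category codes assumed to correspond to the syntax categories.
IDENTIFIER_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
IDENTIFIER_EXTEND = {"Mn", "Mc", "Nd", "Pc", "Cf"}


def matches_identifier_syntax(text: str) -> bool:
    """Check only the grammar: identifier_start {identifier_start | identifier_extend}."""
    if not text:
        return False
    categories = [unicodedata.category(ch) for ch in text]
    # The first character must be an identifier_start.
    if categories[0] not in IDENTIFIER_START:
        return False
    # Every later character may be either an identifier_start or an identifier_extend.
    return all(
        cat in IDENTIFIER_START or cat in IDENTIFIER_EXTEND
        for cat in categories[1:]
    )
```

This check covers the grammar only. A string such as a__b passes it, because the restriction on consecutive punctuation_connector characters is a separate rule and not part of the syntax.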
4/2
{AI95-00395-01}
After eliminating the characters in category other_format,
an identifier shall not contain two consecutive
characters in category punctuation_connector, or end with a character
in that category.
4.a/2
Reason: This rule was stated in the syntax in Ada 95, but doing so
has become too complex in Ada 2005. Since other_format characters usually
do not display, we do not want to count them as separating two underscores.
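As an illustration of the rule on punctuation_connector characters, here is a small Python sketch. It assumes that other_format corresponds to Unicode general category Cf and punctuation_connector to Pc. The function name violates_connector_rule is hypothetical.

```python
import unicodedata


def violates_connector_rule(identifier: str) -> bool:
    """Return True if the punctuation_connector rule of paragraph 4/2 is broken."""
    # Eliminate characters in category other_format (assumed to be Cf) first.
    visible = [ch for ch in identifier if unicodedata.category(ch) != "Cf"]
    is_connector = [unicodedata.category(ch) == "Pc" for ch in visible]
    # The identifier must not end with a connector.
    if is_connector and is_connector[-1]:
        return True
    # The identifier must not contain two consecutive connectors.
    return any(a and b for a, b in zip(is_connector, is_connector[1:]))
```

Because the other_format characters are dropped before the check, two underscores separated only by a ZERO WIDTH JOINER (U+200D) still count as consecutive, which is the behavior the Reason above describes.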
Static Semantics
5/2
{AI95-00285-01}
Two identifiers are considered the same if they consist of the
same sequence of characters after applying the following transformations
(in this order):
5.1/2
- {AI95-00285-01}
The characters in category other_format
are eliminated.
5.2/2
- {AI95-00285-01}
The remaining sequence of characters is converted to upper case.
5.a/2
This paragraph was deleted.
5.3/2
{AI95-00395-01}
After applying these transformations, an identifier
shall not be identical to a reserved word (in upper case).
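The comparison described in this subclause can be sketched as a normalization step followed by a plain string comparison. The Python below is a minimal illustration under stated assumptions: other_format is taken to be Unicode category Cf, the second transformation is taken to be the upper case conversion referred to in 5.3/2 and 5.c/2, and Python's locale-independent str.upper stands in for that conversion. RESERVED_UPPER is a deliberately tiny, hypothetical subset of the reserved words, and all function names are invented for the example.

```python
import unicodedata

# Hypothetical subset of the reserved words, stored in upper case.
RESERVED_UPPER = {"IF", "ACCESS", "BEGIN", "END"}


def normalize(identifier: str) -> str:
    """Apply the transformations in order: drop other_format, then upper-case."""
    kept = "".join(ch for ch in identifier if unicodedata.category(ch) != "Cf")
    return kept.upper()


def same_identifier(left: str, right: str) -> bool:
    """Two identifiers are the same if their normalized forms are equal."""
    return normalize(left) == normalize(right)


def clashes_with_reserved_word(identifier: str) -> bool:
    """True if the normalized form is identical to a reserved word in upper case."""
    return normalize(identifier) in RESERVED_UPPER
```

With this sketch, Page_Count and PAGE_COUNT compare as the same identifier, while Page_Count and PageCount do not, since the underline is not removed by either transformation.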
5.b/2
Implementation Note:
We match the reserved words after doing these transformations so
that the rules for identifiers and reserved
words are the same. (This allows other_format
characters, which usually don't display, in a reserved word without changing
it to an identifier.) Since a compiler usually
will lexically process identifiers and reserved
words the same way (often with the same code), this will prevent a lot
of headaches.
5.c/2
Ramification: The
rules for reserved words differ in one way: they define case conversion
on letters rather than sequences. This means that some unusual sequences
are neither identifiers nor reserved words.
For instance, “ıf” and “acceß” have
upper case conversions of “IF” and “ACCESS” respectively.
These are not identifiers, because the transformed
values are identical to a reserved word. But they are not reserved words,
either, because the original values do not match any reserved word as
defined or with any number of characters of the reserved word in upper
case. Thus, these odd constructions are just illegal, and should not
appear in the source of a program.
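The distinction drawn in this Ramification can be made concrete with a short Python sketch. It compares letter-by-letter matching (used here as a stand-in for the reserved word rule) with conversion of the whole sequence (used for identifiers). RESERVED_WORDS is a hypothetical two-word subset, the function names are invented, and the sketch ignores the syntax rules and other_format elimination to stay short.

```python
# Hypothetical subset of the reserved words, in lower case.
RESERVED_WORDS = ("if", "access")


def is_reserved_word(text: str) -> bool:
    """Letter-by-letter match: each character is the reserved word's letter
    or that single letter in upper case."""
    return any(
        len(text) == len(word)
        and all(ch in (w, w.upper()) for ch, w in zip(text, word))
        for word in RESERVED_WORDS
    )


def upper_matches_reserved(text: str) -> bool:
    """Sequence-level match: the whole string is upper-cased, then compared."""
    return text.upper() in {word.upper() for word in RESERVED_WORDS}


def classify(text: str) -> str:
    if is_reserved_word(text):
        return "reserved word"
    if upper_matches_reserved(text):
        return "neither"  # not an identifier and not a reserved word
    return "identifier"
```

Under these assumptions, the two sequences from the text (written with LATIN SMALL LETTER DOTLESS I and LATIN SMALL LETTER SHARP S) are classified as "neither", while If is a reserved word and Count is an identifier.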
Implementation Permissions
6
In a nonstandard mode, an implementation may support
other upper/lower case equivalence rules for identifiers[,
to accommodate local conventions].
6.a/2
Discussion: {AI95-00285-01}
For instance, in most languages, the uppercase
equivalent of LATIN SMALL LETTER I (a lower case letter with a dot above)
is LATIN CAPITAL LETTER I (an upper case letter without a dot above).
In Turkish, though, LATIN SMALL LETTER I and LATIN SMALL LETTER DOTLESS
I are two distinct letters, so the upper case equivalent of LATIN SMALL
LETTER I is LATIN CAPITAL LETTER I WITH DOT ABOVE, and the upper case
equivalent of LATIN SMALL LETTER DOTLESS I is LATIN CAPITAL LETTER I.
Take for instance the following identifier (which is the name of a city
on the Tigris river in Eastern Anatolia):
6.b/2
diyarbakır -- The first i is dotted, the second isn't.
6.c/2
Locale-independent
conversion to upper case results in:
6.d/2
DIYARBAKIR -- Both Is are dotless.
6.e/2
This means that the following four sequences of characters represent the same
identifier, even though a speaker of Turkish would probably
consider them distinct words:
6.f/2
diyarbakir
diyarbakır
dıyarbakir
dıyarbakır
6.g/2
An
implementation targeting the Turkish market is allowed (in fact, expected)
to provide a nonstandard mode where case folding is appropriate for Turkish.
This would cause the original identifier to be converted to:
6.h/2
DİYARBAKIR -- The first I is dotted, the second isn't.
6.i/2
and the four sequences
of characters shown above would represent four distinct identifiers.
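The difference between the two modes can be illustrated with a small Python sketch. Python's str.upper is locale-independent, so it plays the role of the standard conversion. The Turkish-aware variant is purely hypothetical: it maps the dotted and dotless lower case letters explicitly before upper-casing, and it is only one way a nonstandard mode might behave. The names SPELLINGS, TURKISH_UPPER, standard_upper and turkish_upper are invented for the example.

```python
# The four spellings from the discussion above.
SPELLINGS = ["diyarbakir", "diyarbakır", "dıyarbakir", "dıyarbakır"]

# Hypothetical Turkish-aware mapping:
# dotted i -> LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130), dotless ı (U+0131) -> I.
TURKISH_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})


def standard_upper(identifier: str) -> str:
    """Locale-independent conversion to upper case."""
    return identifier.upper()


def turkish_upper(identifier: str) -> str:
    """Sketch of a nonstandard mode that keeps dotted and dotless i apart."""
    return identifier.translate(TURKISH_UPPER).upper()
```

Applying standard_upper to the four spellings yields a single result, so they denote one identifier. Applying turkish_upper yields four different results, matching the four distinct identifiers described for the nonstandard mode.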
6.j/2
Lithuanian and Azeri are
two other languages that present similar idiosyncrasies.
NOTES
6.1/2
3 {AI95-00285-01}
Identifiers differing only in the use of corresponding upper and lower case
letters are considered the same.
Examples
7
Examples of identifiers:
8/2
{AI95-00433-01}
Count X Get_Symbol Ethelyn Marion
Snobol_4 X1 Page_Count Store_Next_Item
Πλάτων -- Plato
Чайковский -- Tchaikovsky
θ φ -- Angles
Wording Changes from Ada 83
8.a
We no longer include reserved words as identifiers.
This is not a language change. In Ada 83, identifier
included reserved words. However, this complicated several other rules
(for example, those regarding implementation-defined attributes and pragmas).
We now explicitly allow certain reserved words for attribute designators,
to make up for the loss.
8.b
Ramification: Because syntax rules are
relevant to overload resolution, it means that if it looks like a reserved
word, it is not an identifier. As a side effect,
implementations cannot use reserved words as implementation-defined attributes
or pragma names.
Extensions to Ada 95
8.c/2
{AI95-00285-01}
{extensions to Ada 95} An
identifier can use any letter defined by ISO-10646:2003,
along with several other categories. This should ease programming in
languages other than English.