Identifiers

{AI95-00395-01} {AI05-0091-1} An identifier shall not contain two consecutive characters in category punctuation_connector, or end with a character in that category. An identifier shall not be a reserved word.

Reason: This rule was stated in the syntax in Ada 95, but that has gotten too complex in Ada 2005. Since other_format characters usually do not display, we do not want to count them as separating two underscores.
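The connector rule can be sketched in a few lines of Python. This is only an illustration, not a full identifier parser: the function name violates_connector_rule is hypothetical, and the sketch assumes that category punctuation_connector corresponds to the Unicode general category "Pc" as reported by the standard unicodedata module.

```python
import unicodedata


def violates_connector_rule(identifier: str) -> bool:
    """Return True if the text has two adjacent punctuation_connector
    characters, or ends with one (Unicode general category "Pc")."""
    is_connector = [unicodedata.category(ch) == "Pc" for ch in identifier]
    # A trailing connector is rejected.
    if is_connector and is_connector[-1]:
        return True
    # Any two neighbouring connectors are rejected.
    return any(a and b for a, b in zip(is_connector, is_connector[1:]))


for name in ["Page_Count", "Page__Count", "Count_"]:
    print(name, violates_connector_rule(name))
```

The check looks only at this one rule; it does not test the first character, the permitted character categories, or reserved words.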

Static Semantics

{AI95-00285-01} {AI05-0091-1} {AI05-0227-1} Two identifiers are considered the same if they consist of the same sequence of characters after applying locale-independent simple case folding, as defined by documents referenced in the note in section 1 of ISO/IEC 10646:2003. All characters of an identifier are significant, including any underline character. Identifiers differing only in the use of corresponding upper and lower case letters are considered the same.

Discussion: {AI05-0227-1} Simple case folding is a mapping to lower case, so this is matching the defining (lower case) version of a reserved word. We could have mentioned case folding of the reserved words, but as that is an identity function, it would have no effect. Two of the letters of ISO 8859-1 appear only as lower case, "sharp s" and "y with diaeresis." These two letters have no corresponding upper case letter (in particular, they are not considered equivalent to one another).

{AI05-0227-1} The “documents referenced” means Unicode. Note that simple case folding is supposed to be compatible between Unicode versions, so the Unicode version used doesn't matter.
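As a rough illustration of the equivalence rule, the following Python sketch compares two identifiers after folding. Python's standard library offers only full case folding (str.casefold), so the hypothetical helper approx_simple_fold approximates simple case folding by leaving unchanged any character whose full fold is longer than one character. That is an assumption of this sketch: it misses the few characters that have a separate single-character simple mapping, so it should not be taken as a conforming implementation.

```python
def approx_simple_fold(text: str) -> str:
    """Approximate simple case folding, one character at a time.

    Assumption: a character whose full fold (str.casefold) is a single
    character folds the same way under simple case folding; any other
    character is left unchanged.
    """
    folded = []
    for ch in text:
        candidate = ch.casefold()
        folded.append(candidate if len(candidate) == 1 else ch)
    return "".join(folded)


def same_identifier(left: str, right: str) -> bool:
    """Two identifiers are the same if their folded forms are equal."""
    return approx_simple_fold(left) == approx_simple_fold(right)


print(same_identifier("Page_Count", "PAGE_COUNT"))  # differ only in case
print(same_identifier("Page_Count", "PageCount"))   # underline is significant
```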

Implementation Note: We match the reserved words after applying case folding so that the rules for identifiers and reserved words are the same. (This allows other_format characters, which usually don't display, in a reserved word without changing it to an identifier.) Since a compiler usually will lexically process identifiers and reserved words the same way (often with the same code), this will prevent a lot of headaches.

Ramification: {AI05-0227-1} The rules for reserved words differ in one way: they define case conversion on letters rather than sequences. This means that it is possible that there exist some unusual sequences that are neither identifiers nor reserved words. We are not aware of any such sequences so long as we use simple case folding (as opposed to full case folding), but we have defined the rules in case any are introduced in future character set standards. This originally was a problem when converting to upper case: for instance, “ıf” and “acceß” have upper case conversions of “IF” and “ACCESS” respectively. We would not want these to be treated as reserved words. Under the upper case rule, these were not identifiers, because the transformed values were identical to a reserved word; but they were not reserved words either, because the original values did not match any reserved word as defined or with any number of characters of the reserved word in upper case. Thus, these odd constructions were just illegal and should not have appeared in the source of a program. Neither of these cases exists when using simple case folding.
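The two sequences mentioned above can be tried out in Python, whose str.upper and str.casefold apply locale-independent Unicode mappings. Note that str.casefold performs full case folding, not simple case folding; simple case folding has no built-in equivalent in the Python standard library, and it leaves the sharp s unchanged. The variable names are illustrative only.

```python
# "ıf" starts with LATIN SMALL LETTER DOTLESS I; "acceß" ends with sharp s.
samples = ["\u0131f", "acce\u00df"]

# Conversion to upper case yields the spellings of reserved words.
upper_forms = [s.upper() for s in samples]

# Full case folding expands the sharp s to "ss", so the second sample
# collides with "access"; the dotless i is left unchanged.
full_folds = [s.casefold() for s in samples]

print(upper_forms)
print(full_folds)
```

This shows why both upper case conversion and full case folding can produce collisions with reserved words that simple case folding avoids.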

Implementation Permissions

Discussion: {AI95-00285-01} {AI05-0227-1} For instance, in most languages, the simple case folded equivalent of LATIN CAPITAL LETTER I (an upper case letter without a dot above) is LATIN SMALL LETTER I (a lower case letter with a dot above). In Turkish, though, LATIN CAPITAL LETTER I and LATIN CAPITAL LETTER I WITH DOT ABOVE are two distinct letters, so the case folded equivalent of LATIN CAPITAL LETTER I is LATIN SMALL LETTER DOTLESS I, and the case folded equivalent of LATIN CAPITAL LETTER I WITH DOT ABOVE is LATIN SMALL LETTER I. Take for instance the following identifier (which is the name of a city on the Tigris river in Eastern Anatolia):

A Turkish reader would expect that the above identifier is equivalent to:

However, locale-independent simple case folding (and thus Ada) maps this to:

which is different from any of the following identifiers:

including the “correct” matching identifier for Turkish. Upper case conversion (used in '[Wide_]Wide_Image) introduces additional problems.

An implementation targeting the Turkish market is allowed (in fact, expected) to provide a nonstandard mode where case folding is appropriate for Turkish. This would cause the original identifier to be converted to:

and the four sequences of characters shown above would represent four distinct identifiers.

Lithuanian and Azeri are two other languages that present similar idiosyncrasies.
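The difference between the locale-independent rule and a Turkish nonstandard mode can be sketched as follows. Everything named here is an assumption made for illustration: the helpers default_fold and turkish_fold are hypothetical, default_fold only approximates simple case folding (it keeps any character whose full fold in Python is longer than one character), and the sample NAME is the upper case spelling of the Turkish city Diyarbakır, chosen for this sketch.

```python
def default_fold(text: str) -> str:
    """Locale-independent fold: an approximation of simple case folding
    that keeps characters whose full fold is more than one character."""
    return "".join(
        ch.casefold() if len(ch.casefold()) == 1 else ch for ch in text
    )


# Turkic special cases: capital I folds to dotless i (U+0131), and
# capital I with dot above (U+0130) folds to plain i.
TURKIC = str.maketrans({"I": "\u0131", "\u0130": "i"})


def turkish_fold(text: str) -> str:
    """Fold as a Turkish nonstandard mode might (illustrative only)."""
    return default_fold(text.translate(TURKIC))


NAME = "D\u0130YARBAKIR"  # first I is dotted, second is not

# The four lower case spellings that differ only in dotted/dotless i.
VARIANTS = [
    "diyarbakir",
    "diyarbak\u0131r",
    "d\u0131yarbakir",
    "d\u0131yarbak\u0131r",
]

print(default_fold(NAME) in VARIANTS)   # locale-independent result
print(turkish_fold(NAME) in VARIANTS)   # Turkish-mode result
```

In this sketch the locale-independent fold keeps the capital I with dot above, so its result matches none of the four spellings, while the Turkish fold yields the spelling a Turkish reader would expect.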

Examples

Wording Changes from Ada 83

We no longer include reserved words as identifiers. This is not a language change. In Ada 83, identifier included reserved words. However, this complicated several other rules (for example, those regarding implementation-defined attributes and pragmas). We now explicitly allow certain reserved words for attribute designators, to make up for the loss.

Ramification: Because syntax rules are relevant to overload resolution, it means that if it looks like a reserved word, it is not an identifier. As a side effect, implementations cannot use reserved words as implementation-defined attributes or pragma names.

Extensions to Ada 95

{AI95-00285-01} An identifier can use any letter defined by ISO-10646:2003, along with several other categories. This should ease programming in languages other than English.

Incompatibilities With Ada 2005

{AI05-0091-1} Correction: other_format characters were removed from identifiers as the Unicode recommendations have changed. This change can only affect programs written for the original Ada 2005, so there should be few such programs.

{AI05-0227-1} Correction: We now specify simple case folding rather than full case folding. That potentially could change identifier equivalence, although it is more likely that identifiers that are considered the same in original Ada 2005 will now be considered different. This change was made because the original Ada 2005 definition was incompatible (and even inconsistent in unusual cases) with the Ada 95 identifier equivalence rules. As such, the Ada 2005 rules were rarely fully implemented, and in any case, only Ada 2005 identifiers containing wide characters could be affected.