The text of a program consists of the texts of one or more compilations. The text of each compilation is a sequence of separate lexical elements. Each lexical element is either a delimiter, an identifier (which may be a reserved word), a numeric literal, a character literal, a string literal, or a comment. The effect of a program depends only on the particular sequences of lexical elements that form its compilations, excluding the comments, if any.
In some cases an explicit separator is required to separate adjacent lexical elements (namely, when without separation, interpretation as a single lexical element is possible). A separator is any of a space character, a format effector, or the end of a line. A space character is a separator except within a comment, a string literal, or a space character literal. Format effectors other than horizontal tabulation are always separators. Horizontal tabulation is a separator except within a comment.
The end of a line is always a separator. The language does not define what causes the end of a line. However if, for a given implementation, the end of a line is signified by one or more characters, then these characters must be format effectors other than horizontal tabulation. In any case, a sequence of one or more format effectors other than horizontal tabulation must cause at least one end of line.
One or more separators are allowed between any two adjacent lexical elements, before the first of each compilation, or after the last. At least one separator is required between an identifier or a numeric literal and an adjacent identifier or numeric literal.
A delimiter is either one of the following special characters (in the basic character set)
& ' ( ) * + , - . / : ; < = > |
or one of the following compound delimiters each composed of two adjacent special characters
=> .. ** := /= >= <= << >> <>
Each of the special characters listed for single character delimiters is a single delimiter except if this character is used as a character of a compound delimiter, or as a character of a comment, string literal, character literal, or numeric literal.
The remaining forms of lexical element are described in other sections of this chapter.
Notes:
Each lexical element must fit on one line, since the end of a line is a separator. The quotation, sharp, and underline characters, likewise two adjacent hyphens, are not delimiters, but may form part of other lexical elements.
The following names are used when referring to compound delimiters:
delimiter name => arrow .. double dot ** double star, exponentiate := assignment (pronounced: "becomes") /= inequality (pronounced: "not equal") >= greater than or equal <= less than or equal << left label bracket >> right label bracket <> box
References: character literal, comment, compilation, format effector, identifier, numeric literal, reserved word, space character, special character, string literal.
Rationale references: 2.1 Lexical Structure, 2.2 Textual Structure
Style Guide references: 2.1.1 Horizontal Spacing
Address any questions or comments to adainfo@sw-eng.falls-church.va.us.