1

In the strict mode, the predefined operations of
a floating point type shall satisfy the accuracy requirements specified
here and shall avoid or signal overflow in the situations described.
This behavior is presented in terms of a model of floating point arithmetic
that builds on the concept of the canonical form (see A.5.3).

2

Associated with each floating point type is an infinite
set of model numbers. The model numbers of a type are used to define
the accuracy requirements that have to be satisfied by certain predefined
operations of the type; through certain attributes of the model numbers,
they are also used to explain the meaning of a user-declared floating
point type declaration. The model numbers of a derived type are those
of the parent type; the model numbers of a subtype are those of its type.

3

The *model numbers* of a
floating point type T are zero and all the values expressible in the
canonical form (for the type T), in which *mantissa* has T'Model_Mantissa
digits and *exponent* has a value greater than or equal to T'Model_Emin.
(These attributes are defined in G.2.2.)
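The definition above can be sketched as a membership test over exact rationals. This is only an illustration, not part of the model: the function names and the toy attribute values (radix 2, three mantissa digits, minimum exponent -2) are assumptions chosen for readability, and the canonical form is taken to have a normalized mantissa in the range 1/radix up to, but not including, 1.

```python
from fractions import Fraction


def canonical_exponent(v, radix):
    """Return e such that radix**(e-1) <= |v| < radix**e (v must be nonzero)."""
    a = abs(Fraction(v))
    e = 0
    while a >= Fraction(radix) ** e:
        e += 1
    while a < Fraction(radix) ** (e - 1):
        e -= 1
    return e


def is_model_number(v, radix, model_mantissa, model_emin):
    """Check whether v is zero or fits the canonical form with
    model_mantissa digits and an exponent of at least model_emin."""
    v = Fraction(v)
    if v == 0:
        return True
    e = canonical_exponent(v, radix)
    if e < model_emin:
        return False  # too small in magnitude for a normalized mantissa
    # The mantissa |v| / radix**e must be a whole number of
    # radix**(-model_mantissa) units.
    scaled = abs(v) / Fraction(radix) ** e * Fraction(radix) ** model_mantissa
    return scaled.denominator == 1


# Hypothetical toy type: radix 2, 3 mantissa digits, minimum exponent -2.
TOY = dict(radix=2, model_mantissa=3, model_emin=-2)
print(is_model_number(Fraction(5, 4), **TOY))
print(is_model_number(Fraction(9, 8), **TOY))
```

Note that the test places no upper limit on the exponent, which is why the set of model numbers is infinite.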

3.a

3.b

In hardware that is free of arithmetic anomalies,
T'Model_Mantissa, T'Model_Emin, T'Safe_First, and T'Safe_Last will yield
the same values as T'Machine_Mantissa, T'Machine_Emin, T'Base'First,
and T'Base'Last, respectively, and the model numbers in the safe range
of the type T will coincide with the machine numbers of the type T. In
less perfect hardware, it is not possible for the model-oriented attributes
to have these optimal values, since the hardware, by definition, and
therefore the implementation, cannot conform to the stringencies of the
resulting model; in this case, the values yielded by the model-oriented
attributes have to be made more conservative (i.e., have to be penalized),
with the result that the model numbers are more widely separated than
the machine numbers, and the safe range is a subrange of the base range.
The implementation will then be able to conform to the requirements of
the weaker model defined by the sparser set of model numbers and the
smaller safe range.

4

A *model interval* of a
floating point type is any interval whose bounds are model numbers of
the type. The *model interval* of a type T *associated
with a value* *v* is the smallest model interval of T that includes
*v*. (The model interval associated with a model number of a type
consists of that number only.)
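As an illustration of the model interval associated with a value, the following sketch rounds a value down and up to the neighboring model numbers. The function name and the toy attribute values are assumptions; values smaller in magnitude than the smallest positive model number are taken to lie between zero and that number.

```python
from fractions import Fraction
from math import floor, ceil


def model_interval(v, radix, model_mantissa, model_emin):
    """Return (low, high), the smallest model interval that includes v."""
    v = Fraction(v)
    if v == 0:
        return (Fraction(0), Fraction(0))
    smallest = Fraction(radix) ** (model_emin - 1)  # smallest positive model number
    if abs(v) < smallest:
        return (Fraction(0), smallest) if v > 0 else (-smallest, Fraction(0))
    # Find e with radix**(e-1) <= |v| < radix**e.
    e = model_emin
    while abs(v) >= Fraction(radix) ** e:
        e += 1
    # Model numbers with exponent e are spaced radix**(e - model_mantissa) apart.
    spacing = Fraction(radix) ** (e - model_mantissa)
    return (floor(v / spacing) * spacing, ceil(v / spacing) * spacing)


# Hypothetical toy type: radix 2, 3 mantissa digits, minimum exponent -2.
TOY = dict(radix=2, model_mantissa=3, model_emin=-2)
print(model_interval(Fraction(9, 8), **TOY))   # bounds 1 and 5/4
print(model_interval(Fraction(5, 4), **TOY))   # a model number: a single point
```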

5

The accuracy requirements for the evaluation of certain
predefined operations of floating point types are as follows.

5.a

6

An *operand interval* is
the model interval, of the type specified for the operand of an operation,
associated with the value of the operand.

7

For any predefined
arithmetic operation that yields a result of a floating point type T,
the required bounds on the result are given by a model interval of T
(called the *result interval*) defined in terms of the operand values
as follows:

8

The result interval is the
smallest model interval of T that includes the minimum and the maximum
of all the values obtained by applying the (exact) mathematical operation
to values arbitrarily selected from the respective operand intervals.
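The rule above can be sketched for the basic arithmetic operations as follows. The sketch assumes that the extreme exact results occur at the bounds of the operand intervals, which holds for addition, subtraction, and multiplication, and for division when the divisor interval excludes zero. All names and the toy attribute values are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import floor, ceil
import operator


def model_interval(v, radix, model_mantissa, model_emin):
    """Return the smallest model interval (low, high) that includes v."""
    v = Fraction(v)
    if v == 0:
        return (Fraction(0), Fraction(0))
    smallest = Fraction(radix) ** (model_emin - 1)
    if abs(v) < smallest:
        return (Fraction(0), smallest) if v > 0 else (-smallest, Fraction(0))
    e = model_emin
    while abs(v) >= Fraction(radix) ** e:
        e += 1
    spacing = Fraction(radix) ** (e - model_mantissa)
    return (floor(v / spacing) * spacing, ceil(v / spacing) * spacing)


def result_interval(op, operand_values, **attrs):
    """Smallest model interval containing every exact result obtainable
    from values chosen in the operand intervals (corner evaluation)."""
    operand_intervals = [model_interval(x, **attrs) for x in operand_values]
    corners = [op(*xs) for xs in product(*operand_intervals)]
    low, _ = model_interval(min(corners), **attrs)
    _, high = model_interval(max(corners), **attrs)
    return (low, high)


# Hypothetical toy type: radix 2, 3 mantissa digits, minimum exponent -2.
TOY = dict(radix=2, model_mantissa=3, model_emin=-2)
# 9/8 is not a model number, so its operand interval has bounds 1 and 5/4.
print(result_interval(operator.mul, [Fraction(9, 8), Fraction(3, 2)], **TOY))
```

In this toy case the exact products range from 3/2 to 15/8, so the result interval has bounds 3/2 and 2.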

9

The result interval of an exponentiation is obtained
by applying the above rule to the sequence of multiplications defined
by the exponent, assuming arbitrary association of the factors, and to
the final division in the case of a negative exponent.

10

The result interval of a conversion of a numeric
value to a floating point type T is the model interval of T associated
with the operand value, except when the source expression is of a fixed
point type with a *small* that is not a power of T'Machine_Radix
or is a fixed point multiplication or division either of whose operands
has a *small* that is not a power of T'Machine_Radix; in these cases,
the result interval is implementation defined.

10.a

11

For any of
the foregoing operations, the implementation shall deliver a value that
belongs to the result interval when both bounds of the result interval
are in the safe range of the result type T, as determined by the values
of T'Safe_First and T'Safe_Last; otherwise,

12

if T'Machine_Overflows is True,
the implementation shall either deliver a value that belongs to the result
interval or raise Constraint_Error;

13

if T'Machine_Overflows is False, the result is
implementation defined.
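The three cases above can be summarized as a small decision function. This is a sketch only: the function name and the returned strings are illustrative, and the arguments stand in for the result interval and for the values of T'Safe_First, T'Safe_Last, and T'Machine_Overflows.

```python
from fractions import Fraction


def delivery_rule(result_interval, safe_first, safe_last, machine_overflows):
    """Describe what the implementation is required to deliver."""
    low, high = result_interval
    if safe_first <= low and high <= safe_last:
        # Both bounds of the result interval are in the safe range.
        return "value in result interval"
    if machine_overflows:
        return "value in result interval or Constraint_Error"
    return "implementation defined"


# Hypothetical safe range of -4 .. 4.
print(delivery_rule((Fraction(3, 2), Fraction(2)), -4, 4, True))
print(delivery_rule((Fraction(7, 2), Fraction(5)), -4, 4, True))
print(delivery_rule((Fraction(7, 2), Fraction(5)), -4, 4, False))
```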

13.a

14

For any predefined relation on operands of a floating
point type T, the implementation may deliver any value (i.e., either
True or False) obtained by applying the (exact) mathematical comparison
to values arbitrarily chosen from the respective operand intervals.
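For the relation "<", the set of permitted outcomes can be sketched directly from the operand intervals. The function name is hypothetical, and each interval is given as a pair of bounds.

```python
from fractions import Fraction


def possible_less_than(x_interval, y_interval):
    """Return the set of Boolean results permitted for X < Y."""
    x_low, x_high = x_interval
    y_low, y_high = y_interval
    outcomes = set()
    if x_low < y_high:
        outcomes.add(True)    # some choice of values satisfies x < y
    if x_high >= y_low:
        outcomes.add(False)   # some choice of values satisfies x >= y
    return outcomes


# The operand intervals share the bound 5/4, so either result is permitted.
print(possible_less_than((Fraction(1), Fraction(5, 4)),
                         (Fraction(5, 4), Fraction(3, 2))))
```

When both operands are model numbers, each interval is a single point and exactly one outcome remains, so the comparison is exact.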

15

The result of a membership test is defined in terms
of comparisons of the operand value with the lower and upper bounds of
the given range or type mark (the usual rules apply to these comparisons).

16

If the underlying floating point hardware implements
division as multiplication by a reciprocal, the result interval for division
(and exponentiation by a negative exponent) is implementation defined.

16.a

16.b

The Ada 95 model numbers of a floating point
type that are in the safe range of the type are comparable to the Ada
83 safe numbers of the type. There is no analog of the Ada 83 model numbers.
The Ada 95 model numbers, when not restricted to the safe range, are
an infinite set.

16.c

Giving the model numbers
the hardware radix, instead of always a radix of two, allows (in conjunction
with other changes) some borderline declared types to be represented
with less precision than in Ada 83 (i.e., with single precision, whereas
Ada 83 would have used double precision). Because the lower precision
satisfies the requirements of the model (and did so in Ada 83 as well),
this change is viewed as a desirable correction of an anomaly, rather
than a worrisome inconsistency. (Of course, the wider representation
chosen in Ada 83 also remains eligible for selection in Ada 95.)

16.d

As an example of this phenomenon, assume that
Float is represented in single precision and that a double precision
type is also available. Also assume hexadecimal hardware with clean properties,
for example certain IBM hardware. Then,

16.e

16.f

results in T being represented in double precision
in Ada 83 and in single precision in Ada 95. The latter is intuitively
correct; the former is counterintuitive. The reason why the double precision
type is used in Ada 83 is that Float has model and safe numbers (in Ada
83) with 21 binary digits in their mantissas, as is required to model
the hypothesized hexadecimal hardware using a binary radix; thus Float'Last,
which is not a model number, is slightly outside the range of safe numbers
of the single precision type, making that type ineligible for selection
as the representation of T even though it provides adequate precision.
In Ada 95, Float'Last (the same value as before) is a model number and
is in the safe range of Float on the hypothesized hardware, making Float
eligible for the representation of T.
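The claim that Float'Last is a model number under a hexadecimal radix but not under a 21-digit binary radix can be checked with exact arithmetic. The sketch below assumes a single precision format with six hexadecimal mantissa digits and a maximum exponent of 63, so that Float'Last is (1 - 16**(-6)) * 16**63; these figures, the minimum exponents, and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def is_model_number(v, radix, model_mantissa, model_emin):
    """Check whether v is zero or fits the canonical form with
    model_mantissa digits and an exponent of at least model_emin."""
    v = Fraction(v)
    if v == 0:
        return True
    # Find e with radix**(e-1) <= |v| < radix**e.
    e = 0
    while abs(v) >= Fraction(radix) ** e:
        e += 1
    while abs(v) < Fraction(radix) ** (e - 1):
        e -= 1
    if e < model_emin:
        return False
    scaled = abs(v) / Fraction(radix) ** e * Fraction(radix) ** model_mantissa
    return scaled.denominator == 1


# Assumed largest single precision value on the hypothesized hardware.
float_last = (1 - Fraction(1, 16 ** 6)) * Fraction(16) ** 63

# Radix 16 with 6 digits: Float'Last fits the canonical form.
print(is_model_number(float_last, radix=16, model_mantissa=6, model_emin=-64))
# Radix 2 with 21 digits: Float'Last needs 24 binary digits.
print(is_model_number(float_last, radix=2, model_mantissa=21, model_emin=-84))
```

Under the binary model the mantissa of this value is 1 - 2**(-24), which needs 24 binary digits and therefore cannot be expressed in 21.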

16.g

Giving the model numbers
the hardware radix allows for practical implementations on decimal hardware.

16.h

The wording of the model of floating point arithmetic
has been simplified to a large extent.

Ada 2005 and 2012 Editions sponsored in part by **Ada-Europe**