ADA Parser

The P-Code

P-Code (or pseudo code) is an intermediate code used by many compilers that convert a text file into a binary executable file. It is easy to generate and includes all the instructions the language needs. Some versions include quite specialized instructions tied either to the language or to the processor being used. For instance, if your processor can copy one memory buffer to another with a single instruction, it can be a good idea to have a similar instruction in your P-Code to ease the conversion to assembly. The P-Code also needs to be generic so that it stays simple. Most of the optimizations are performed by an optimizer that reads the P-Code and writes a new, optimized version of it. A later process then transforms the P-Code into a list of assembly instructions.

Here is the list of all the instructions we will use for our ADA compiler. Most of them are self-explanatory. We define higher level instructions (such as copy and bounds) as well as lower level ones (such as add and subtract). The P-Code files are always intermediate files. They are written as text to allow easy debugging of the different compiler parts (parser, optimizer and assembly code generator). Though we could use mnemonics, it is just as easy today to use full words, which avoids having to learn yet another language to understand the P-Code.

With ADA, you can remove some of the code that the compiler generates. This is done using the pragma

The instructions are defined with a list of parameters. Comments explain the meaning of the instructions and their parameters. The following table is a reference of all the instructions; when an instruction is complex, it is linked to more information.

The language also supports labels in order to allow the program flow to change.
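As a rough illustration of how a textual P-Code file could be read back by the optimizer or the assembly code generator, here is a minimal Python sketch. The `name:` label syntax and the `parse_pcode` and `Instruction` names are assumptions made for this example only; the text above does not define them. The `--` comment marker follows the examples further down.

```python
from typing import NamedTuple, Tuple


class Instruction(NamedTuple):
    opcode: str
    operands: Tuple[str, ...]


def parse_pcode(text):
    """Return (instructions, labels); labels maps a name to an instruction index."""
    instructions = []
    labels = {}
    for raw_line in text.splitlines():
        line = raw_line.split("--", 1)[0].strip()  # drop "--" comments
        if not line:
            continue
        if line.endswith(":"):  # assumed label syntax: "name:"
            labels[line[:-1]] = len(instructions)
            continue
        opcode, _, rest = line.partition(" ")
        if rest.strip():
            operands = tuple(part.strip() for part in rest.split(","))
        else:
            operands = ()
        instructions.append(Instruction(opcode, operands))
    return instructions, labels


SAMPLE = """\
-- call func(v1, v2) and test for an exception
push v2
push v1
push _task
call func
if_jump _task.exception,exception
jump done
exception:
move (_task.exception),(_stack.exception_copy),24
done:
"""

program, labels = parse_pcode(SAMPLE)
for instruction in program:
    print(instruction.opcode, instruction.operands)
```

Keeping each label as an index into the instruction list lets a later pass resolve jump, if_jump and ifnot_jump targets without rescanning the text.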

Instruction
Parameters
Comments
add
source1,source2,destination
Add the integer or floating point numbers source1 to source2 and save the result in destination.

PROBLEMS:
(a) We need to somehow know the size of the integer or floating point values.
(b) We must have an overflow check. We could possibly use the processor overflow handler, but in a multi-threaded environment that is certainly not easy to handle. Testing for integer overflow is easy even when no flags are available, and most floating point units set flags in case of floating point overflow, underflow or other errors. So the question is: should this check be part of the instruction, knowing that the actual constraint exception may have been cancelled?

SOLUTIONS:
(a) We can extend the instruction set with a dot size (such as .b, .w, .l, .q for integers and .f, .d for floating points), but we also need to support many other formats such as fixed point numbers and large integers. NOTE: large integers may actually be handled as objects and thus not require special handling at this level.
(b) Ha! Again, we could extend the instruction set with a dot check (such as .c). If that dot check is set, we need to do the overflow check; otherwise we ignore any overflow. Note that the instruction should also generate the jump to the exception handler since it would already include a conditional jump.

NOTES:
(a) All the numbers (source1, source2 and destination) are always of the same type, so this instruction never needs to handle any kind of casting (that is done elsewhere).
call
function
Save the current program pointer and then change the program flow to the specified function name. When the called function ends, the program flow comes back to the saved program pointer + 1 (i.e. after the call).

Note that the way this works can be completely customized since we have total control of the generated code. For instance, the return address could be saved in a structure and the call could be a jump (possibly a conditional one, so that execution continues at a different place depending on whether the jump is taken). This means we could avoid using the processor stack 99% of the time.

IMPORTANT: note that according to [6.4(10)] the evaluation order of the parameters of a function call is arbitrary, meaning that the same call could behave differently between compilations or between compilers if any of the parameters calls a function with side effects. Our goal here should be to force the order for our own compiler with the help of pragmas, and also to generate an error if any of the functions among the list of parameters has an out or in out parameter.



if_jump
condition,label
Change the program flow to the specified label if the variable condition is true. The condition variable must be of type boolean. Note that any conditional is automatically transformed into an if_jump instruction. Thus, a while, until, for, case or if statement in ADA produces at least one if_jump instruction.
ifnot_jump
condition,label
This is the same as the if_jump, but the condition is inverted first (i.e. the jump is taken if the condition is false).
jump
label
Change the program flow to the specified label.
move
source,destination,size
Copy the value of source into destination. The value is composed of size bytes. The source can be a constant. When the source is a pointer and the content of a field at that pointer is needed, the operand is written in parentheses. An offset can also be specified, as in:

move (_task.exception),(_stack.exception_copy),24
push
constant | variable
Push the specified constant or variable on the stack.
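
The dot suffixes suggested for add in the table (a size such as .l and an optional .c overflow check) can be sketched in Python as follows. The suffix widths (.b = 8, .w = 16, .l = 32, .q = 64 bits), the signed two's complement ranges, the wrap-around behaviour when the check is off, and the `decode` and `add_integer` helpers are all assumptions made for illustration; the table leaves these points open.

```python
INTEGER_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}  # assumed widths


def decode(mnemonic):
    """Split e.g. 'add.l.c' into (opcode, size suffix, overflow check flag)."""
    opcode, *suffixes = mnemonic.split(".")
    checked = "c" in suffixes
    sizes = [suffix for suffix in suffixes if suffix != "c"]
    if len(sizes) != 1:
        raise ValueError(f"expected exactly one size suffix in {mnemonic!r}")
    return opcode, sizes[0], checked


def add_integer(mnemonic, source1, source2):
    """Return (result, overflowed) for a signed integer add."""
    opcode, size, checked = decode(mnemonic)
    if opcode != "add" or size not in INTEGER_BITS:
        raise ValueError(f"not an integer add: {mnemonic!r}")
    bits = INTEGER_BITS[size]
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    exact = source1 + source2
    if low <= exact <= high:
        return exact, False
    if checked:
        # With .c the generated code would jump to the exception handler here.
        return None, True
    # Without .c the overflow is ignored; wrapping is an assumption.
    wrapped = (exact - low) % (1 << bits) + low
    return wrapped, False


print(add_integer("add.b.c", 100, 27))
print(add_integer("add.b.c", 100, 28))
print(add_integer("add.b", 100, 28))
```

Returning the overflow flag next to the result mirrors the idea that the checked form of the instruction already contains the conditional jump to the exception handler.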


Examples:

Making a function call means using the call instruction after you have pushed the function parameters on the stack. Note that the values are pushed in reverse order and that we always push the _task variable, which is used as the global context of the currently running thread. This task variable will include the latest exception generated by the called function.

push v2
push v1
push _task
call func
if_jump _task.exception,exception
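
The sequence above can be modelled with a small Python simulation of the stack. This is only a sketch: the `Task` class, the `call` helper and the behaviour of `func` (failing when v2 is zero) are hypothetical and stand in for whatever the real runtime would provide.

```python
class Task:
    """Stands in for the _task context; holds the latest exception, if any."""

    def __init__(self):
        self.exception = None


def call(stack, function, parameter_count):
    """Pop the parameters (last pushed = first parameter) and run the function."""
    arguments = [stack.pop() for _ in range(parameter_count)]
    function(*arguments)


def func(task, v1, v2):
    # Hypothetical callee: it reports a failure through the task context.
    if v2 == 0:
        task.exception = "division by zero"


def run(v1, v2):
    task = Task()
    stack = []
    stack.append(v2)       # push v2
    stack.append(v1)       # push v1
    stack.append(task)     # push _task
    call(stack, func, 3)   # call func
    if task.exception:     # if_jump _task.exception,exception
        return "exception"
    return "continue"


print(run(6, 3))
print(run(6, 0))
```

Because the parameters are pushed in reverse order, the callee pops them in declaration order, with the task context first.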

Whenever necessary, values will be checked against the range in which they were defined. This is done by calling the "<" and ">" functions on the type of the range. Whenever the "<" and ">" are internally known by the compiler, then an inline simplification can be used (such as comparing two integers).

-- Lower Bound check (pushed in reverse order: tests value < lower_bound)
push lower_bound
push value
push _tmp_cond
push _task
call "<"
if_jump _task.exception,exception
if_jump _tmp_cond,exception

-- Higher Bound check (pushed in reverse order: tests higher_bound < value)
push value
push higher_bound
push _tmp_cond
push _task
call "<"
if_jump _task.exception,exception
if_jump _tmp_cond,exception
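
The intent of the two checks, expressed with nothing but the "<" function of the type, can be written as a short Python sketch. The `out_of_range` helper and its `less_than` parameter are hypothetical names used only for this illustration; for integers the comparison would be inlined as described above.

```python
import operator


def out_of_range(value, lower_bound, higher_bound, less_than):
    """Return True when value lies outside [lower_bound, higher_bound]."""
    if less_than(value, lower_bound):    # lower bound check
        return True
    if less_than(higher_bound, value):   # higher bound check, same "<"
        return True
    return False


# Integers: the compiler knows "<" internally, so operator.lt plays that role.
print(out_of_range(5, 1, 10, operator.lt))
print(out_of_range(0, 1, 10, operator.lt))
print(out_of_range(11, 1, 10, operator.lt))
```

Both bounds are inclusive: a value equal to either bound passes, because neither "<" test is true in that case.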