Phases: Language processor (Toy compiler)
Lexical Analysis (Scanning)
- Lexical analysis identifies the lexical units in a source statement and classifies them into lexical classes, e.g. identifiers, constants, keywords, etc. It then enters them into different tables.
- The most important table is the symbol table, which contains information about all identifiers used in the source program (SP).
- The symbol table is built during lexical analysis.
- Lexical analysis builds a descriptor, called a token, for each lexical unit. We represent a token as Code#no, where Code can be Id or Op for identifier or operator respectively, and no indicates the entry for the identifier or operator in the symbol table or operator table.
- Consider the following code
i: integer;
a, b: real;
a = b + i;
- With i, a and b in entries 1, 2 and 3 of the symbol table, the statement a = b + i is represented as the string of tokens
Id#2 Op#1 Id#3 Op#2 Id#1
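The scanning step above can be sketched in Python. This is only an illustration, not the notes' own algorithm: the function name scan, the regular expression, and the use of plain lists as the symbol and operator tables are all assumptions. It also assumes the symbol table already holds i, a and b in that order, which is the numbering implied by the intermediate code later in these notes.

```python
import re

# One lexical unit: an identifier or a single-character operator (assumed token set).
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[=+\-*/])")

def scan(statement, symbol_table, operator_table):
    """Return the statement as a list of 'Code#no' tokens.

    Names and operators not yet in their table are appended to it,
    so entry numbers follow the order of first appearance.
    """
    text = statement.strip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognised input at position {pos}")
        lexeme = match.group(1)
        if lexeme[0].isalpha() or lexeme[0] == "_":
            table, code = symbol_table, "Id"
        else:
            table, code = operator_table, "Op"
        if lexeme not in table:
            table.append(lexeme)
        tokens.append(f"{code}#{table.index(lexeme) + 1}")  # entries are 1-based
        pos = match.end()
    return tokens

symbols = ["i", "a", "b"]   # assumed to be entered while scanning the declarations
operators = []
tokens = scan("a = b + i", symbols, operators)
print(" ".join(tokens))     # Id#2 Op#1 Id#3 Op#2 Id#1
```

The entry numbers depend only on the order in which names are first seen, so starting from an empty symbol table would number a, b and i as 1, 2 and 3 instead.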
Syntax analysis (parsing)
- Syntax analysis processes the string of tokens to determine its grammatical structure and builds an intermediate code that represents the structure.
- A tree structure is used to represent the intermediate code.
- The statement a = b + i can be represented in tree form with = at the root, a as its left child, and + as its right child, whose children are b and i.
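As a rough sketch of how a parser might build such a tree, the Python function below handles only statements of the form name = name + name + ... and returns the tree as nested tuples. The function name parse_assignment and the tuple representation (operator, left, right) are assumptions made for this example; a real parser works from a full grammar.

```python
def parse_assignment(tokens):
    """Parse  target = operand (+ operand)*  into a nested-tuple tree.

    Each interior node is (operator, left, right); leaves are names.
    """
    if len(tokens) < 3 or tokens[1] != "=":
        raise SyntaxError("expected: name = expression")
    target, rest = tokens[0], tokens[2:]
    tree = rest[0]
    i = 1
    while i < len(rest):
        if rest[i] != "+" or i + 1 >= len(rest):
            raise SyntaxError("expected: + operand")
        tree = ("+", tree, rest[i + 1])   # '+' groups from left to right
        i += 2
    return ("=", target, tree)

tree = parse_assignment(["a", "=", "b", "+", "i"])
print(tree)   # ('=', 'a', ('+', 'b', 'i'))
```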
Semantic analysis
- Semantic analysis determines the meaning of a statement by applying the semantic rules to the structure of the statement.
- While processing a declaration statement, it adds information concerning the type, length, and dimensionality of a symbol to the symbol table.
- While processing an imperative statement, it determines the sequence of actions that would have to be performed to implement the meaning of the statement and represents them in the intermediate code.
- Consider the tree structure for the statement a = b + i.
- If a node is an operand, the type of the operand is added to the description field of the operand.
- While evaluating the expression, the type of b is real and the type of i is int, so i is converted to real, giving i*.
- The analysis ends when the tree has been completely processed.
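The type processing described above can be sketched as a walk over the tree. This is a minimal illustration that assumes only two types, int and real, a nested-tuple tree of the form (operator, left, right), and a hypothetical to_real node that marks the conversion producing i*. The function name annotate and the types dictionary are likewise assumptions.

```python
def annotate(node, types):
    """Return (typed_tree, result_type) for a nested-tuple tree.

    Where an int meets a real, the int operand is wrapped in a
    ("to_real", operand) node, mirroring the conversion of i to i*.
    """
    if isinstance(node, str):             # operand: look up its declared type
        return node, types[node]
    op, left, right = node
    left, left_type = annotate(left, types)
    right, right_type = annotate(right, types)
    if left_type != right_type:
        if op == "=" and right_type != "int":
            raise TypeError("cannot store a real value in an int variable")
        if left_type == "int":            # promote whichever side is int
            left = ("to_real", left)
        else:
            right = ("to_real", right)
        left_type = "real"
    return (op, left, right), left_type

types = {"i": "int", "a": "real", "b": "real"}
typed_tree, result_type = annotate(("=", "a", ("+", "b", "i")), types)
print(typed_tree)   # ('=', 'a', ('+', 'b', ('to_real', 'i')))
```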
Intermediate representation
- The intermediate representation (IR) contains the intermediate code and the symbol table.
- Symbol table
- Intermediate code
1. Convert (Id#1) to real, giving (Id#4)
2. Add (Id#4) to (Id#3), giving (Id#5)
3. Store (Id#5) in (Id#2)
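One way to arrive at these three steps is to walk the type-annotated tree bottom-up, adding a new symbol table entry for every intermediate result. The sketch below assumes a nested-tuple tree with a hypothetical to_real node for the conversion, a symbol table held as a plain list with i, a and b in entries 1 to 3, and the names i* and temp for the new entries. The function name gen_ir is also an assumption.

```python
def gen_ir(node, symbols, code):
    """Append IR steps for node to code; return the entry number of its value."""
    if isinstance(node, str):                    # leaf: existing symbol
        return symbols.index(node) + 1
    if node[0] == "to_real":                     # conversion node
        src = gen_ir(node[1], symbols, code)
        symbols.append(symbols[src - 1] + "*")   # e.g. i becomes i*
        dst = len(symbols)
        code.append(f"Convert (Id#{src}) to real, giving (Id#{dst})")
        return dst
    op, left, right = node
    if op == "=":
        value = gen_ir(right, symbols, code)
        target = gen_ir(left, symbols, code)
        code.append(f"Store (Id#{value}) in (Id#{target})")
        return target
    if op == "+":
        lhs = gen_ir(left, symbols, code)
        rhs = gen_ir(right, symbols, code)
        symbols.append("temp")                   # entry for the partial result
        dst = len(symbols)
        code.append(f"Add (Id#{rhs}) to (Id#{lhs}), giving (Id#{dst})")
        return dst
    raise ValueError(f"unsupported operator {op!r}")

symbols = ["i", "a", "b"]
code = []
gen_ir(("=", "a", ("+", "b", ("to_real", "i"))), symbols, code)
for number, step in enumerate(code, start=1):
    print(f"{number}. {step}")
```

After the call, the symbol table has grown to i, a, b, i*, temp, which accounts for the entries Id#4 and Id#5 used in the steps.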
Memory allocation
- The memory requirement of an identifier is computed from its type, length, and dimensionality, and memory is allocated to it.
- The address of the memory area is entered in the symbol table.
Symbol Type length address
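The allocation rule can be illustrated with a short Python sketch. The sizes (one word each for int and real), the start address 2000, and the names WORDS_PER_TYPE and allocate are all assumptions chosen for the example; the notes do not give actual sizes or addresses.

```python
WORDS_PER_TYPE = {"int": 1, "real": 1}   # assumed sizes, in words

def allocate(entries, start=2000):
    """Assign consecutive addresses to (symbol, type, dimensions) entries.

    dimensions is a tuple of array bounds; use () for a scalar.
    The start address is an arbitrary value picked for illustration.
    """
    table = []
    address = start
    for symbol, type_name, dimensions in entries:
        length = WORDS_PER_TYPE[type_name]
        for bound in dimensions:          # an array needs one slot per element
            length *= bound
        table.append({"symbol": symbol, "type": type_name,
                      "length": length, "address": address})
        address += length
    return table

table = allocate([("i", "int", ()), ("a", "real", ()), ("b", "real", ())])
for row in table:
    print(row["symbol"], row["type"], row["length"], row["address"])
```

Each row printed corresponds to one line of the symbol table with the columns Symbol, Type, Length and Address.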
Code generation
- The synthesis phase may decide to hold the values of i* and temp in machine registers and may generate assembly code.
- Some of the code generated for the statement a = b + i is given here:
CONV_R AREG, I
ADD_R AREG, B
MOVEM AREG, A
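A minimal sketch of how such code could be produced from the three intermediate-code steps is shown below. It assumes a single register AREG that always holds the most recent intermediate result, and an IR written as tuples such as ("convert", source, result). Both the tuple layout and the function name generate are assumptions for this example, and the mnemonics are simply those used in the listing above.

```python
def generate(ir):
    """Translate a list of IR tuples into assembly lines for one register."""
    asm = []
    in_areg = None                      # name of the value currently in AREG
    for step in ir:
        kind = step[0]
        if kind == "convert":
            _, source, result = step
            asm.append(f"CONV_R AREG, {source.upper()}")
            in_areg = result
        elif kind == "add":
            _, left, right, result = step
            if in_areg not in (left, right):
                raise NotImplementedError("one operand must already be in AREG")
            other = right if left == in_areg else left
            asm.append(f"ADD_R AREG, {other.upper()}")
            in_areg = result
        elif kind == "store":
            _, source, target = step
            if source != in_areg:
                raise NotImplementedError("value to store must be in AREG")
            asm.append(f"MOVEM AREG, {target.upper()}")
        else:
            raise ValueError(f"unknown IR step {kind!r}")
    return asm

ir = [("convert", "i", "i*"), ("add", "i*", "b", "temp"), ("store", "temp", "a")]
asm = generate(ir)
print("\n".join(asm))
```

Because i* and temp live only in AREG, no memory is needed for them, which is the decision described in the first point of this section.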