Ray Of Hope

Compiler Phases

A compiler performs the following operations:

  1. Lexical analysis
  2. Pre-processing
  3. Parsing
  4. Semantic analysis
  5. Code generation
  6. Code optimization

  1. Front end: The front end analyzes the source code to build an internal representation of the program, called the intermediate representation (IR). It also manages the symbol table, a data structure that maps each symbol in the source code to associated information such as location, type, and scope. This work is done over several phases, which include some of the following:
  • Lexical analysis: breaks the source code text into small pieces called tokens. Each token is a single unit of the language, for instance a keyword, an identifier, or a symbol name.
  • Pre-processing: Some languages, e.g. C, require a pre-processing phase that supports macro substitution and conditional compilation. Typically the pre-processing phase occurs before syntactic or semantic analysis; in the case of C, for example, the pre-processor manipulates lexical tokens rather than syntactic forms.
  • Syntax analysis: parses the token sequence to identify the syntactic structure of the program. This phase typically builds a parse tree, which replaces the linear sequence of tokens with a tree structure built according to the rules of a formal grammar that defines the language's syntax. The parse tree is often analyzed, augmented, and transformed by later phases of the compiler.
  • Semantic analysis: the phase in which the compiler adds semantic information to the parse tree and builds the symbol table. This phase performs semantic checks such as type checking (checking for type errors), object binding (associating variable and function references with their definitions), and definite assignment (requiring all local variables to be initialized before use), rejecting incorrect programs or issuing warnings.
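
The front-end phases above can be sketched in a few lines of Python. This is only a minimal illustration for a made-up one-statement language of the form `name = expression`; the token kinds, the tuple-based tree layout, and the function names `tokenize`, `parse`, and `check` are all assumptions of this sketch, not part of any real compiler.

```python
import re

# Hypothetical token grammar for a tiny expression language (an assumption).
TOKEN_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[-+*/()=]))")

def tokenize(text):
    """Lexical analysis: split the source text into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def parse(tokens):
    """Syntax analysis: recursive descent over the tokens, returning a tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def take(kind, value=None):
        nonlocal pos
        tok = peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1] or tok[0]}")
        pos += 1
        return tok[1]

    def factor():
        kind, _ = peek()
        if kind == "NUM":
            return ("num", int(take("NUM")))
        if kind == "ID":
            return ("var", take("ID"))
        take("OP", "(")
        node = expr()
        take("OP", ")")
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            node = (take("OP"), node, factor())
        return node

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            node = (take("OP"), node, term())
        return node

    name = take("ID")
    take("OP", "=")
    tree = ("assign", name, expr())
    take("EOF")
    return tree

def check(tree, symbols):
    """Semantic analysis: a definite-assignment check against a symbol table."""
    kind = tree[0]
    if kind == "assign":
        check(tree[2], symbols)             # the right-hand side is checked first
        symbols[tree[1]] = {"type": "int"}  # then the target becomes defined
    elif kind == "var":
        if tree[1] not in symbols:
            raise NameError(f"'{tree[1]}' used before assignment")
    elif kind != "num":                     # binary operator node
        check(tree[1], symbols)
        check(tree[2], symbols)

# Usage: 'y' is assumed to be defined already; 'x' is added by the check.
symbols = {"y": {"type": "int"}}
tree = parse(tokenize("x = 2 * (y + 3)"))
check(tree, symbols)
print(tree)
```

The symbol table here is a plain dictionary holding only a type; a real front end would also record the location and scope mentioned above.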

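  2. Back end: The back end works on the intermediate representation produced by the front end, over phases such as the following:
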
  • Analysis: gathers program information from the intermediate representation derived from the input.
  • Optimization: The intermediate language representation is transformed into functionally equivalent but faster (or smaller) forms. Popular optimizations are inline expansion, dead code elimination, constant propagation, loop transformation, register allocation, and even automatic parallelization.
  • Code generation: The transformed intermediate language is translated into the output language, usually the native machine language of the system. This involves resource and storage decisions, such as deciding which variables to fit into registers and memory, and the selection and scheduling of appropriate machine instructions along with their associated addressing modes.
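
As a rough illustration of the optimization and code generation phases, the sketch below folds constant subexpressions in a small tuple-based tree and then emits instructions for an imaginary stack machine. The tree layout, the instruction names (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`), and the functions `fold` and `generate` are assumptions made for this example; a real back end would target actual machine instructions and deal with registers and addressing modes.

```python
import operator

# Operators supported by this sketch, with the hypothetical instruction for each.
OPS = {
    "+": (operator.add, "ADD"),
    "-": (operator.sub, "SUB"),
    "*": (operator.mul, "MUL"),
}

def fold(node):
    """Optimization: constant folding, replacing constant subtrees by their value."""
    if node[0] in OPS:
        left, right = fold(node[1]), fold(node[2])
        if left[0] == "num" and right[0] == "num":
            return ("num", OPS[node[0]][0](left[1], right[1]))
        return (node[0], left, right)
    return node  # 'num' and 'var' leaves are left unchanged

def generate(node, out=None):
    """Code generation: emit instructions for a hypothetical stack machine."""
    out = [] if out is None else out
    if node[0] == "num":
        out.append(("PUSH", node[1]))
    elif node[0] == "var":
        out.append(("LOAD", node[1]))
    else:
        generate(node[1], out)           # operands first, in order
        generate(node[2], out)
        out.append((OPS[node[0]][1],))   # then the operator itself
    return out

# Usage: the tree for  x + 2 * 3
tree = ("+", ("var", "x"), ("*", ("num", 2), ("num", 3)))
for instruction in generate(fold(tree)):
    print(instruction)
```

Without folding, the same tree needs five instructions; after folding `2 * 3` into `6`, only three remain, which is the "functionally equivalent but smaller" form the optimization phase aims for.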

Originally Posted On: 2010-04-10 02:59:49

Anshul Makkar, anshul_makkar@justkernel.com

