
Introduction to Compiling

A compiler is a program that reads a program written in one language, called the source language, and translates it into an equivalent program in another language, called the target language. An important role of the compiler is to report any errors in the source program that it detects during the translation process. If the target program is an executable machine-language program, the user can then run it to process inputs and produce outputs.


An interpreter is another common kind of language processor. Instead of producing a target program as a translation, an interpreter directly executes the operations specified in the source program on inputs supplied by the user.
Comparison between compiler and interpreter
Compiled programs usually run faster. The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs. However, an interpreter can usually give better error diagnostics than a compiler, since it executes the source program statement by statement.
The compiler or interpreter must look up each "word" of the programming language in a kind of dictionary (or lexicon) and then, in a series of steps, translate it into machine code. Each word initiates a separate logical task.
An interpreter processes the source code one statement at a time, translating each statement and executing it before moving on to the next. Debugging and testing are relatively fast and easy in interpreted languages, since the entire program does not have to be reprocessed each time a change is made. BASIC is a classic example of a language that is usually interpreted.

Compiler | Interpreter
It is a translator which translates a high-level language into a low-level language. | It is a translator which translates a high-level language into a low-level language.
It reports errors after analyzing the whole program. | It checks for errors line by line as the program runs.
Examples include C, C++, COBOL, and later versions of Pascal. | Examples include BASIC and early versions of Pascal.

Cousins of the Compiler
Preprocessors:
Preprocessors are programs that process the source program before compilation, for example to expand any macro definitions.
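Macro expansion can be illustrated with a small sketch. The `#define NAME value` syntax is borrowed from C, and the `expand_macros` function is a hypothetical helper written for this example; a real preprocessor handles far more (parameters, conditionals, file inclusion).

```python
import re


def expand_macros(source: str) -> str:
    """Expand simple '#define NAME value' macros (illustrative only)."""
    macros = {}
    output = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#define"):
            # Record the macro; the definition line itself is not emitted.
            _, name, value = stripped.split(maxsplit=2)
            macros[name] = value
            continue
        # Replace whole-word occurrences of every macro defined so far.
        for name, value in macros.items():
            pattern = rf"\b{re.escape(name)}\b"
            line = re.sub(pattern, lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)


program = """#define MAX 100
#define STEP 2
x = MAX + STEP
"""
expanded = expand_macros(program)
print(expanded)  # x = 100 + 2
```

The compiler proper would then receive only the expanded text and never see the macro names.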
Loaders and Linkers:
If the target program is machine code, loaders are used to load the target code into memory for execution. Linkers are used to link the target program with libraries.
Interpreters:
Interpreters perform compilation, loading and execution in lock-step.
JIT Compilers:
Just-in-time (JIT) compilers perform complete compilation followed immediately by loading and execution. JIT compilers represent a hybrid approach, with translation occurring continuously, as with interpreters, but with caching of translated code to minimize performance degradation.
Java language processors combine compilation and interpretation, as shown in the figure below. A Java source program may first be compiled into an intermediate form called bytecodes. The bytecodes are then interpreted by a virtual machine. A benefit of this arrangement is that bytecodes compiled on one machine can be interpreted on another machine. (Alfred V. Aho, p. 2)
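The compile-then-interpret arrangement can be sketched in a few lines. This is not the Java bytecode format: the instruction names (`PUSH`, `LOAD`, `ADD`, `MUL`) and the functions `compile_expr` and `run` are assumptions made up for illustration, and expressions are written as nested tuples.

```python
def compile_expr(tree):
    """Translate an expression tree into a list of stack-machine instructions."""
    if isinstance(tree, (int, float)):
        return [("PUSH", tree)]   # constant operand
    if isinstance(tree, str):
        return [("LOAD", tree)]   # variable operand
    op, left, right = tree
    # Post-order: code for both operands first, then the operator.
    return compile_expr(left) + compile_expr(right) + [(op, None)]


def run(bytecode, env):
    """A tiny virtual machine that interprets the instruction list."""
    stack = []
    for opcode, arg in bytecode:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(env[arg])
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack.pop()


# b + c * 2, compiled once and then interpreted with concrete inputs.
bytecode = compile_expr(("ADD", "b", ("MUL", "c", 2)))
result = run(bytecode, {"b": 3, "c": 4})
print(bytecode)
print(result)  # 11
```

The instruction list depends only on the source expression, so it could be produced on one machine and executed by the same `run` loop on another, which is the portability benefit described above.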

In addition to a compiler, several other programs may be required to create an executable target program.

Figure: A hybrid compiler

A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes entrusted to a separate program, known as a preprocessor. The preprocessor may also expand shorthands, called macros, into source language statements.
The modified source program is then fed to a compiler. The compiler may produce an assembly-language program as its output, because assembly language is easier to produce as output and is easier to debug.
The assembly language is then processed by a program called an assembler, which produces relocatable machine code as its output. Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine.
Linker/Loader
The linker resolves external memory addresses, where the code in one file may refer to a location in another file. The loader then puts all of the executable object files into memory for execution. In order to achieve faster processing of inputs to outputs, some Java compilers, called just-in-time compilers, translate the bytecodes into machine language immediately before they run the intermediate program to process the input. (Alfred V. Aho, Language Processors, p. 3)
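The address resolution done by the linker can be sketched as a two-pass procedure. The module layout used here (a dictionary with `name`, `size`, `exports` and `references` keys) and the `link` function are assumptions invented for this example and do not correspond to any real object-file format.

```python
def link(modules):
    """Place modules one after another and resolve symbol references."""
    base = {}      # module name -> address where the module is placed
    symbols = {}   # symbol name -> absolute address
    address = 0
    # Pass 1: assign a base address to each module and record its exported symbols.
    for mod in modules:
        base[mod["name"]] = address
        for sym, offset in mod["exports"].items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol: {sym}")
            symbols[sym] = address + offset
        address += mod["size"]
    # Pass 2: map every reference site to the address of the symbol it names.
    resolved = {}
    for mod in modules:
        for offset, sym in mod["references"].items():
            if sym not in symbols:
                raise ValueError(f"undefined symbol: {sym}")
            resolved[base[mod["name"]] + offset] = symbols[sym]
    return symbols, resolved


# main.o calls 'square', which is defined at offset 8 inside math.o.
main_o = {"name": "main.o", "size": 40, "exports": {"main": 0}, "references": {12: "square"}}
math_o = {"name": "math.o", "size": 24, "exports": {"square": 8}, "references": {}}

symbols, resolved = link([main_o, math_o])
print(symbols)   # {'main': 0, 'square': 48}
print(resolved)  # {12: 48}
```

Two passes are used so that a module may refer to a symbol defined in a module that appears later in the list.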


The analysis part of compilation consists of three phases:
Linear/Lexical Analysis:
Lexical analysis, also called scanning, is the process of reading the characters of the source program from left to right and grouping them into tokens, which are sequences of characters that have a collective meaning.
For example, in the assignment statement a = b + c * 2, the characters would be grouped into the following tokens:

  1. The identifier 1 ‘a’
  2. The assignment symbol (=)
  3. The identifier 2 ‘b’
  4. The plus sign (+)
  5. The identifier 3 ‘c’
  6. The multiplication sign (*)
  7. The constant ‘2’
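The grouping listed above can be reproduced with a short regular-expression scanner. The token names (`IDENT`, `ASSIGN`, `PLUS`, `TIMES`, `NUMBER`) are labels chosen for this sketch, and it recognizes only the symbols that occur in the example statement.

```python
import re

# Each pair is (token name, regular expression); earlier entries win on ties.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),      # whitespace separates tokens but is not a token
    ("MISMATCH", r"."),    # any other character is a lexical error
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Scan the text left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens


tokens = tokenize("a = b + c * 2")
for token in tokens:
    print(token)
```

Running it on `a = b + c * 2` yields seven tokens, in the same order as the list above.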
Syntax Analysis:
Syntax analysis, also called parsing or hierarchical analysis, groups the tokens of the source program into grammatical phrases that the compiler uses to synthesize output.
These phrases are usually represented by a syntax tree.

A syntax tree is the tree generated as a result of syntax analysis. The interior nodes of the tree represent operators, and the leaf (exterior) nodes represent operands.

Syntax analysis reports an error when the syntax of the source program is incorrect.
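A minimal recursive-descent parser shows how the tokens of `a = b + c * 2` can be grouped into such a tree. The grammar is an assumption chosen for this sketch (assignment, `+` and `*` only, with `*` binding tighter than `+`), tokens are given as plain strings, and tree nodes are written as `(operator, left, right)` tuples.

```python
def parse_assignment(tokens):
    """Parse: IDENT '=' expr ; expr -> term ('+' term)* ; term -> factor ('*' factor)*."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return token

    def factor():
        token = advance()
        if token.isidentifier() or token.isdigit():
            return token  # leaf node: an operand
        raise SyntaxError(f"expected identifier or number, got {token!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())  # interior node: an operator
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if not target.isidentifier():
        raise SyntaxError(f"expected identifier, got {target!r}")
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment(["a", "=", "b", "+", "c", "*", "2"])
print(tree)  # ('=', 'a', ('+', 'b', ('*', 'c', '2')))
```

The multiplication ends up deeper in the tree than the addition, so `c * 2` is evaluated first. An input such as `a = +` raises `SyntaxError`, which corresponds to the error reporting mentioned above.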

Semantic Analysis:
Semantic analysis checks the source program for semantic errors and gathers type information for the subsequent code generation phase. It uses the syntax tree to identify the operators and operands of statements.
An important component of semantic analysis is type checking. Here, the compiler checks that each operator has operands that are permitted by the source language specification.
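Type checking over a syntax tree can be sketched as a recursive walk. The typing rules used here are assumptions made for illustration and are not taken from any particular language: `+` and `*` require numeric operands, an `int` may be assigned to a `float` variable, and a `float` may not be assigned to an `int` variable.

```python
NUMERIC = {"int", "float"}


def check_types(node, symbol_types):
    """Return the type of a tree node, or raise TypeError on a semantic error."""
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "float"
    if isinstance(node, str):
        # An identifier: its declared type comes from the symbol table.
        if node not in symbol_types:
            raise TypeError(f"undeclared identifier {node!r}")
        return symbol_types[node]
    op, left, right = node
    left_type = check_types(left, symbol_types)
    right_type = check_types(right, symbol_types)
    if op in ("+", "*"):
        if left_type not in NUMERIC or right_type not in NUMERIC:
            raise TypeError(f"operator {op!r} needs numeric operands, got {left_type} and {right_type}")
        return "float" if "float" in (left_type, right_type) else "int"
    if op == "=":
        # Same type, or widening an int value into a float variable.
        if left_type == right_type or (left_type == "float" and right_type == "int"):
            return left_type
        raise TypeError(f"cannot assign {right_type} to {left_type}")
    raise TypeError(f"unknown operator {op!r}")


tree = ("=", "a", ("+", "b", ("*", "c", 2)))
result_type = check_types(tree, {"a": "float", "b": "float", "c": "int"})
print(result_type)  # float
```

With `a` declared as `int` instead, the same tree is rejected, because the right-hand side has type `float`.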
References
Alfred V. Aho, M. S. (n.d.). Compiler Principles and Techniques. New York: PEARSON Addison Wesley.
Learn Compiler Design. (n.d.). Retrieved from TutorialPoint simply easy learning: http://www.tutorialspoint.com/compiler_design/

