Tiger Parser

This project meets the requirements of the parsing stage of compiling the Tiger language. It uses JavaCC which reads my grammar specification and converts it to a parser written in Java.

The suitable grammar for a recursive descent parser, such as JavaCC, is an LL (1) grammar. The given non-LL (1) grammar below describes the Tiger language:

<Prog>         ->        <Exp> 

<Exp>          ->        <LValue>
               |         integer
               |         nil
               |         string
               |         id(<ArgList>)
               |         <Exp> <BinOp> <Exp>
               |         <UnOp> <Exp>
               |         id{FieldExpList}
               |         (ExpList)
               |         LValue := <Exp>
               |         if <Exp> then <Exp>
               |         if <Exp> then <Exp> else <Exp>
               |         while <Exp> do <Exp>
               |         for id := <Exp> to <Exp> do <Exp>
               |         break
               |         let <DecList> in <ExpList> end
               |         id[Exp] of Exp 

<DecList>      ->        <Dec> <Declist>
               |         ε
<Dec>          ->        <TyDec>
               |         <VarDec>
               |         <FunDec> 

<TyDec>        |         type id = <Ty>
<Ty>           ->        id
               |         {<FieldList>}
               |         array of id 

<VarDec>       ->        var id := <Exp>
               |         var id:id:= <Exp>

<FunDec>       ->        function id(FieldList)=Exp
               |         function id(FieldList):id=<Exp>

<LValue>       ->        id
               |         LValue.id
               |         LValue[Exp] 

<ExpList>      ->        <Exp> ExpList'
               |         ε
<ExpList'>     ->        ; <Exp> ExpList'
               |         ε 

<ArgList>      ->        <Exp> <ArgList'>
               |         ε
<ArgList'>     ->        , <Exp> <ArgList'>  | ε 

<FieldList>    ->        id:id FieldList' | ε
<FieldList'>   ->        , id:id FieldList' | ε
<FieldExpList> ->        id = Exp FieldExpList'
               |         ε
<FieldExpList'> ->       , id = Exp FieldExpList'
                |        ε 

<BinOp>        ->        +
               |         -
               |         *
               |         /
               |         &
               |         |
               |         =
               |         <>
               |         <
               |         <=
               |         >
               |         >= 

<UnOp>         ->        -

LL (1) Grammar

Creating LL (1) grammar which describes the Tiger language, firstly involved dealing with precedence; then left-factoring and then left-recursion elimination. More information on the reasoning of certain aspects of the grammar are included in the code comments.

The rewritten LL(1) grammar is as follows:

<Prog>         ->        <Exp>
<Exp>          ->        <ExpOR> <ExpORPr>
<ExpOR>        ->        <ExpAND> <ExpANDPr>
<ExpAND>       ->        <ArithExp> <RelExp>
<ExpORPr>      ->        | <Exp>
               | ε

<ExpANDPr>     ->        & <ExpOR>
               |         ε 

<ArithExp>     ->        <Term> < TermPr >
<RelExp>       ->        <RelOp ><ArithExp>
               |         ε

<Term>         ->        <Factor> <FactorPr>
<TermPr>       ->        (+|-) <Term> < TermPr >
               |         ε

<FactorPr>     ->        (*| /) <Factor> < FactorPr >
               |         ε 

<Factor>       ->        nil
               |         integer
               |         string
               |         (<ExpList>)
               |         <UnaryOp>  <Exp>
               |         if <Exp> then <Exp> [ else <Exp> ]
               |         while <Exp> do <Exp>
               |         for id := <Exp> to <Exp> do <Exp>
               |         break
               |         let <DecList> in <ExpList> end
               |         <LValue>

<DecList>      ->        {<Dec>}
<Dec>          ->        <TyDec> | <VarDec> | <FunDec>
<TyDec>        ->        type <TypeId> = <Ty>

<Ty>           ->        {<FieldList>}
               |         array of <TypeId>
               |         <TypeId>

<FieldList>    ->        [id : <TypeId> {, id : <TypeId>}]

<FieldExpList> ->        [id = <TypeId> {, id = <TypeId>}]

<TypeId>       ->        id
               |         int
               |         string

<VarDec>       ->        var id [: <TypeId>]:= <Exp>

<FunDec>       ->        function id (<FieldList>) [: <TypeId>] = <Exp> 

<LValue>       ->        id (<FunctionRecordArray> | <FunctionRecordArrayPr> ) 

<FunctionRecordArray>  ->       (<ArgList>)
                       |        {<FieldList>}
                       |        [<Exp>] (of <Exp> | <Av>) 

<FunctionRecordArrayPr>  ->        {(. id | [<Exp>])} [:= <Exp>]

<ExpList>       ->        [<Exp> {; <Exp>}]
<ArgList>       ->        [<Exp> {, <Exp>}]

<RelOp>         ->        =
                |         <>
                |         >
                |         <
                |         >=
                |         <=

<UnaryOp>       ->        -

The above grammar, to my knowledge, describes the grammar of the Tiger language in full.

It does NOT take semantics into consideration, as semantics should be subjected to the semantic analysing stage. For example, this parser does not detect errors such as duplicate variable names, declared variable names, type checking, scope checking, etc.

Constraints

The parser uses two LOOKAHEADs. One is to deal with the if/else issue, and the other to parse LValues. Therefore, the grammar is still LL (1), but not as efficient as possible.

Error Reporting

It was suggested to make use of ErrorMsg.java, a class provided by Appel. However, the exception class ParseException.java provides even better functionality than ErrorMsg.java. It provided a line and column number where an error occurs, and also a list of expected symbols (something which ErrorMsg.java does not do). Hence I did not make use of ErrorMsg.java.

Requirements

Files & Running

To generate the parser, you must first install JavaCC, download the above file, and then:

javacc TigerParser.jj
javac *.java
java TigerParser someTextFile

Evaluation

Having tested the parser against all the programs at http://www.computing.dcu.ie/~hamilton/teaching/CA448/testcases/index.html, I can conclude that the grammar describes the Tiger language in full.

References