Cormac Redmond
- Home
- Oracle
- Projects & Sites
- DaftDrop.com – Irish property price tracker
- FileTrack (Multilingual IR) Functional Spec
- FlightSim (Java3D)
- Forum on Public Procurement
- HealthSpa (using Jini’s SOA)
- Interactive Kitchen (Java3D)
- O2 Text.com
- S3OnTheGo (advanced backup system)
- Tiger Lexical Analyser
- Tiger Parser
- Unisex Bathroom Problem
- Wicklow PC Services
- CV
- About / Contact
Tiger Parser
This project meets the requirements of the parsing stage of compiling the Tiger language. It uses JavaCC which reads my grammar specification and converts it to a parser written in Java.
The suitable grammar for a recursive descent parser, such as JavaCC, is an LL (1) grammar. The given non-LL (1) grammar below describes the Tiger language:
<Prog> -> <Exp>
<Exp> -> <LValue>
| integer
| nil
| string
| id(<ArgList>)
| <Exp> <BinOp> <Exp>
| <UnOp> <Exp>
| id{FieldExpList}
| (ExpList)
| LValue := <Exp>
| if <Exp> then <Exp>
| if <Exp> then <Exp> else <Exp>
| while <Exp> do <Exp>
| for id := <Exp> to <Exp> do <Exp>
| break
| let <DecList> in <ExpList> end
| id[Exp] of Exp
<DecList> -> <Dec> <Declist>
| ε
<Dec> -> <TyDec>
| <VarDec>
| <FunDec>
<TyDec> | type id = <Ty>
<Ty> -> id
| {<FieldList>}
| array of id
<VarDec> -> var id := <Exp>
| var id:id:= <Exp>
<FunDec> -> function id(FieldList)=Exp
| function id(FieldList):id=<Exp>
<LValue> -> id
| LValue.id
| LValue[Exp]
<ExpList> -> <Exp> ExpList'
| ε
<ExpList'> -> ; <Exp> ExpList'
| ε
<ArgList> -> <Exp> <ArgList'>
| ε
<ArgList'> -> , <Exp> <ArgList'> | ε
<FieldList> -> id:id FieldList' | ε
<FieldList'> -> , id:id FieldList' | ε
<FieldExpList> -> id = Exp FieldExpList'
| ε
<FieldExpList'> -> , id = Exp FieldExpList'
| ε
<BinOp> -> +
| -
| *
| /
| &
| |
| =
| <>
| <
| <=
| >
| >=
<UnOp> -> -
LL (1) Grammar
Creating LL (1) grammar which describes the Tiger language, firstly involved dealing with precedence; then left-factoring and then left-recursion elimination. More information on the reasoning of certain aspects of the grammar are included in the code comments.
The rewritten LL(1) grammar is as follows:
<Prog> -> <Exp>
<Exp> -> <ExpOR> <ExpORPr>
<ExpOR> -> <ExpAND> <ExpANDPr>
<ExpAND> -> <ArithExp> <RelExp>
<ExpORPr> -> | <Exp>
| ε
<ExpANDPr> -> & <ExpOR>
| ε
<ArithExp> -> <Term> < TermPr >
<RelExp> -> <RelOp ><ArithExp>
| ε
<Term> -> <Factor> <FactorPr>
<TermPr> -> (+|-) <Term> < TermPr >
| ε
<FactorPr> -> (*| /) <Factor> < FactorPr >
| ε
<Factor> -> nil
| integer
| string
| (<ExpList>)
| <UnaryOp> <Exp>
| if <Exp> then <Exp> [ else <Exp> ]
| while <Exp> do <Exp>
| for id := <Exp> to <Exp> do <Exp>
| break
| let <DecList> in <ExpList> end
| <LValue>
<DecList> -> {<Dec>}
<Dec> -> <TyDec> | <VarDec> | <FunDec>
<TyDec> -> type <TypeId> = <Ty>
<Ty> -> {<FieldList>}
| array of <TypeId>
| <TypeId>
<FieldList> -> [id : <TypeId> {, id : <TypeId>}]
<FieldExpList> -> [id = <TypeId> {, id = <TypeId>}]
<TypeId> -> id
| int
| string
<VarDec> -> var id [: <TypeId>]:= <Exp>
<FunDec> -> function id (<FieldList>) [: <TypeId>] = <Exp>
<LValue> -> id (<FunctionRecordArray> | <FunctionRecordArrayPr> )
<FunctionRecordArray> -> (<ArgList>)
| {<FieldList>}
| [<Exp>] (of <Exp> | <Av>)
<FunctionRecordArrayPr> -> {(. id | [<Exp>])} [:= <Exp>]
<ExpList> -> [<Exp> {; <Exp>}]
<ArgList> -> [<Exp> {, <Exp>}]
<RelOp> -> =
| <>
| >
| <
| >=
| <=
<UnaryOp> -> -
The above grammar, to my knowledge, describes the grammar of the Tiger language in full.
It does NOT take semantics into consideration, as semantics should be subjected to the semantic analysing stage. For example, this parser does not detect errors such as duplicate variable names, declared variable names, type checking, scope checking, etc.
Constraints
The parser uses two LOOKAHEADs. One is to deal with the if/else issue, and the other to parse LValues. Therefore, the grammar is still LL (1), but not as efficient as possible.
Error Reporting
It was suggested to make use of ErrorMsg.java, a class provided by Appel. However, the exception class ParseException.java provides even better functionality than ErrorMsg.java. It provided a line and column number where an error occurs, and also a list of expected symbols (something which ErrorMsg.java does not do). Hence I did not make use of ErrorMsg.java.
Requirements
- JavaCC
- Java 1.5 or later
Files & Running
To generate the parser, you must first install JavaCC, download the above file, and then:
javacc TigerParser.jj
javac *.java
java TigerParser someTextFile
Evaluation
Having tested the parser against all the programs at http://www.computing.dcu.ie/~hamilton/teaching/CA448/testcases/index.html, I can conclude that the grammar describes the Tiger language in full.
References
- http://ag-kastens.uni-paderborn.de/lehre/material/uebi/parsdemo/
- http://cs.gettysburg.edu/~tneller/cs374/hw3.html
- https://babel.ls.fi.upm.es/local/cgi-bin/mtp/trac.cgi/browser/trunk/examples/tiger.pdf
- http://www.computing.dcu.ie/~hamilton/teaching/CA448/assignments.html
- http://www.computing.dcu.ie/~hamilton/teaching/CA448/testcases/index.html