where
Num is any numeric literal
(as written in either Java or Racket, your choice1).
For the provided parsers to work,
whitespace is required between all terminals
with the exception of punctuation.
Semantics (interpretation):
A Paren is a parenthesized expression (used for grouping only);
A Binop is a binary operator expression, in infixprefix;
boii means addition;
boi means subtraction;
boiii means multiplication.
A Parity evaluates its first expression;
if it is even then the result is the second expression (evaluated),
else the result is the third expression (evaluated).
Discussion
Give five example programs (Exprs) written in T0.
Give their parse trees.
How should we represent these trees, internally?
(In Racket, and equivalently in Java).
Where we're headed
For interpreting this language, we will implement three methods:
parse, which turns source-code
into an internal representation,
toString which turns internal representation back
into source-code,
and
eval, which actually interprets the program,
returning a value (a number for T0-T2, and a number-or-function in T3).
I recommend implementing and testing the methods in this order.
Some test-cases will be provided.
There might be other helper functions to implement as well.
Note: Download the racket files, and then choose Open… from DrRacket.
Do not copy/paste into an empty DrRacket window, since that
window is probably using a student-language, and some of the files below use full racket.
Java
T0-java/ (browse the directory), or see the .jar.
Note the composite pattern to implement union-types, in Java:
The classes Expr and its four variants Num/BinOp/Parity/Paren
are organized in the same way
AncTree and its two variants Child/Unknown were.
Discuss the implementation
Once we've talked in class about internal-representation
(and given examples of the T programs and corresponding internal-data),
then we can discuss the provided-implementation,
including recursive-descent parsing:
It does not exactly follow the the design recipe
(because it processes a string/scanner, not Exprs),
yet the code's shape is still extremely correlated to the
T0 grammar/data-definition.
This is called “recursive descent” parsing.
Note: Parsing T0 is particularly easy because each expression-type's
leading-token lets us know exactly what to expect,
removing the need for look-ahead, backtracking, or
cleverer algorithms.
The Java code is essentially the same,
up to class Scanner working a bit differently.
video (12m19s)
.
Optional, but helpful for your own test cases:
Writing test-cases the regular, same-ol’ same-ol’ way,
and then realizing what we can re-factor into
easier-to-write tests,
via an “test-harness” specific to this assignment.
In racket:
video (14m14s)
The same test-harness, in Java
(slightly more involved, and less flexible):
video (8m08s)
1 This is so
we can just use our language's built-in number-parsing functions,
without getting bogged down in tokening input.
So racket implementations will allow exactly those strings recognized by number?,
(including +nan.0, -inf.0, and 2+3i).
Similarly,
if using Java,
the semantics of T0's arithmetic will be similar
to IEEE floating point arithmetic
(rather than perfectly-correct arithmetic).
Don't confuse T0's class Num (which extends Expr)
with the
existing java.lang.Number, which doesn't extend Expr.