hw07
Interpreting M
project

Only problems 1 and 2 will be required. Problems 3 and 4 are extra credit. As a project, it is due May.07 (Thu) 23:59 sharp, no exceptions.

For this project, we'll implement a “tower” of languages (M0, M1, M2, M3, and M4), each language incorporating more features than the previous. M0 is provided for you. You may write your interpreter in either in Java or in scheme.

For each language, you must implement three methods:
- parse, which turns source-code into an internal representation,
- toString which turns internal representation back into source-code,
- eval, which actually interprets the program, returning a value (a number for M0-M2, and a number-or-function in M3).
I recommend implementing and testing the methods in this order. See the provided test-cases, for examples. There might be other helper functions to implement as well.
Your solution may be in Java or in Scheme (or see me, if you want to use a different language). See the comments in the provided files, for language-specific differences/nuances.
You do not need to worry about error-checking -- we will only concern ourselves with syntactic M0 (etc.) programs.
(As a matter of defensive programming, you might still want to add some assertions/sanity-checks in your code.)
Use the WebCT discussion board for group questions and answers.
This project is based on the first chapters of Programming Languages and Interpretation, by Shriram Krishnamurthi. As a result, this homework assignment is covered by the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. Although we're using a different dialect of scheme than that book, you might find it helpful to skim it.
Turn-in instructions to be announced.

The language M0

% M0 syntax:
% An Expr is one of:
Expr ::= Num | SumExpr | ProdExpr | IfZeroExpr 

SumExpr ::= (Expr + Expr)
ProdExpr ::= (Expr * Expr)
IfZeroExpr ::= ifZero( Expr, Expr, Expr)

See lecture notes for test cases and code.

A Num can be any conventional ¹ decimal numeral (and will be interpreted as a number in the standard ² way).

Here is a Scheme interpreter for M0 along with helper functions for a scanner for scheme (with an a scanner demo), and an optional test harness
Here is a java interpreter for M0 (hw07-M0-java.jar or browse); it includes unit tests, and helpful Java parsing functions (with example code (as .jar) from lecture).

(20pts) M1 is just like M0, but with two additional types of expressions:
Expr ::= … | ModExpr | IfNegExpr ModExpr ::= ( Expr mod Expr ) IfNegExpr ::= ifNeg( Expr, Expr, Expr )
Update parse, toString (a.k.a. expr->string), and eval appropriately.

The only method which cares what these new expressions mean (their semantics) is eval: ifNeg( Expr₀, Expr₁, Expr₂) first evaluates just Expr₀; if that value is negative, then it evaluates Expr₁ and returns its value; otherwise it evalutaes Expr₂ and returns its value. (Note how you are making sure that ifNeg short-circuits.)

(a mod b) should evaluate to ³ a mod b, where the result is always between 0 (inclusive) and b (exclusive); in particular, the result is never positive if b<0. Notice that this is slightly different behavior than either Java's built-in % (which behaves differently on negative numbers), and from Scheme's built-in modulo (which only accepts integers). In Java, you can use a%b (if b>0 or a%b=0) and a%b+b (otherwise). In scheme or Java, you can use b*(a/b-floor(a/b))

Hint: Most all the code for this problem is virtually identical to existing code. The only exception (and it's a mild one) is the eval for ModExprs, since the formula is a bit more involved.

You can paste/convert the M1 java test cases into the same format as the ones you've been using at the bottom of hw07-M0.ss.
(40pts) M2 adds identifiers to M1:
Expr ::= … | Id | WithExpr WithExpr ::= let Id be (Expr) in (Expr)
where Id can be any series of letters and digits which isn't interpretable as a number⁴.

Update your three methods parse, toString (a.k.a. expr->string), eval. We now need to define the semantics of let Id be (E₀) in (E₁):
- Evaluate E₀, calling the result v₀.
- Then, substitute v₀ for all free occurrences of Id inside E₁; name the result ⁶ E′
- Evaluate E′, and return that result.
The code to make a substitution in an Expr parse-tree is reminiscent of hw05's subst-in-tree (except that a list had only two cond-branches, while Expr has around ten (most of which are similar)). For example: let x = 5 in (x+3) → (5+3) → 8⁷.

However, there is one wrinkle: What if E₁ contains some non-free (“bound”) occurrences of Id?
That is, wat if E₁ contains a (nested) “let x be (…) in (…)”? In that case the inner one should shadow the outer one:
let x be (3) in (let x be (5) in ((x + 3))) ⇒ let x be (5) in ((x + 3)) ⇒ (5 + 3) ⇒ 8
Thus, when substituting, only substitute “free occurrences” of an Id in E₁, not any “bound occurrences”⁸. Think through exactly what you need to substitute, in the following examples:
let y be (3) in (let x be (5) in ((x + y))) ⇒ ⇒ ⇒ 8 let y be (3) in (let x be (y) in ((x + y))) ⇒ ⇒ ⇒ 6
let x be (5) in (let y be (3) in ((let x be (y) in ((x + y)) + x))) ⇒ ⇒ ⇒ ⇒ ⇒ 11
(You might find it helpful to draw the parse tree, rather than staring at the flat string, especially for this last example.)

Observe that when evaluating a (legal) M2 program, eval will never actually encounter an Id -- that Id will have been substituted out before we ever recur down to it.

Scheme test cases here; hw07-M2-tests.ss for Java test cases, uncomment the 'M2' section of ExprTest.java.
(40pts) M3 adds functions and function-application to our language:
Expr ::= … | FuncExpr | FuncApplyExpr FuncExpr ::= func(Id, Expr) FuncApplyExpr ::= Expr(Expr)
We restrict ourselves to unary functions (functions of one argument). (To think about: if we wanted to 'fake' functions of 2 arguments in M3, how could we do it?)

Just like numbers are self-evaluating, so are FuncExprs. If evaluating (an internal representation of) a FuncExpr, just return that same (internal representation of the) function. We won't actually evaluate the body until the function is applied.

In FuncApplyExpr, the first Expr had better evaluate to a function. (That is, it's either a FuncExpr, or it's an Id which gets substitued to a function value.) For example, here is a function for computing absolute value:
func( x, ifNeg( x, (-1 * x), x))
A funcApplyExpr represents calling a function. Here are two expressions, both computing the absolute value of -5:
func(x, ifNeg( x, (-1 * x), x))(-5) let abs be (func(x, ifNeg(x,(-1*x),x) )) in (abs(-5))
To evaluate a function-application Expr₀( Expr₁):
- Evaluate Expr₀; let's call the result f. (f had better be a function-value!)
- Evaluate Expr₁; let's call the result arg.
- Substitute f's parameter with arg in f's body; call this new expression E′.
- Evaluate E′ and return that value.
Hey, those semantics are practically the same as let!

Observe that our programs can now evaluate to either of two types: numbers or functions. In Java, we'll need to have a class which can represent either, as the return type for our eval.

You can paste/convert the M3 java test cases into the same format as the ones you've been using at the bottom of hw07-M0.ss.
(35pts extra credit: This problem and the next are really the crux of the assignment, but due to the impending end-of-semester, it will be extra credit.)
Deferred evaluation: L4 doesn't add any new syntax, but it is a different evaluation model which will give us more expressive semantics. Use a new file/project for L4, separate from M0-M3.

There are two problems ⁹ with the substitution approach used above: we can't make recursive functions, and it doesn't generalize to assigning to variables. We solve these problems with deferred evaluation:
- When interpreting a program, eval will take two arguments: the program to interpret, and a list of bindings (Ids and their associated values).
- Upon encountering an Id, we just look up its value in our list of bindings. (Recall that in M3, eval never actually reached an Id; in L4 it does.)
- In this approach, function-application and with each introduce (exactly one) binding; in these cases we'll just add the binding to our list ¹⁰ and proceed recursively.
- For full extra credit, make sure that you can evaluate recursive functions. (This means our ‘with’ must have letrec-like semantics.)
Note that this gives us dynamic scoping.
(35pts) Extra credit: static scope. Extend functions so that they include one extra piece of information: the bindings in effect when the function was declared. Be sure to make a careful suite of test-cases, to test for various ways of capturing bindings in different closures. (For example, consider the midterm's problem for testing scope.)
Further extra-credit options (of varying difficult, hence points):
- Add comments to the grammar. (Make sure they nest.) (An easier version would require that even within comments, any parentheses are balanced, to leverage scheme's reader.) It's up to you whether or not you store comments in the internal representation, or discared them entirely.
- Re-write the code for evaluating with so that it simply creates a function-application and evaluates that.
- Generalize functions to multiple arity; generalize with so that it takes any number of id/value pairs (call it with*).
- If you want, modify the with syntax: instead of let id be (Expr) in (Expr), you can use Id = Expr ; Expr ; . (Note that this still represent declaring a new variable, not mutating an existing one.)
- Add mutation (i.e. assigning to Ids).

The language M1

Add the following to M0:

The language M2

The language M3

¹ To keep from getting bogged down in tokenizing input, Java implementations can use exactly those strings recognized by java.util.Scanner.nextDouble() (including NaN) and Scheme implementations can use exactly those strings recognized by number?, (including +nan.0, -inf.0, and 2+3i). ↩

² For ease of implementation, we'll allow Java implementations to use floating-point arithmetic (doubles) instead of actual arithmetic. ↩

³ Because we don't need to check for bad inputs, it's fine to have your interpreter crash if b=0. ↩

⁴ Note that our different implementations are now varying by more than just precision of arithmetic: in a Java implementation, NaN is a Num, and in a scheme implementation it's an Id. We won't use any test cases involving such subtle differences. However, note how our choices in designing a new language are being influenced by the language we're trying to easily implement it in! This stems from the fact that a primary design constraint on M1 is that implementing an intepreter for M1 doesn't get bogged down in minutae when using either Java or Scheme. ↩

⁶ Okay -- if you implement the union data type of Exprs by using the Strategy pattern, then you can regain the simplicity of the functional approach in this case. ↩

⁵In particular, even if using Java, you'll want do the substitution using a functional approach: have a method which returns a new parse tree with the substitutions, rather than mutating the existing tree. The mutation approach leads to quite unwieldy code. (Try it, if you don't believe me ⁶ ↩

⁷ The notation “let x = 5 in (x+3) → (5+3) → 8” is shorthand for

  eval(parse("let x = 5 in (x+3)"))
= eval(parse("(5 + 3)"))
= 8

Observe how we definitely don't write “"let x = 5 in (x+3)" = "(5 + 3)" = 8” since strings are not equal to ints. More specifically: we distinguish between “⇒” (“code evaluates to”) and “=” (“equals”, just as “=” has meant since kindergarten). ↩

⁸ nor any “binding occurrences”: The first x in let x be (5) in ((x + 3)) is a binding occurrence, and the other x is a bound occurrence. (The variable is bound inside the scope of its binding occurrence.) ↩

⁹ A third issue, pointed out in Programming Languages and Interpretation, is that evaluating deeply nested withs is an O(n²) algorithm. ↩

¹⁰ Note that the list you recur with has the additional binding, but that recursive call shouldn't end up with any net change to your bindings. If you use lists which require mutation (viz java.util.List), you must either make duplicates before recurring, or be sure to remove your recently-added binding after your recursive call. ↩

home—info—lectures—exams—hws—archive

©2009, Ian Barland, Radford University
Last modified 2009.May.06 (Wed) Please mail any suggestions
(incl. typos, broken links)
to iba�rlandrad�ford.edu

hw07 Interpreting Mproject

The language M0

The language M1

The language M2

The language M3

hw07
Interpreting M
project