RU beehive logo ITEC dept promo banner
ITEC 380
2016fall
ibarland

homelecturesrecipeexamshwsD2Lbreeze (snow day; distance)

grammar-intro
intro grammars/BNF

Reading: 
         Scott §§2.1.  (Read, then re-read closely.)
         
         
Let's play the gramma  game! 
Can you turn the string “S” into the string “hotdog hit android .”?
Rules: you can take as many steps as you like, where a step is one of: 1a. “S” can be replaced with “N V .” 1b. “S” can be replaced with “N V N .” 2a-h. “N” can be replaced with “cat” or “ufo” or “apple” or “fox” or “dog” or “hotdog” or “android” or “itec” 3a-g. “V” can be replaced with with “bite” or “run” or “jump” or “hit” or “love” or “spin” or “hear” Write your answer as each string, with a “⇒” between them. (You can also include the exact rule# you used too, if you like). That's a pretty crude sentence we generated. Can we tweak (improve) these rules so that: (i) a noun can be optionally preceded by “a”, “an”, or “the”, and (ii) a verb can optionally have an “ed” added to the end? (We won't enforce the right articles; “a android” will be allowed.) Please write down your revised rules. (It's tedious to keep writing “can be replaced with”; you can just write “→” instead, as an abbreviation. …In fact, I'll require you to use the abbreviation, since it's tedious to keep reading that phrase. Conciseness can be clarity!) Likewise, let's abbreviate “or” with “|”. (Btw, can we write all our rules w/o needing to say “or”? It's already a shortcut — a bit of “syntactic sugar” to make it easier for us to write our exact rules!) Specifying Languages: syntax vs semantics Syntax: a grammar G1 (mongo the caveman's grammar): S → N V . rule 1 S → N V N . rule 2 N → cat | ufo | apple | fox | dog | hotdog | android | itec rules 3a-h V → bite | run | jump | hit | love | spin | hear rules 4a-g What is a sentence we can derive from S? What's one that we can't? G1 is a bit mongo-ish; how to add ... articles, plurals, and maybe some verb conjugation? Here's G2: S → NP CV NP . rule 1 S → NP CV . rule 2 NP → ART N rule 3 ART → a | an | the rules 4a-c CV → V s | V ed rules 5a-b N → cat | ufo | apple | fox | dog | hotdog | android | mongo | itec rules 6a-i V → bite | run | jump | hit | love | spin | hear rules 7a-g Example: S ⇒ NP CV NP . by rule 1 ⇒ ART N CV NP . by rule 3 ⇒ the N CV NP . by rule 4c ⇒ the N V ed NP . by rule 5b ⇒ the N hit ed NP . by rule 7d ⇒ the hotdog hited NP . by rule 6f ⇒ the hotdog hited ART N . by rule 3 ⇒ the hotdog hited a N . by rule 4a ⇒ the hotdog hited a android. by rule 6g
Group exercise: What is a grammar for generating: a Java field-declaration. (Let's make a few test-cases first!) How about an entire Java class-declaration?
recorded video:

Grammar for field-declarations: video part 1: start at 29m25s until 58m00s, but skip the technical difficulties from 42:30--47:00: [What was on my screen while talking is *different* from what the screen-share showed (arrgh), and it took 15min for students to even mention this (arrrrrgh!!!), so you'll want to refer to the completed notes at google docs]

Recursive rules: breeze.radford.edu/p6hy8cmbduf, starting at 06m30s (but again looking at that google-doc) where the problem is "Using the grammar G3, how can I derive the string “the cat bites the pretty smelly smelly hotdog”, starting from “S” and playing the grammar game?.

G3, a grammar that can generate infinite # of strings: S → NP CV NP . rule 1 S → NP CV . rule 2 NP → ART AP rule 3 ART → a | an | the rules 4a-c AP → N rule 5 AP → ADJ AP <---- note the circular def'n! rule 6 CV → V s | V ed rules 7a-b N → cat | ufo | apple | fox | dog | hotdog | android | mongo | itec rules 8a-i V → bite | run | jump | hit | love | spin | hear rules 9a-g ADJ → happy | pretty | nice | small | smelly rules 10a-e S ⇒ NP CV NP . ⇒* the cat bites NP . took multiple steps here — hence the “⇒*”) ⇒ the cat bites ART AP . ⇒ the cat bites the AP . ⇒ the cat bites the ADJ AP . by rule 6 ⇒ the cat bites the nice AP . ⇒ the cat bites the nice ADJ AP . by rule 6 ⇒ the cat bites the nice small AP . ⇒ the cat bites the nice small ADJ AP . by rule 6 ⇒ the cat bites the nice small smelly AP . ⇒ the cat bites the nice small smelly N . ⇒ the cat bites the nice small smelly android . [think "Sentence", "Verb Phrase", "Conjugated Verb", etc.] [Technically, I'm leaving out spaces from the grammar. How to fix?] Is this string in the language?: a cat bites a apple. YES (we can give a derivation for it) but note that a cat bites a apples. NO (...proof?...take ITEC420!) Derivations also correspond to syntax-trees: Note: syntax allows "colorless green ideas sleep furiously"; the semantics is a different issue. [example by Chomsky] Some terms: "nonterminal", "terminal", | is 'or', → is 'is' Book also uses the terms: token (nonterminal 'words'/punctuation of the grammar), lexeme (a concrete 'word'). Example: "DoubleLiteral" and "BinaryOp" are a tokens; "173.4" and "*" are lexemes. (I've always heard 'token' used for what the book calls lexemes, myself.) Derivations and trees (equivalent ways, essentially): (a) Here is a derivation of a cat bites the ufo .
S ⇒ NP CV NP .
  ⇒ ART N CV NP .
  ⇒ a N CV NP .
  ⇒ a N CV ART N .     (observe: we replace *any* non-terminal)
  ⇒ a N CV ART ufo .
  ⇒ a N CV the ufo .
          ...
  ⇒ a cat bites the ufo .
The same thing, as a tree: [draw] What is an example of something which does *not* meet the above grammar? Def'n: parsing: reading in a stream of characters/tokens, and building the corresponding parse tree (or, determine that the input string is not derivable from the grammar). We'll deal with parsing later; right now we'll keep doing it instinctively. What if we add further rules? G4 will be G3, plus the rules: Exercise: add adverbs and adjectives. ADJ → quick | lazy | brown | hot ADV → ADJ ly and changing G3's rules for `S` to be: S → NP CV NP . | NP CV . | NP CV NP ADV. Self-exercise: - Make a strings that is in G4 (but was not in G3). - Make a string that is not in G4. Okay, armed with the ability to recur to make "1 or more" or "0 or more", now let's try i coming up with a grammar for a Java class declaration: - example:
    class Puppy {
        int numSpots;
        String name = "fido";
        }
  
... To see the tree corresponding to our derivation go to http://mshang.ca/syntree/ and enter:
      [ClassDecl class 
                 [JavaId [Puppy]] 
                 { 
                 [FieldOrMethodDecls [FieldOrMethodDecl [FieldDecl [Type int] [JavaId [numSpots]] ;] ]
                                     [FieldOrMethodDecls [FieldOrMethodDecl [FieldDecl [Type String] [JavaId [name]] [=] "fido" ;]] 
                                     [FieldOrMethodDecls [ε]] ]]
                 }
     ]
What is a grammar for... - declaring a Java identifier (w/o init'ing, but including access modifiers) - a Java identifier? - a Java assignment statement? (we'll leave "expr" not fully defined) See: Java syntax for 120 (which is incomplete and technically wrong in some places), and some of the BNF from the Java Language Definition. Here is another version which includes both grammar and accompanying diagrams. Some limitations of CFG -- see ITEC420 (computability th'y). Also, if we place certain limitations on the grammar productions, (e.g. "Only one non-terminal on the right-hand-side) we end up with "regular grammars", which capture exactly the same thing as regular expressions. (A separate issue, btw: By constraining our grammar we can get nicer guarantees of algorithmic behavior -- parsing etc.)

homelecturesrecipeexamshwsD2Lbreeze (snow day; distance)


©2016, Ian Barland, Radford University
Last modified 2016.Sep.30 (Fri)
Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu
Rendered by Racket.