RU beehive logo ITEC dept promo banner
ITEC 380
2016summerIII
ibarland

homelecturesrecipeexamshwsD2Lbreeze (snow day; distance)

lect13-grammar-intro
intro grammars/BNF

Reading: 
         Scott §§2.1.  (Read, then re-read closely.)

Specifying Languages:
syntax vs semantics

Syntax: a grammar G1 (mongo the caveman's grammar):

S → N V .
S → N V N .
N → cat | ufo | apple | fox | dog | hotdog | android | itec
V → bite | run | jump | hit | love | spin | hear

What is a sentence we can derive from S?  What's one that we can't?

G1 is a bit mongo-ish; how to add ...
  articles, plurals, and maybe some verb conjugation?



Here's G2:

S → NP CV NP . 
S → NP CV .
NP → ART N
ART → a | an | the
CV → V s | V ed
N → cat | ufo | apple | fox | dog | hotdog | android | mongo | itec
V → bite | run | jump | hit | love | spin | hear


Example:
S ⇒ NP CV NP .
  ⇒ ART N CV NP .
  ⇒ the N CV NP .
  ⇒ the N V ed NP .
  ⇒ the N hit ed NP .
  ⇒ the hotdog hited NP .
  ⇒ the hotdog hited ART N .
  ⇒ the hotdog hited a N .
  ⇒ the hotdog hited a android.

  
Group exercise: What is a grammar for generating: a Java field-declaration. (Let's make a few test-cases first!) How about an entire Java class-declaration?
G3, a grammar that can generate infinite # of strings: S → NP CV NP . S → NP CV . NP → ART AP ART → a | an | the AP → N AP → ADJ AP <---- note the circular def'n! CV → V s | V ed N → cat | ufo | apple | fox | dog | hotdog | android | mongo | itec V → bite | run | jump | hit | love | spin | hear ADJ → happy | pretty | nice | small | smelly S ⇒ NP CV NP . ⇒* the cat bites NP . ⇒ the cat bites ART AP . ⇒ the cat bites the AP . ⇒ the cat bites the ADJ AP . ⇒ the cat bites the nice AP . ⇒ the cat bites the nice ADJ AP . ⇒ the cat bites the nice small AP . ⇒ the cat bites the nice small ADJ AP . ⇒ the cat bites the nice small smelly AP . ⇒ the cat bites the nice small smelly N . ⇒ the cat bites the nice small smelly android . [think "Sentence", "Verb Phrase", "Conjugated Verb", etc.] [Technically, I'm leaving out spaces from the grammar. How to fix?] Derivations also correspond to syntax-trees: Is this string in the language?: a cat bites a apple. YES (we can give a derivation for it) but note that a cat bites a apples. NO (...proof?...take ITEC420!) Note: syntax allows "colorless green ideas sleep furiously"; the semantics is a different issue. [example by Chomsky] Some terms: "nonterminal", "terminal", | is 'or', → is 'is' Book also uses the terms: token (nonterminal 'words'/punctuation of the grammar), lexeme (a concrete 'word'). Example: "DoubleLiteral" and "BinaryOp" are a tokens; "173.4" and "*" are lexemes. (I've always heard 'token' used for what the book calls lexemes, myself.) Derivations and trees (equivalent ways, essentially): (a) Here is a derivation of a cat bites the ufo .
S ⇒ NP CV NP .
  ⇒ ART N CV NP .
  ⇒ a N CV NP .
  ⇒ a N CV ART N .     (observe: we replace *any* non-terminal)
  ⇒ a N CV ART ufo .
  ⇒ a N CV the ufo .
          ...
  ⇒ a cat bites the ufo .
The same thing, as a tree: [draw] What is an example of something which does *not* meet the above grammar? Def'n: parsing: reading in a stream of characters/tokens, and building the corresponding parse tree (or, determine that the input string is not derivable from the grammar). We'll deal with parsing later; right now we'll keep doing it instinctively. What if we add further rules? G4 will be G3, plus the rules: Exercise: add adverbs and adjectives. ADJ → quick | lazy | brown | hot ADV → ADJ ly and changing G3's rules for `S` to be: S → NP CV NP . | NP CV . | NP CV NP ADV. Self-exercise: - Make a strings that is in G4 (but was not in G3). - Make a string that is not in G4. What is a grammar for... - declaring a Java identifier (w/o init'ing, but including access modifiers) - a Java identifier? - a Java assignment statement? (we'll leave "expr" not fully defined) See: Java syntax for 120 (which is incomplete and technically wrong in some places), and some of the BNF from the Java Language Definition. Here is another version which includes both grammar and accompanying diagrams. Some limitations of CFG -- see ITEC420 (computability th'y). Also, if we place certain limitations on the grammar productions, (e.g. "Only one non-terminal on the right-hand-side) we end up with "regular grammars", which capture exactly the same thing as regular expressions. (A separate issue, btw: By constraining our grammar we can get nicer guarantees of algorithmic behavior -- parsing etc.)

homelecturesrecipeexamshwsD2Lbreeze (snow day; distance)


©2015, Ian Barland, Radford University
Last modified 2016.May.16 (Mon)
Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu
Rendered by Racket.