ITEC 420 - Section 4.1 - Decidability Properties of
Regular and Context Free Languages
Overview
- In this section we will:
- See more examples of Turing Decidable languages
- Use encoded machines
- Learn about Turing Machines that operate on encoded machines
- Example 1: Encode a DFA M as a string and build a TM that reads this
string and determines if M accepts the empty language
- Example 2: Encode DFA M and combine it with string w to produce an
input string for a TM that executes M on w
- Study languages defined by machines that operate on such machines
- The latter two points are the most important
A New Notation: Machine Encodings
- We want to encode a machine as a string!
- This allows us to define classes of machines aslanguages!
- Let <M> represent the encoding of machine M
- Let <M, w> represent the encoding of machine M and string w
- Encoding must contain encoding of:
- List of states
- Input and tape alphabets
- Transition function
- Initial and final states, etc
- We assume that the machine that uses the encoding can interpret it
Learning Outcomes of Section 4.1
- Be able to formally and informally describe the contents of languages
such as and including the following:
- E_{DFA}
- E_{CFG}
- EQ_{DFA}
- A_{DFA}, A_{NFA}, A_{REX}
- A_{CFG}, A_{PDA}
- Prove that the following are Turing Decidable languages:
- the languages above
- every Context Free language A
- Anticipate the definition of A_{TM}
E_{DFA}
- E_{DFA} = {< A> | A is a DFA and
L(A) =
φ }
- TM T = On input < A>, where A is a DFA:
- Mark the start state of A
- Repeat until no new states get marked:
- Mark any state that has a transition into it from any already marked state
- If no accept state is marked, accept
- else reject
- Examples: Machines for a*b+ and same machine
- Machine for a^{*}b^{+}
- Same machine extra, unreachable state as final state
- A Python simulator
and (prettified)
- Notice that E_{DFA} is a language:
- It's a set of strings
- Each string is an encoded DFA
E_{CFG}
- E_{CFG} = {< G> | G is a CFG and L(G) = φ }
- TM R = On input < G>, where G is a CFG:
- Mark all terminal symbols in G
- Repeat until no new variables get marked:
- Mark all occurances of any variable that is the LHS of a rule whose entire RHS is marked
- Example: if A → PQR is a rule and P, Q, and R are all marked, then mark all occurences of A in G
- If the start symbol is not marked, then accept
EQ_{DFA}
- EQ_{DFA} = {< A, B> | A and B are DFAs and L(A) = L(B)}
- TM F = On input < A, B>, where A and B are DFAs:
- Construct a DFA C such that L(C) = (L(A) ∩ L(B)')
∪ (L(A)' ∩ L(B))
- Run TM T on input <C> to see if C is empty
- If T accepts, accept
- else reject
- L(C) is the symmetric difference of L(A) and L(B)
- Operations ∩, ∪, and complement are defined on regular languages
- We know how to combine DFA to get a new
DFA that accepts the resulting language
A New Kind of Machine and Input String: A_{DFA}
- In E_{DFA} and E_{CFG}, the input string was an encoded DFA or grammar
- In EQ_{DFA}, the input string was 2 encoded DFAs
- In A_{DFA}, the input string for the TM has 2 parts:
- An encoded DFA M
- An input string w for the DFA
- A_{DFA} simulates the computation of M on D
A_{DFA}
- A_{DFA} = {< B, w> | B is a DFA that accepts string w}
- Example: If L(B) = {a}, then what strings are in A_{DFA}?
- How many strings in A_{DFA}?
- Let's build a TM to decide language A_{DFA}
- TM M = On input < B, w>, where B is a DFA and w is a string:
- Simulate B on w
- If the simulation ends in an accept state of B, then accept
<B,w>
- else the simulation must end in a reject state of B, so reject <B,w>
- M First checks that B is a valid machine
- M simulates B by keeping track of current input position and current
state [easy, right?]
- Simulation ends when M reaches the end of w
A_{NFA}
- A_{NFA} = {< B, w> | B is a NFA that accepts string w}
- TM N = On input < B, w>, where B is a NFA and w is a string:
- Convert NFA B to an equivalent DFA C using procedure from Th 1.39
- Run TM M on < C, w>
- If M accepts, then accept
- We use M as a subroutine!
A_{REX}
- A_{REX} = {< R, w> | R is a regular expressions that generates string w}
- TM P = On input < R, w>, where R is a Regular Expression and w is a string:
- Convert Regular Expression R to an equivalent DFA A
using procedure from Th. 1.54
- Run TM M on < A, w>
- If M accepts, then accept
A_{CFG}
- A_{CFG} = {< G, w> | G is a CFG that generates string w }
- TM S = On input < G, w>, where G is a CFG and w is a string:
- Convert G to an equivalent grammar in Chomsky Normal Form (CNF)
- List all derivations with 2n-1 steps, where |w|=n
- If any of these derivations generate w, then accept
- else reject
- When a grammar is in CNF, every step in a derivation does what or what?
- When a grammar is in CNF, how many steps in a derivation of a string of length n?
A_{PDA}
- Results that hold for CFGs also hold for PDAs
EQ_{CFG} ???
- We will see in Chapter 5 that EQ_{CFG} is NOT decidable
Every context free language is decidable
- To prove: Every context free language A is decidable
- That is: Given a Context Free Language A, we can build a TM that decides A
- Proof:
- Let A be a Context Free Language
- Since A is CF, there is a grammar that generates A. Call this grammar G.
- Use G to build TM M_{G}.
- Machine M_{G} is specialized for grammar G: it
has a tape that holds an encoding of the grammar G.
Machine M_{G} first
combines the encoding of G with its input string w to create
the string <G, w>, which is then used as input for TM S.
- Machine M_{G} = On input w:
- Build string <G,w>
- Run TM S on <G,w>
- If S accepts w, then M_{G} accepts w.
- Else S rejects w, so M_{G} rejects w.
- If w is in A, then M_{G} accepts w, otherwise
M_{G} rejects w
- We have constructed a machine (ie M_{G}) that decides A, and so A is decidable
- Question: How do we know that we can find G:
- Either it's given or we have a PDA which can be converted to G
Every context free language is decidable - A Wrong Approach
- To prove: Every context free language A is decidable
- Incorrect Proof outline:
- Since A is CF, there is a PDA that recognizes A. Call this PDA P.
- Build a TM M_{P} that contains an encoding of P
- On input w, M simulates P on w
- If P accepts, then M will accept
- If P rejects, then M will reject
- What's wrong with this approach:
- if the NPDA has a non-halting branch then the simulation would not halt in the TM
- Can a NPDA have non-halting branches? Yes: it accepts if there
is a branch that is in an accept state at the end of the input and
rejects if there is no such branch.
- What's wrong with not halting, I thought we ignored the
non-halting branches? For a recognizer you can ignore them; for a
decider, every branch must halt.
- Possible approach?:
- Since A is CF, it has a PDA that recognizes it. Convert this PDA to a
grammar, then convert the grammar to a PDA. This PDA is guaranteed to halt (I
think).
- Of course, once we convert the PDA to a grammar, we could just use the
previous machine
Language Classes
- We have shown:
- Regular ⊆ context free ⊆ decidable ⊆ Turing-recognizable ⊆ All languages
- Let's identify an example language for each set that is not in the next
smaller set
Summary: Decidable Languages and Their Machines
- The following languages are decidable by the given machines:
- E_{DFA}: T
- E_{CFG}: R
- EQ_{DFA}: F
- A_{DFA}: M
- A_{NFA}: N
- A_{REX}: P
- A_{CFG}: S
- A_{PDA}: S
- All CF languages are decidable
- EQ_{CFG} is NOT decidable (proved in Chapter 5)
Summary: New Notation for Specifying Languages
- Example notation:
- E_{DFA} = {< A> | A is a DFA and L(A) = {} }
- A_{DFA} = {< B, w> | B is a DFA that accepts w}
- Action
- A = Accept
- E = Empty
- EQ = Equal
- Kind of Machine/Grammar
- DFA, NFA, REX
- CFG, PDA
- TM
- Input Strings:
- A is a DFA
- < B, w> contains an encoded DFA B and an input string for B
Next Up
- Prove that A_{TM} is RECOGNIZABLE
but NOT DECIDABLE
- Prove that A_{TM}' = A_{TM}^{C} is NOT RECOGNIZABLE
- Prove that if a language and its complement are recognizable, then the
language is decidable