# ITEC 420 - Section 4.1 - Decidability Properties of Regular and Context Free Languages

## Overview

• In this section we will:
• See more examples of Turing Decidable languages

• Use encoded machines

• Learn about Turing Machines that operate on encoded machines
• Example 1: Encode a DFA M as a string and build a TM that reads this string and determines if M accepts the empty language
• Example 2: Encode DFA M and combine it with string w to produce an input string for a TM that executes M on w

• Study languages defined by machines that operate on such machines

• The latter two points are the most important

## A New Notation: Machine Encodings

• We want to encode a machine as a string!
• This allows us to define classes of machines as languages!

• Let <M> represent the encoding of machine M

• Let <M, w> represent the encoding of machine M and string w

• Encoding must contain encoding of:
1. List of states
2. Input and tape alphabets
3. Transition function
4. Initial and final states, etc

• We assume that the machine that uses the encoding can interpret it
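As an illustration, here is one possible concrete encoding of <M> for a DFA in Python. The tuple layout and state names are assumptions of this sketch, not a fixed standard; any unambiguous encoding of the five components works:

```python
# A hypothetical encoding of <M>: a 5-tuple mirroring the formal definition.
dfa = (
    {"q0", "q1"},            # 1. set of states
    {"a", "b"},              # 2. input alphabet
    {("q0", "a"): "q1",      # 3. transition function as a dict
     ("q0", "b"): "q0",
     ("q1", "a"): "q1",
     ("q1", "b"): "q0"},
    "q0",                    # 4. start state
    {"q1"},                  # 5. set of accept (final) states
)
```

A TM that "uses the encoding" corresponds to a program that unpacks this tuple and interprets its parts.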

## Learning Outcomes of Section 4.1

• Be able to formally and informally describe the contents of languages including the following:
• EDFA
• ECFG
• EQDFA
• ADFA, ANFA, AREX
• ACFG, APDA

• Prove that the following are Turing Decidable languages:
• the languages above
• every Context Free language A

• Anticipate the definition of ATM

## EDFA

• EDFA = {<A> | A is a DFA and L(A) = ∅ }

• TM T = On input <A>, where A is a DFA:
1. Mark the start state of A
2. Repeat until no new states get marked:
1. Mark any state that has a transition into it from any already marked state
3. If no accept state is marked, accept
1. else reject

• Examples:
• A machine for a*b+ (its language is nonempty, so T rejects)
• The same machine with an extra, unreachable state as its only final state (its language is empty, so T accepts)

• A Python simulator (and a prettified version)

• Notice that EDFA is a language:
• It's a set of strings
• Each string is an encoded DFA
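The slides mention a Python simulator; TM T's marking procedure can be sketched as follows, assuming <A> is encoded as a (states, alphabet, delta, start, accept) tuple as above (an assumption of this sketch):

```python
def edfa(dfa):
    """Decide EDFA: return True (accept) iff the DFA accepts no string."""
    states, alphabet, delta, start, accept = dfa
    marked = {start}                      # 1. mark the start state
    changed = True
    while changed:                        # 2. repeat until no new states get marked
        changed = False
        for (q, a), r in delta.items():
            if q in marked and r not in marked:
                marked.add(r)             # r has a transition from a marked state
                changed = True
    return not (marked & accept)          # 3. accept iff no accept state is marked
```

Running it on a machine for a*b+ returns False (language nonempty); on the same machine with only an unreachable final state it returns True.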

## ECFG

• ECFG = {<G> | G is a CFG and L(G) = ∅ }

• TM R = On input <G>, where G is a CFG:
1. Mark all terminal symbols in G
2. Repeat until no new variables get marked:
• Mark all occurrences of any variable that is the LHS of a rule whose entire RHS is marked
• Example: if A → PQR is a rule and P, Q, and R are all marked, then mark all occurrences of A in G
3. If the start symbol is not marked, then accept
• else reject
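TM R's marking procedure can be sketched in Python, assuming <G> is encoded as a (variables, terminals, rules, start) tuple with each rule a (LHS, RHS-tuple) pair (this encoding is an assumption of the sketch):

```python
def ecfg(cfg):
    """Decide ECFG: return True (accept) iff L(G) is empty."""
    variables, terminals, rules, start = cfg
    marked = set(terminals)          # 1. mark all terminal symbols
    changed = True
    while changed:                   # 2. repeat until no new variables get marked
        changed = False
        for lhs, rhs in rules:
            # mark the LHS once every symbol of the RHS is marked
            if lhs not in marked and all(s in marked for s in rhs):
                marked.add(lhs)
                changed = True
    return start not in marked       # 3. accept iff the start symbol is unmarked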

## EQDFA

• EQDFA = {<A, B> | A and B are DFAs and L(A) = L(B)}

• TM F = On input <A, B>, where A and B are DFAs:
1. Construct a DFA C such that L(C) = (L(A) ∩ L(B)') ∪ (L(A)' ∩ L(B))
2. Run TM T on input <C> to see if C is empty
3. If T accepts, accept
1. else reject

• L(C) is the symmetric difference of L(A) and L(B)
• L(C) = ∅ iff L(A) = L(B)

• Regular languages are closed under ∩, ∪, and complement
• We know how to combine DFAs to get a new DFA that accepts the resulting language
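TM F can be sketched in Python, assuming the same (states, alphabet, delta, start, accept) tuple encoding. Rather than materializing C, this sketch explores C's reachable product states on the fly, which fuses steps 1 and 2 (a design shortcut of the sketch, not of the slides):

```python
def eqdfa(A, B):
    """Decide EQDFA: True iff L(A) = L(B), via the symmetric-difference DFA C."""
    sa, alpha, da, qa, fa = A
    sb, _,     db, qb, fb = B
    start = (qa, qb)                      # product state of C
    marked = {start}
    frontier = [start]
    while frontier:                       # reachability search, as in TM T
        p, q = frontier.pop()
        if (p in fa) != (q in fb):        # accept state of C: exactly one accepts
            return False                  # a difference is reachable: L(A) != L(B)
        for a in alpha:
            nxt = (da[(p, a)], db[(q, a)])
            if nxt not in marked:
                marked.add(nxt)
                frontier.append(nxt)
    return True                           # no reachable difference: L(A) = L(B)
```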

## A New Kind of Machine and Input String: ADFA

• In EDFA and ECFG, the input string was an encoded DFA or grammar

• In EQDFA, the input string was 2 encoded DFAs

• In ADFA, the input string for the TM has 2 parts:
• An encoded DFA B
• An input string w for the DFA

• A TM that decides ADFA simulates the computation of B on w

• ADFA = {<B, w> | B is a DFA that accepts string w}
• Example: If L(B) = {a}, then what strings are in ADFA?
• How many strings in ADFA?
• Let's build a TM to decide language ADFA

• TM M = On input <B, w>, where B is a DFA and w is a string:
1. Simulate B on w
2. If the simulation ends in an accept state of B, then accept <B,w>
• else the simulation ends in a non-accept state of B, so reject <B,w>

• M first checks that B is a valid machine
• M simulates B by keeping track of the current input position and current state [easy, right?]
• Simulation ends when M reaches the end of w
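The simulation is indeed easy in Python, again assuming the tuple encoding of <B> used in the earlier sketches:

```python
def adfa(dfa, w):
    """Decide ADFA: simulate DFA B on string w."""
    states, alphabet, delta, state, accept = dfa
    for symbol in w:                 # track current position and current state
        state = delta[(state, symbol)]
    return state in accept           # accept iff the run ends in an accept state
```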

## ANFA

• ANFA = {<B, w> | B is an NFA that accepts string w}

• TM N = On input <B, w>, where B is an NFA and w is a string:
1. Convert NFA B to an equivalent DFA C using procedure from Th 1.39
2. Run TM M on < C, w>
3. If M accepts, then accept
• else reject

• We use M as a subroutine!
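A sketch of TM N in Python, assuming the NFA's delta maps (state, symbol) to a set of states, with "" as the symbol for ε-moves (an encoding assumption). Instead of building the DFA C up front, the sketch runs the subset construction of Th 1.39 on the fly: the set `current` is exactly the DFA state that M would be tracking:

```python
def anfa(nfa, w):
    """Decide ANFA: simulate NFA B on w via the subset construction."""
    states, alphabet, delta, start, accept = nfa

    def eclose(S):
        """All states reachable from S by epsilon-moves."""
        stack, closed = list(S), set(S)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), set()):
                if r not in closed:
                    closed.add(r)
                    stack.append(r)
        return closed

    current = eclose({start})
    for a in w:
        current = eclose({r for q in current for r in delta.get((q, a), set())})
    return bool(current & accept)    # accept iff some branch is in an accept state
```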

## AREX

• AREX = {<R, w> | R is a regular expression that generates string w}

• TM P = On input <R, w>, where R is a regular expression and w is a string:
1. Convert Regular Expression R to an equivalent DFA A using procedure from Th. 1.54
2. Run TM M on < A, w>
3. If M accepts, then accept
• else reject

## ACFG

• ACFG = {<G, w> | G is a CFG that generates string w }

• TM S = On input <G, w>, where G is a CFG and w is a string:
1. Convert G to an equivalent grammar in Chomsky Normal Form (CNF)
2. List all derivations with 2n-1 steps, where |w| = n (if n = 0, list all derivations with 1 step instead)
3. If any of these derivations generate w, then accept
1. else reject

• When a grammar is in CNF, every step in a derivation does which of two things?
• When a grammar is in CNF, how many steps are in a derivation of a string of length n?
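Steps 2-3 of TM S can be sketched in Python for a grammar already in CNF (step 1 is skipped in this sketch). The rules-as-dict encoding, mapping each variable to a list of RHS tuples, is an assumption; the search lists all leftmost derivations of at most 2n-1 steps, pruning sentential forms longer than n (safe because CNF rules never shrink a form):

```python
def acfg(rules, start, w):
    """Decide ACFG for a CNF grammar: True iff the grammar generates w."""
    n = len(w)
    if n == 0:
        return () in rules.get(start, [])   # n = 0: only a rule S -> epsilon works
    frontier = {(start,)}                   # sentential forms seen so far
    for _ in range(2 * n - 1):              # 2n-1 derivation steps suffice in CNF
        nxt = set()
        for form in frontier:
            # find the leftmost variable in this sentential form
            i = next((j for j, s in enumerate(form) if s in rules), None)
            if i is None:
                continue                    # all terminals: nothing left to derive
            for rhs in rules[form[i]]:
                new = form[:i] + rhs + form[i + 1:]
                if len(new) <= n:           # CNF forms never shrink: prune long ones
                    nxt.add(new)
        frontier |= nxt
    return tuple(w) in frontier             # accept iff some derivation yields w
```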

## APDA

• Results that hold for CFGs also hold for PDAs
• Why? Every PDA can be converted to an equivalent CFG (and vice versa)

## EQCFG ???

• We will see in Chapter 5 that EQCFG is NOT decidable

## Every context free language is decidable

• To prove: Every context free language A is decidable
• That is: Given a Context Free Language A, we can build a TM that decides A

• Proof:
1. Let A be a Context Free Language

2. Since A is CF, there is a grammar that generates A. Call this grammar G.

3. Use G to build TM MG.
• Machine MG is specialized for grammar G: it has a tape that holds an encoding of the grammar G. Machine MG first combines the encoding of G with its input string w to create the string <G, w>, which is then used as input for TM S.

• Machine MG = On input w:
1. Build string <G,w>
2. Run TM S on <G,w>
3. If S accepts <G,w>, then MG accepts w.
• Else S rejects <G,w>, so MG rejects w.

• If w is in A, then MG accepts w, otherwise MG rejects w

4. We have constructed a machine (ie MG) that decides A, and so A is decidable

5. Question: How do we know that we can find G?
• Either it's given or we have a PDA which can be converted to G
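The essential move in step 3 is that MG hard-codes the encoding of G; in programming terms, MG is a partial application of S to <G>. A toy sketch of that construction (here run_S is a stub standing in for TM S, and the set-of-strings "encoding" of G is purely illustrative, not a real grammar encoding):

```python
from functools import partial

def run_S(grammar_enc, w):
    # Stub for TM S: in this toy, the "encoded grammar" is just the finite
    # set of strings it generates. A real S would do the CNF derivation search.
    return w in grammar_enc

def make_M_G(grammar_enc):
    # Build machine M_G: it hard-codes <G> and, on input w,
    # forms <G, w> and runs S on it.
    return partial(run_S, grammar_enc)

G_enc = {"ab", "aabb"}       # hypothetical stand-in for an encoded CFG
M_G = make_M_G(G_enc)
```

Because M_G halts on every input (S is a decider), M_G decides A.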

## Every context free language is decidable - A Wrong Approach

• To prove: Every context free language A is decidable

• Incorrect Proof outline:
1. Since A is CF, there is a PDA that recognizes A. Call this PDA P.
2. Build a TM MP that contains an encoding of P
3. On input w, MP simulates P on w
4. If P accepts, then MP will accept
5. If P rejects, then MP will reject

• What's wrong with this approach:
• if the NPDA has a non-halting branch then the simulation would not halt in the TM
• Can a NPDA have non-halting branches? Yes: it accepts if there is a branch that is in an accept state at the end of the input and rejects if there is no such branch.
• What's wrong with not halting, I thought we ignored the non-halting branches? For a recognizer you can ignore them; for a decider, every branch must halt.

• Possible approach?:
• Since A is CF, it has a PDA that recognizes it. Convert this PDA to a grammar, then convert the grammar to a PDA. This PDA is guaranteed to halt (I think).
• Of course, once we convert the PDA to a grammar, we could just use the previous machine

## Language Classes

• We have shown:
• Regular ⊆ context free ⊆ decidable ⊆ Turing-recognizable ⊆ All languages

• Let's identify an example language for each set that is not in the next smaller set

## Summary: Decidable Languages and Their Machines

• The following languages are decidable by the given machines:
• EDFA: T
• ECFG: R
• EQDFA: F

• ANFA: N
• AREX: P

• ACFG: S
• APDA: S

• All CF languages are decidable

• EQCFG is NOT decidable (proved in Chapter 5)

## Summary: New Notation for Specifying Languages

• Example notation:
• EDFA = {<A> | A is a DFA and L(A) = ∅ }
• ADFA = {<B, w> | B is a DFA that accepts w}

• Action
• A = Accept
• E = Empty
• EQ = Equal

• Kind of Machine/Grammar
• DFA, NFA, REX
• CFG, PDA
• TM

• Input Strings:
• <A> contains an encoded DFA A
• <B, w> contains an encoded DFA B and an input string for B

## Next Up

• Prove that ATM is RECOGNIZABLE but NOT DECIDABLE

• Prove that the complement of ATM (written ATM' or ATMc) is NOT RECOGNIZABLE

• Prove that if a language and its complement are recognizable, then the language is decidable