# Overview of NP-Completeness

## Outcomes

• Define the following language classes: P, NP, NP-Complete, NP-Hard

• Explain the significance of the language classes listed above.

## Complexity Theory - Basic Question

• Answer this Basic Question: What are the practical limitations of computers?
• What can we say about problems for which the best known algorithms are not practical because they take too long to execute? Particularly, do better algorithms exist?

• Initial Approach: Classify Problems

## Easy and Hard Problems

• Easy = best known solution is efficient
• Hard = best known solution is NOT efficient

• Efficient algorithm: a reasonably large problem can be solved in a reasonable amount of time

## Polynomial and Exponential Time

• Polynomial time: the time can be expressed as a polynomial of n, where n is the length of the input.
• Example: T(n) = O(n2) for bubble sort

• Contrast with Exponential time

• Exponential time: the time can be expressed as cn for some c
• Example: T(n) = O(kn) to calculate Fibonacci number ns

• Remember: we ignore lower order terms of time O(f(n))

## Some Example Problems

• Example 1: An Easy problem:
• Problem PATH = find a path between two graph nodes
• Performance: Polynomial in worst case

• Example hard problems:
• Example 2: Problem HAMPATH = Hamiltonian Path: find a path between two graph nodes that visits each node exactly once
• Example 3: TSP = Traveling Salesman Problem: find shortest path through all nodes of a graph
• Performance: Both are exponential in worst case

• A classic way of categorizing easy and hard problems is the language classes P and NP

## P and NP

• Class P = Problems whose solutions can be found in polynomial time

• Class NP = Problems whose solutions can be checked in polynomial time
• What does NP stand for? We'll see later.
• We will also see a different definition

• Is there a connection between exponential time and NP? Yes, but we see it later

## Classifying our Examples as P or NP

• Classifying our example problems:

1. Path through graph - a solution is easily found and easily checked, so the problem is ...
• the easy algorithm is to mark visited nodes until the end point is reached or until all nodes visited

2. Hamiltonian Path - a solution is easily checked but not easily found, so this problem is ...

3. TSP - solution hard to find and hard to check, so it is ...

## The Big Question

• Since easily found implies easily checked, it is clear that P ⊆ NP

• But, is P ⊂ NP, or is P = NP? No one knows!

• \$1,000,000.00 reward for an answer!

• Analysis focuses on a class of harder problems in NP: the NP-Complete problems

## NP-Complete Problems

• NP-Complete problems: Problem B is NP-complete iff

1. B ∈ NP and

2. Every problem in NP is reducible (in polynomial time) to B
• B is as hard or harder than every problem in NP

• An NP-complete problem is as hard or harder than all problems in NP.

• If P ≠ NP (ie P ⊂ NP) then P ∩ NP-complete = {}

## Reduction

• If problem A is reducible to B, then

• there is a solution to A that uses the solution to B

• B is the same hardness or harder than A

## Consequences of NP-Complete Problems

• A solution to any NP-Complete Problem provides a solution to ALL NP problems

## The First NP-Complete Problem

• Cook's Theorem: The problem SAT is NP-Complete

• SAT = {φ | φ is a logical expression with Boolean variables and there is an assignment of the variables that makes φ true}

• Later we will talk about how he proved this.

## Showing That Other Problems are NP-Complete

• The simple way to show that problem B is NP-complete by reducing a known NP-complete problem A to B
• Thus a solution to B ...

• What does a diagram look like?
• Cook showed that SAT is NP-complete (ie SAT ∈ NP and every NP problem reduces to SAT)

• He and others showed that SAT (or some other NP-complete problem) is reducible to many other problems (eg Hamiltonian Circuit)

## Performance of NP Problems

• NP problems are know to be solvable in 2n time

• NP means Non-Deterministic Polynomial time

• An NP problem can be solved in polynomial time using a nondeterministic TM
• But the nondeterministic machine must always choose the correct branch to solve the problem

• An NP problem can also be solved by a nondeterministic machine by repeatedly guessing a solution and then checking it.
• Simulating this machine with a deterministic machine would take exponential time (ie 2n time)

## Consequences of the class NP-Complete

• If you find a polynomial solution to any NP complete problem, you've solved all of NP!

• If you want to show that the P ⊂ NP, then you should concentrate on the hardest NP problems: the NP-Complete problems

• If you can prove that a problem is NP-Complete, then you can safely stop looking for a polynomial time solution (unless you're trying to become rich and famous!)

• Most researchers think that P ≠ NP

## Some NP-Complete Problems

• SAT: is there an assignment of values to boolean variables to make an expression true

• SAT: is there an assignment of values to boolean variables to make a 3CNF expression true

• Clique: Does a graph have a subgraph (of a specified size) in which every two nodes are connected by an edge

• Hamiltonian Circuit: Is that a path from p to q that touches every vertex

• Vertex Cover: Does a graph have a set of vertices that touch every node

• Subset Sum: Does a subset of a set of integers sum to some target number t

• Hundreds of others

## Proving Cook's Theorem (Th 7.37)

• How could you possible prove that every NP problem is reducible to SAT?

• Remember that a problem in NP can be solved by a nondeterministic TM

• For a problem A in NP with input w, we produce a formula Boolean φ that simulates the operation of the TM for A on w.
• The formula φ is satisfiable iff the TM accepts.
• Thus, w ∈ A iff φ is satisfiable.

• How do we simulate a TM with a Boolean formula?
• Assume that the TM for A is N
• When N processes w, each nondeterministic branch is a sequence of configurations
• This sequence of configurations can be listed in a finite list.
• This list is turned into a boolean formula
• The formula can be satisfied iff the list represents an accepting computation.
• See the book for the exciting details!

## Connection of Problems and Language Classes

• P and NP are language classes, but we have only discussed problems. What's the connection?

• A problem is expressed as a language.

• Example: PATH = {<G, s, t> | G is a directed graph that has a directed path from s to t}

• Example: RELPRIME = {<x,y> | x and y are relatively prime}

## NP-Hard Problems and the TSP

• A famous hard problem is the traveling salesman problem (TSP)
• The TSP is to find shortest route that visits a set of cities.

• TSP is as hard as the NP problems, but TSP is not in NP since a solution to TSP can't be checked in polynomial time.
• A version of the problem that has easy to check solutions is the TS decision problem: given a graph and a distance d, is there a route whose length is less than d
• The TS decision problem is NP-complete.

• The TS problem is said to be NP-hard.

• Problem B is NP hard if B is NOT in NP, but every problem in NP reduces to B.

## What to Do About Hard Problems?

• Hueristics: usually give a good solution
• Approximate solutions: get close to the solution

ITEC 420 Course Page