RU beehive logo ITEC dept promo banner
ITEC 122
2007fall
ibarland

homeinfolectureshwsexams

lect15
paths; orderings

Representing a graph: an adjacency matrix. (Example.)
More OO: each node has a list of its adjacent edges. (Note that we'd likely have class Edge.)

DFS:

boolean getFromTo( Vertex a, Vertex z ) {
  Container pending = new Stack();
  Set seen = new Set();
  pending.put( a );

  while (!pending.isEmpty()) {
    Vertex curr = pending.removeNext();
    if (curr==z) {
      return true;
      }
    else if !seen.contains(nhbr) {
      for ( Vertex nhbr : curr.neighbors ) {
        pending.add(nhbr);
        }
      }
    }
    return false;
  }
However, this algorithm doesn't compute the (shortest) distance from a to z.
Running time?
Note that we can do DFS w/o an explicity Stack object; the run-time stack suffices.

BFS: the same as DFS, but use a queue instead of a Stack. In fact, if we associate a number alongside each vertex in `pending`, we can compute the shortest distance.

int distFromToBFS( Vertex a, Vertex z ) {
  Container pending = new Queue();
  Set seen = new Set();
  pending.put( [a,0] );  // That is, put a, associated w/ distance 0.

  while (!pending.isEmpty() && !seen.contains(z)) {
    [Vertex,int] [curr,d] = pending.next();
    if (curr==z) {
      return d;
      }
    else if (!seen.contains(curr)) {
      for ( Vertex nhbr : curr.neighbors ) {
        pending.add([nhbr,(d+1)]);
        }
      }
    }
    return false;
  }

Q: is the following DFS, or BFS? The javadoc algorithm for {@inheritDoc}

What if there are weights on the edges, dist(u,v)? In that case, BFS generalizes to Dijkstra's algorithm, as long as no negative edges. Use a priority queue.

double[] dijkstra( Vertex a ) {
  Container pending = new PriorityQueue();
  Set seen = new Set();

  double[] dist;  // dist[i] is the known distance from a to vertex #i.
  for (Vertex v : V) {
    pending.put( v );
    dist[v] = Double.POSITIVE_INFINITY;
    }
  dist[a] = 0;

  while (!pending.isEmpty()) {
    Vertex curr = pending.remove(the vertex with lowest dist[]);
    seen.add(curr); // Distance to curr has been finalized.
    for ( Vertex nhbr : curr.neighbors ) {
      if !seen.contains(nhbr) {
        dist[nhbr] = Math.min( dist[nhbr], dist[curr] + e[curr,nhbr] ) 
        }
      }
    }
    return dist;
  }
Is this correct? Proof: Each time through the loop, the following is true: If seen.contains(i), then dist[i] is the actual minimum difference. Proof by contradiction; assume this isn't true; let j be the first node such that seen.contains(j), and yet dist[j] (at that moment) is not the true minimum. Take some (truly) shortest path from a to j; let i be the frist vertex on that path such that !seen.contains(i).


Running time?
n puts, m updates, and n gets (where n=|V|, and m=|E|).
What is time required for a put, using a sorted-list implementation? O(z), where z items currently in the queue. Time required for a get? O(1).
Total: O(mn+n) ⊆O(n³) (although this is pessimistic for sparse graphs)
What about time if we use an unsorted-list implementation? O(m+n&nsup2;) ⊆O(n²)
A fancy data structure -- a heap -- can do a priority queue ops in time log(z), yielding time O(m log(n) + n log(n)) (which is good for sparse graphs).
(Might a fancier data structure do even better, for a priority queue?)


Adjacency matrix.
Paths of length 1; 2; 3; 4; 5.
Multiplying the adjacency matrix.

/** return d*E, where d and E are both *square* matrices. */
double[][] oneStepMore( double[][] d, double[][] e ) {
  int size = d.length;

  for (int r=0; r < size; ++r)  {
    for (int c=0; c < size; ++c)  {

      // Calculate location d[r,c]
      for (int k=0;  k < size;  ++k ) {
        // Compare r-->k->c vs our current notion of r-->c :
        d[r,c] = Math.min( d[r,c],  d[r,k]+e[k,c] );
        }

      }
    }
  }
Running time?
Note that Floyd-Warshall does even better, with a simple trick: maintain "distance from x,y only using intermediate nodes numbered <i", for i=0..n. This gives O(n^3)!


Topological sort

(not covered)

When ≤ isn't an ordering: Example: teams rated on Offense, Defense, Stamina.

Sets w/ no natural order: Eg, you can some things are bigger than others (we can say that Gwen Stefani ≥ Backstreet Boys, but we can't compare everything: Gwen Stefani !≥ Leonard Bernstein, and Leonard Bernstein !≥ Stefani. However, we do know that Stefani ≥ Backstreet Boys and Backstreet Boys ≥ N*Sync is enough to conlude Stefani ≥ N*Sync. That is, "better music than" is not a complete relation, but it is transitive.

However, sometimes you have a relation that seems like it should be transitive, but it's not! For example: suppose Ultimate Frisbee teams really have three underlying ratings: Offense, Defense, and Stamina. And suppose that when two teams play, whoever has a better rating in two out of the three categories is the winner

It seems like the relation "beats(X,Y)" should always be a nice, complete ordered relation (especially since we've eliminated all randomness and psychology present in real-world sports), but it's not!: Say

          team A has offense=3, defense=2, stamina=1.
          team B has offense=2, defense=1, stamina=3.
          team C has offense=1, defense=3, stamina=2.
We can see that A > B, and B > C, yet C > A.

The upshot: be careful about trying to project multidimensional data down to a single scalar; you'll induce an ordering which may not reflect the original multi-dimensional relation. (And the original relation might not qualify as an ordering -- not even a partial ordering -- even though it sure seems like it should be one.)

Alternate upshot, to console Rockets fans: It's conceivable that every team in the league is equally good, and it's merely the order of the matches which is determining the outcome.


Snippet from Richard Tapia's University Professor talk, 2005.Nov.29: ***"Rankings are evil."***

Humans are many-dimensional; you can't meaningfully reduce them to a single number (or even, just their academic talent/creativity/ability). To try to say that every student with a 1425 SAT should be accepted before students with 1400 SAT is ludicrous — and, ludicrous in a way which is socially damaging.

Instead, set a threshhold or bin (say, 1350-1450), and then view everybody within that bin as potentially equal-scoring on the test. (Similarly, w/ faculty hiring: schools which hire only the candidate from the top-ranked grad schools are missing out on a lot of diversity and talent from students coming from schools ranked #6. So set a threshhold of competency, then choose from that entire pool.)

homeinfolectureshwsexams


©2007, Ian Barland, Radford University
Last modified 2007.Dec.04 (Tue)
Please mail any suggestions
(incl. typos, broken links)
to iba�rlandrad�ford.edu
Powered by PLT Scheme