On 5/10/05, Nuralanur / aol.com <Nuralanur / aol.com> wrote:
> now you have confused me almost totally.

Sorry about that. I have a tendency to do that. :) I'll think of
examples that in my mind are clear but introduce incidental
complexities that don't have anything to do with the problem at hand
and end up muddying the water.

> Actually, I was first responding to the subject because I had sowhat vaguely
> in mind that it might help to formulate the problem in as a shortest-path
> problem (define a cost function and try to minimise it. For example, in the
> song problem, pay me $1000000 each time the subsequent song  doesn't
> start with the letter the previous ended with. Otherwise, pay nothing. If you
> have to pay nothing after going through the whole path, you have found a
> correct solution and I am still poor.
> 
> For this sort of problem, there is a way to find the "single-source  shortest
> path solution" called Dijkstra's algorithm, which is nicely described here:
> 
> _http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/dijkstra.html_
> (http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/dijkstra.html)

Yes, I'm familiar with Dijkstra's algorithm and the connection to
incidence matrices. That's why in my first post I said "Though an
incidence [matrix] can work (and work quite well) for a simple
graph..." (mistype of 'graph' for 'matrix' in the original post
corrected). I'll reexplain the case for multiple edges per node pair
below and why Djikstra's algorithm (in its original form) is not
applicable.

> My confusion is about your use of multiple criteria/connections/... .

Ok, and this is where I muddied the water. My apologies on that.
Ignore the multiple criteria, and let's deal with an example that only
has one weight per edge.

Let N be the set of letters/characters with which a song may begin or
end, and E a set of (directed) edges representing songs, where each
edge connects the node representing the first character of the song to
the node representing the last character of the song. So, for
instance, the edge "Fingertips"[1] would connect the nodes [F] -> [S].
Now we can represent a graph G = { N, E }. Let also each edge be
weighted by the length of the song.

The goal is to determine the quickest (not necessarily shortest in
number of songs, but shortest in running time) song chain to get from
a starting letter to another letter B. This is a simple matter of
finding the cheapest path in our graph between the nodes [starting
letter] => [ending letter]. Djikstra's algorithm could be useful here,
except (as you say):

> In the original version of Dijsktra's algorithm, there is only one  connection
> from each edge to each other.

Elaboration: Djikstra's algorithm uses an incidence matrix in which
each entry in the matrix represents the weight of the edge between the
indices. Obviously then, the matrix can only represent one edge per
index pair, where the indices are nodes. So Djikstra's algorithm only
allows one edge per node pair.

But in our example multiple edges can connect the same pair of nodes.
For example, we connected [F] -> [S] by "Fingertips" above. But the
song "Fake Out In Buenos Aires"[2] also connects [F] -> [S]. So you
can't just expect one edge, and Djikstra's algorithm is no longer
available (in it's original form... there may be an adaptation for
this, but I've never seen one -- I would love to be wrong).

Ok, so we've seen an example where there are multiple edges per node
pair, and we've seen why Djikstra's algorithm wouldn't work in that
model. But Djikstra's algorithm is cool and /fast/! So why not try a
different representation of the model the *would* allow Djikstra's
algorithm? Instead of having songs as edges, lets try songs as nodes
(as you mentioned). What should we represent in edges? A logical
choice would be to have a directed edge from song A to song B iff song
B can follow song A according to our rules. So there'd be an edge from
"Fingertips" to "Sapphire Bullets of Pure Love"[3] (s to S), but no
edge from "Sapphire Bullets of Pure Love" to "I Palindrome I"[4] (e to
I). That builds a working graph, but there are problems:

(1) Where do we start? Where do we end?

Maybe we can use an extra node that represents the start state and add
edges to all songs starting with the desired start letter. Same for
the goal state -- we add a node and connect from all songs ending with
the desired letter to it. The problem is that we can't reuse the same
graph for multiple runs -- if we change out start/goal conditions, we
need to change the graph.

(2) How do we weight the edges?

For our example the weight appropriately belongs to the song, which is
now a node instead of an edge. We can weight all edges coming out of a
song node with the length of the song. This together with the solution
to (1) makes for a viable graph. But I contest that the shift of
weight from the appropriate entity (song) to multiple edges is a
duplication and obfuscation of the model.

(3) The revised model is much, much larger.

Assume a database of N songs beginning or ending with one of M
letters/characters. The actual distribution of connections between
songs is probably much more complex, but let's just assume each
character is equally likely. So N/M songs end with a given character
and N/M songs will start with the same given character. So we'll need
(N/M)^2 edges each for M characters, or a total of (N^2)/M edges. So
our graph has N nodes and (N^2)/M edges. Compare that to M nodes and N
edges (one edge per song). Add to that the fact that N is going to be
orders of magnitude greater than M and you'll notice the difference.

Simply, I just wanted to point out that there are situations in which
it makes sense to have a graph with multiple edges per node pair. In
those situations, the original version of Djikstra's algorithm using
incidence matrices is not applicable.

Jacob Fugal

[1] http://www.tmbw.net/wiki/index.php/Fingertips
[2] http://www.tmbw.net/wiki/index.php/Fake_Out_In_Buenos_Aires
[3] http://www.tmbw.net/wiki/index.php/Sapphire_Bullets_Of_Pure_Love
[4] http://www.tmbw.net/wiki/index.php/I_Palindrome_I