Hello together,

I got a problem grouping rows in a Database by similarity. What I try to 
reach is the following:
I have a table looking like (just an example):

ID   Companyname       Group
1    Mickeysoft Ltd.   NULL
2    Mickysoft LTD     NULL
3    Mickeysft Limited NULL
4    Aple Inc          NULL
5    APPLE INC         NULL

and so on, you get the point. Group should be 1 for the IDs 1 to 3 and 2 
for the IDs 4 and 5.

At the moment I compare two strings by making them lowercase, deleting 
dots etc., deleting the substrings 'Inc', 'Ltd' etc. and then building 
the Levenshtein-Distance of the metaphone-key of the two strings.
Works not really good and is damn slow, but it's okay and best I could 
figure out. (Nevertheless your hints on that are welcome too.)

My problem is, that I don't know how to apply my compare-method in an 
efficient way. What I'm doing now is selecting the first row where Group 
is NULL and then selecting each row (where Group is NULL) in my database 
in a loop again, comparing with the first selected string and setting 
Group to a certain number if comapare method says they match.

That lasts a lang time and - worse - my code looks very ugly, even to a 
very beginner as I am.

So if somebody of you has a good idea how to improve that, some docs on 
how to implement such a thing or even a totally different approach to 
get the results I want, I would be very glad.

Jan