------art_11883_13223355.1200991893721
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hi Alex,

Have the same problem with you :)

I think we should understand the 'adjacent' as 'not overlapped adjacent'
here. Each time we take two adjacent suffix, we check whether they
overlapped first - if so, make s2 point to next suffix, until we found a not
overlapped one.

For aaaa: on the step of comparing aa and aaa, we found they overlapped, so
we keep s1  a, but make s2  aaa, the one next to aaa.

Thanks
Jan

On Jan 22, 2008 4:10 PM, Alex Shulgin <alex.shulgin / gmail.com> wrote:

> On Jan 20, 7:04 pm, Dave Thomas <d... / pragprog.com> wrote:
> > My hack at the substring problem (nice one, James) is based on suffix
> > trees.
> >
> > Suffix trees and suffix arrays and a great tool for substring
> > operations. The idea is to create a list of all the possible suffixes
> > in the original string. For "banana" this would be
> >
> > banana
> > anana
> > nana
> > ana
> > na
> > a
> >
> > Because the list contains an entry starting at every position in the
> > string, we know that all possible substrings of the original string
> > must occur at the start of one of these list elements. The substring
> > "nan", for example, occurs at the start of the 3rd entry.
> >
> > Now sort them
> >
> > a
> > ana
> > anana
> > banana
> > na
> > nana
> >
> > Now we know that all substrings that start with the same sequence of
> > characters must occur at the start of items in the list that are
> > adjacent. Searching through to list to find these is a linear operation.
>
> Hi Dave,
>
> Thanks for the brilliant idea--this time I really feel like I've
> learned something new! :)
>
> I even took my time to implement it in C (80 lines) and I must admit
> that ruby performs very well in this case: my C version is only about
> 6 times faster than your ruby...
>
> However, with the algorithm presented, I don't see how one can obtain
> the correct result for the "aaaa" case.  Lets see; the suffix list
> would be:
>
> aaaa
> aaa
> aa
> a
>
> now, sorted:
>
> a
> aa
> aaa
> aaaa
>
> And the catch is--in order to obtain the correct result, that is "aa",
> one needs to observe "aa" and "aaaa", but these are not adjacent.  In
> this case two pairs of adjacent suffixes ("aa" and "aaa", "aaa" and
> "aaaa") might give "aa", but they do overlap.
>
> So do we need a special-case treatment here?
>
> --
> Cheers,
> Alex
>
>


-- 
Fly on wheels.

------art_11883_13223355.1200991893721--