On Nov 22, 3:34 am, Raul Parolari <raulparol... / gmail.com> wrote:
> RichardOnRails wrote:
> > Hi Raul,
>
> > I like your "battle plan"..
> > I especially appreciate your showing me how a regex can be written to
> > handle an arbitrary number of dot-separated numbers (rather than hard-
> > code distinct sub-expressions).
>
> >>   if  line =~ /^ (.*?) [a-zA-Z] /x
>
> > I thought I could simply remove the question-mark.
> > So, your question mark is clearly working,  but HOW?
>
> Richard
>
> I saw that Gavin has given you (in another thread) a general tutorial on
> this. I add a simpler explanation just in the context of the problem we
> treated;
>
>   .*  means 'as many characters as possible'
>
> Now, point 1 of the 'battle plan' was (I quote):
> "1) we first collect everything until the first letter (not included);
> we will consider this the Prefix."
>
> So we want to tell the Regexp Engine: "as few characters as possible
> until you see a letter (a-zA-Z), then stop right there!".
>
> Let's examine the 2 expressions, with and without the question mark:
>
>            (.*?)                      [a-zA-Z]
>  minimal nr of chars needed until ..  1st letter
>
>            (.*)                       [a-zA-Z]
>  as many chars as you can possibly
>  get away with,  and then ..          a letter
>
> An example:
>
> s="2.1Topic 2.1"
>
> md = s.match( /^ (.*?) [a-zA-Z] /x )
> md[1]  # => "2.1"
>
> md = s.match( /^ (.*) [a-zA-Z] /x )
> md[1]  # => "2.1Topi"
>
> Do you see? Both expressions were satisfied, but in different ways:
> a) the first (with .*?) tried to find the minimal number of characters
> until the first letter, and so it stopped when it found the 'T' of
> Topic.
>
> b) the second expression tried to find as many characters as possible,
> only bounded by having to then find a letter, so it stopped at the 'c'
> of Topic.
>
> With a sense of humour, somebody observed that ".*? values contentment
> over greed"; since then, quantifiers like ".*?" have been called
> "non-greedy" (or "lazy"), while those like ".*" are called "greedy".
>
> [I stop here, as Gavin described '.+' & co. to you].
>
> One piece of advice: the key to learning regular expressions is to read a good
> book (just trying them drives one insane) while experimenting (just
> reading drives one insane too). The time spent pays you back very
> quickly at the first serious exercise (as you can develop a
> 'battle-plan' rather than a 'guerrilla war' with the regexps).
>
> I am glad that you found the script useful, and I hope that this helped
> too
>
> Raul
> --
> Posted via http://www.ruby-forum.com/.

Hi Raul,

Thank you very much for your expanded analysis.

> I saw that Gavin has given you (in another thread) ...

I started a new thread on the "(.*?)" because this thread was getting
too long.  And Gavin turning me on to "greedy" was a big boost.  That
let me find some relevant stuff in "Mastering Regular Expressions, 2nd
ed."

> An example: ...

Your example is great.  I went back to Hal Fulton's "The Ruby Way, 2nd
ed." and http://www.ruby-doc.org/core/classes/Regexp.html for
additional Regexp#match documentation.
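
For anyone following along, the example can be packaged as a small
script.  This is only a sketch: the helper name prefix_before_letter is
made up for illustration and is not part of Raul's script.

```ruby
# Hypothetical helper (not from the original script): returns the text
# captured before a letter, using either the lazy or the greedy quantifier.
def prefix_before_letter(line, greedy: false)
  pattern = greedy ? /^ (.*) [a-zA-Z] /x : /^ (.*?) [a-zA-Z] /x
  md = line.match(pattern)
  md && md[1]   # nil when the line contains no letter at all
end

s = "2.1Topic 2.1"
lazy_prefix   = prefix_before_letter(s)                # stops before the 'T'
greedy_prefix = prefix_before_letter(s, greedy: true)  # stops before the last letter, 'c'
puts "lazy:   #{lazy_prefix.inspect}"
puts "greedy: #{greedy_prefix.inspect}"
```

Both patterns match the same string; only the size of the captured
group differs.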

Notwithstanding your exposition and the documentation cited,  my
reptilian brain refuses acceptance on this issue.  But by running the
examples given and some of my own construction,  I should get over
this hump.  (I wrote my own NFSA in C for a client's application
roughly 30 years ago, so I should be equal to the task.)
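
One experiment of that kind is to mimic, by hand, the order in which
candidate prefixes get tried.  The sketch below is a simplified model
for this one pattern only, assuming a single line with no newlines; it
is not how Ruby's regexp engine is actually implemented, and the name
simulated_prefix is hypothetical.

```ruby
# Simplified model of how /^(.*?)[a-zA-Z]/ and /^(.*)[a-zA-Z]/ choose the
# captured prefix. It only mimics the ORDER in which prefix lengths are tried.
def simulated_prefix(line, greedy:)
  lengths = (0...line.length).to_a   # candidate lengths for the group
  lengths.reverse! if greedy         # greedy: try the longest candidate first
  lengths.each do |len|
    # The character right after the prefix must be a letter.
    return line[0, len] if line[len].match?(/[a-zA-Z]/)
  end
  nil                                # no letter anywhere: the match fails
end

s = "2.1Topic 2.1"
# Compare the model against the real engine; both lines should print true.
puts simulated_prefix(s, greedy: false) == s.match(/^(.*?)[a-zA-Z]/)[1]
puts simulated_prefix(s, greedy: true)  == s.match(/^(.*)[a-zA-Z]/)[1]
```

The lazy form grows the prefix one character at a time and stops at the
first success; the greedy form starts with everything and gives
characters back until a letter follows.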

I'm not going to expose my ignorance with any further questions on this
matter.  I'll do my homework :-)

With thanks and best wishes,
Richard