"Paul Prescod" <paulp / ActiveState.com> wrote in message
news:4di77.17718$YK4.1545491 / e420r-atl1.usenetserver.com...
> ...
> using regular expressions for these things? Why not use an HTML parser
> module,  XML parser module, FTP module, SMTP module and so forth. I'm
> sure you have a good reason but I just want to understand it.

Well, the XML module or FTP module (I guess) uses regular expressions
internally, or it should if it wants to make life easy, and that's exactly
the kind of code I'm working with.  If I'm not writing the module itself,
I'm modifying an existing module to do what I need.  Often I need
functionality that the standard module doesn't provide.  FTP modules have
in the past tended to lack features, although I can't speak for any Ruby FTP
modules as I haven't used them.  So many "Internet" protocols are text-based,
and the next xyz protocol that hasn't been invented yet probably will be
too.  Writing the module for such a protocol is easier with the power of
regular expression text processing, IMHO.
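
As a rough sketch of what I mean, here is how a regex can pick apart a
reply line in a text-based protocol.  It assumes FTP-style replies (a
three-digit code, then a space or a "-", then free text).  The names
parse_reply and REPLY_PATTERN are made up for illustration and don't come
from any real FTP module.

```ruby
# Hypothetical helper, not part of any existing FTP library.
# Assumes replies look like "220 Service ready" or "220-Welcome".
REPLY_PATTERN = /\A(\d{3})([ -])(.*)\z/

def parse_reply(line)
  # chomp strips a trailing "\r\n" or "\n" before matching
  match = REPLY_PATTERN.match(line.chomp)
  return nil unless match

  {
    code: match[1].to_i,            # numeric status code
    more_follows: match[2] == "-",  # "-" marks a multi-line reply
    text: match[3]                  # human-readable part
  }
end

reply = parse_reply("220 Service ready\r\n")
puts reply[:code] if reply
```

Anything that doesn't fit the pattern comes back as nil, so the caller
decides what to do with junk lines.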

The biggest area where I use text processing is web sites.  Dynamic sites
like web forums require tons of processing: moving database material
around, reading/writing text files and/or e-mails, etc.  Moving and
processing text-based database material for reports is so easy in Perl (PHP)
or Ruby, mostly because of the regex integration (for me anyway).  Sometimes I
need scripts that will process raw HTML and suck out the data I need for
database storage or conversion to another format.
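
A minimal sketch of that kind of HTML scraping, for the curious.  The
sample markup, LINK_PATTERN and extract_links are all invented for this
example, and a pattern like this only copes with simple, regular markup.

```ruby
# Sample markup is made up for illustration.
html = <<~HTML
  <ul>
    <li><a href="/forum/1">General</a></li>
    <li><a href="/forum/2">Ruby talk</a></li>
  </ul>
HTML

# Captures the href value and the link text of each anchor tag.
LINK_PATTERN = /<a\s+href="([^"]*)"[^>]*>(.*?)<\/a>/im

def extract_links(html)
  # scan returns one [href, text] pair per match
  html.scan(LINK_PATTERN).map do |href, text|
    { href: href, text: text.strip }
  end
end

extract_links(html).each do |link|
  puts "#{link[:text]} -> #{link[:href]}"
end
```

The resulting hashes are ready to be written to a database row or dumped
in another format.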

> Do you think that it is possible that we are moving into a
> post-text-munging age. XML and the new XML-based protocols are supposed
> to let you work on the structured data, not the raw text.

Possibly, but I'm processing more text now than I ever did back in the '80s
and '90s, mostly because of the web and HTML.  Even though XML makes
accessing the data elements easier, it doesn't process the data in those
elements.  That data is often plain text (in the case of a web forum), and
regular expressions make working with it so much easier.  I've worked on
projects where we had to convert data from a database or certificate into
XML and back.  It's nice to have something like Perl or Ruby to help with
the conversion process.  Although you can create or read the XML element,
its content still has to be put into the form you want.  I know very
little about XML though, so take all that with a grain of salt. :)
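
To illustrate the "form you want" part, here is a small sketch.  It assumes
an XML parser has already handed over the element's text, and that dates in
it are written MM/DD/YYYY and need to become YYYY-MM-DD.  The
normalize_dates name and the sample text are made up for the example.

```ruby
# Hypothetical cleanup step applied to text taken from an XML element.
# Assumes dates appear as MM/DD/YYYY and rewrites them as YYYY-MM-DD.
def normalize_dates(text)
  text.gsub(%r{\b(\d{2})/(\d{2})/(\d{4})\b}) { "#{$3}-#{$1}-#{$2}" }
end

element_text = "Posted 07/25/2001 by Chris"
puts normalize_dates(element_text)
```

The parser gets you to the element; the regex does the work on what is
inside it.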

> How does Ruby's depth of RE features compare to Perl's? Is it just the
> case that Ruby has them language-integrated or does it really have all
> of the little tricks and quirks of Perl?

This one I'm not sure of yet because I just haven't used Ruby very much, but
it seems really close.

> I tend to think that Python is easier to use and more powerful than Perl
> -- if you leave aside regular expressions. So I would say: "Ruby has the
> regular expressions of Perl and the other features of Python plus a
> little more OO from Smalltalk."

The Python vs. Perl thing probably depends on what you're doing so I can
agree with your statement.  :)

--
// Chris