>
>
> James Britt (rubydev) wrote:
>
> >> I don't see why 'xmlparser' is not perceived as the ideal candidate for
> >> that task.
> >
> > It does look good.
>
> Except that it is based on expat, which we don't control, and which are C
> sources (making it more difficult to maintain).

Why is C code more difficult to maintain? Isn't Ruby itself written in C?
Even if we can't use expat directly, isn't it worth thinking about having the
low-level XML parsing handled by compiled native code?  Do we need to
*control* expat to use it?

This is the expat license:

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.



>
> >> I think that no one would seriously consider having "a pure ruby"
> >> regular expression engine as the default regexp library.
> ....
> >> The same should hold for XML processing. The default implementation
> >> should be a _native_ one, to be as fast as possible, because that's the
> >> only way to go when one wants to handle multi-megabyte documents in a
> >> reasonable amount of time.
>
> XMLParser isn't much faster than the native Ruby XML parsers, except in
> Stream parsing, and it is much slower in many other common XML tree
> operations.  I don't see any reason to take on the burden of maintaining
> the expat sources if we're not going to see a significant speed increase.
> I haven't investigated the memory usage of XMLParser compared to NQXML or
> REXML, so that may be a factor.

"Isn't much faster" is still faster, and I wouldn't see using expat for tree
operations anyway; mainly for stream processing.

>
> >>From the xmlparser readme file it looks like it provides the familiar SAX
> >>callback methods (startElement,  endElement, character
>
> NQXML uses blocks.  REXML uses methods that are similar to the Java SAX
> API.  As has been mentioned before in this newsgroup, there is no
> "standard" SAX API, so the familiarity you're seeing is artificial.

And, as has been mentioned before, despite the absence of any formal specification, SAX implementations in numerous languages are
remarkably similar. Even Microsoft has based its SAX parser API on the Java implementation.  It's a "standard" in the sense of
"that's just how most people do it."  To have a stream parser that raises events at obvious times, such as the start or end of an
element, but to *not* call these events 'startElement' or 'endElement' (i.e., to avoid following the 'common law' SAX API) seems
perverse.
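
To illustrate the 'common law' naming, here is a toy sketch of an event-driven
scanner that calls startElement, endElement and character on a handler object.
It is not xmlparser, NQXML or REXML, and it is not a real XML parser: the class
names TinyStreamScanner and ElementCounter are made up for this example, and
attributes, comments, CDATA, entities and self-closing tags are all ignored.
The camelCase method names are kept on purpose to mirror the SAX convention.

```ruby
# Toy SAX-style scanner (hypothetical; handles only bare <tag>, </tag>, text).
class TinyStreamScanner
  # Group 1: "/" for a closing tag, "" for an opening tag.
  # Group 2: the element name.  Group 3: a run of character data.
  TOKEN = %r{<(/?)([A-Za-z_][\w.\-]*)\s*>|([^<]+)}

  def initialize(handler)
    @handler = handler
  end

  # Walk the source left to right and fire one callback per token.
  def parse(source)
    source.scan(TOKEN) do |closing, name, text|
      if text
        # Skip whitespace-only runs between elements.
        @handler.character(text) unless text.strip.empty?
      elsif closing == "/"
        @handler.endElement(name)
      else
        @handler.startElement(name)
      end
    end
  end
end

# Example handler: counts opened elements and collects text, never
# building a tree in memory.
class ElementCounter
  attr_reader :counts, :text

  def initialize
    @counts = Hash.new(0)
    @text = []
  end

  def startElement(name)
    @counts[name] += 1
  end

  def endElement(name)
    # Nothing to do for this handler.
  end

  def character(data)
    @text << data
  end
end

handler = ElementCounter.new
TinyStreamScanner.new(handler).parse("<doc><item>one</item><item>two</item></doc>")
```

Anyone who has used a SAX-like parser in another language can guess what each
callback does here without reading any documentation, which is the whole
argument for keeping the names.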



James

>
> --- SER