>
>
>> The biggest problem is in dealing with namespaces.  A lot of the Ruby 
>> XML libraries don't seem to handle XML namespaces very well (if at all). 
>
> Did you submit your bugreports to the maintainers, or suggest needed 
> features, or solutions? 


As this is a problem that spans more than one or two libraries, I
was hoping to stimulate some discussion and maybe even some kind of
consensus before I went and tried to track down each individual author.

> Most XSLT processors seem to build a whole DOM(-like) tree. If I 
> understood you correctly, you need to avoid that, and use streams 
> exclusively; why XSLT&Sablotron then? 

and then...

> In some cases, it should be possible to avoid redundant rounds of 
> parsing. For example when one document will be transformed multiple 
> times, with various XSLTs and parameters, all documents *and* XSLTs 
> could be stored as preparsed objects, in a fast database. Parse once, 
> marshal, load when needed. 

I think you missed the point; maybe I wasn't clear enough.  Anyway, it's
not that what you are saying isn't valid.  It is, for a lot of
applications, just not the one I'm working on.

My framework works by essentially chaining a bunch of Ruby classes that 
consume and generate SAX2 events.  Some components may make changes to 
the SAX2 stream, some may not.  Some may even replace the SAX2 stream 
with a new stream!  What they do depends on the class, as it's entirely 
up to the programmer what classes to use and how they are chained together.
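
To make the chaining idea concrete, here is a minimal sketch.  The class
names (PassThroughFilter, UpcaseText, RenameElement, EventRecorder) are
hypothetical and not the framework's actual API.  The method names only
loosely follow the SAX2 ContentHandler callbacks, and attributes are a
plain Hash for simplicity.

```ruby
# Hypothetical base class: forwards every event unchanged to the next handler.
class PassThroughFilter
  def initialize(next_handler)
    @next = next_handler
  end

  def start_element(uri, local_name, qname, attributes)
    @next.start_element(uri, local_name, qname, attributes)
  end

  def end_element(uri, local_name, qname)
    @next.end_element(uri, local_name, qname)
  end

  def characters(text)
    @next.characters(text)
  end
end

# A component that changes the stream: upcases all character data.
class UpcaseText < PassThroughFilter
  def characters(text)
    super(text.upcase)
  end
end

# A component that changes the stream: renames one element.
# Assumption: names carry no namespace prefix in this sketch.
class RenameElement < PassThroughFilter
  def initialize(next_handler, from, to)
    super(next_handler)
    @from = from
    @to = to
  end

  def start_element(uri, local_name, qname, attributes)
    super(uri, swap(local_name), swap(qname), attributes)
  end

  def end_element(uri, local_name, qname)
    super(uri, swap(local_name), swap(qname))
  end

  private

  def swap(name)
    name == @from ? @to : name
  end
end

# End of the chain: records whatever events arrive.
class EventRecorder
  attr_reader :events

  def initialize
    @events = []
  end

  def start_element(_uri, local_name, _qname, attributes)
    @events << [:start_element, local_name, attributes]
  end

  def end_element(_uri, local_name, _qname)
    @events << [:end_element, local_name]
  end

  def characters(text)
    @events << [:characters, text]
  end
end

recorder = EventRecorder.new
chain = RenameElement.new(UpcaseText.new(recorder), 'para', 'p')

chain.start_element('', 'para', 'para', {})
chain.characters('hello')
chain.end_element('', 'para', 'para')
```

Each component only needs to know the handler interface, not which
component comes before or after it.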

The problem is, each Ruby XML library has its own different way of
working with XML.  At some point, using my framework, somebody is going
to want to apply a stylesheet to a SAX2 stream.  It's a very natural and
powerful way of manipulating XML.  My job as the framework creator is
to not only make sure this is as easy as possible, but as efficient as
possible, otherwise people won't use the framework (they'll move on to
something that does what I couldn't get it to do).

Back to the XSL processor: there's no way of getting around it.  When
the XSL processor runs, it needs to load the XML document into some sort
of DOM to work.  Unfortunately, the various Ruby XML implementations
consume XML in different ways, *NONE* of which are compatible with each
other.  To convert from one to another, you have to serialize the XML
document into a string, and then reparse it into the new format.

If everybody had the option (but was not forced) to communicate via
SAX2, that would go a long way.  Even if it was faked, connecting the
output of one to the input of another would be easy.  Over time, as the
parsers, processors, and you name it matured, things would not just be
easier but they would become more efficient.
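
As a rough illustration of "faked": a library that only accepts XML as a
string could still sit at the end of a SAX2 chain behind an adapter that
serializes the events.  StringSink below is a hypothetical name, and the
sketch assumes attributes arrive as a plain Hash.  It does not emit
namespace declarations, which a real adapter would have to handle.

```ruby
# Hypothetical adapter: consumes SAX2-style events and builds XML text,
# so the result can be handed to a library that only takes strings.
class StringSink
  ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  attr_reader :xml

  def initialize
    @xml = +''
  end

  def start_element(_uri, _local_name, qname, attributes)
    attrs = attributes.map { |name, value| " #{name}=\"#{escape(value)}\"" }.join
    @xml << "<#{qname}#{attrs}>"
  end

  def end_element(_uri, _local_name, qname)
    @xml << "</#{qname}>"
  end

  def characters(text)
    @xml << escape(text)
  end

  private

  # Escapes the characters that are unsafe in text and attribute values.
  def escape(text)
    text.gsub(/[&<>"]/, ESCAPES)
  end
end

sink = StringSink.new
sink.start_element('', 'note', 'note', { 'lang' => 'en' })
sink.characters('Tom & Jerry')
sink.end_element('', 'note', 'note')

puts sink.xml
```

This is the inefficient path (serialize, then reparse), but the caller
only ever sees the event interface, so the adapter could later be
swapped for a direct one without changing the rest of the chain.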

I know, it may be somewhat hard to understand why this is important if
you are building applications that use DOMs as opposed to streams for
XML documents.  But SAX2 has a huge performance and memory advantage
over DOM-based parsers, since it never has to hold the whole document
in memory.  My framework is an attempt to make building SAX2-based
applications (and specifically web-based applications) easier.  But at
some point, the stream WILL have to integrate with other XML
technologies and that's where we need a standard.

And if you think I'm just blowing smoke out my ass, both Microsoft's 
parser and the Java/C++ Xerces parsers are capable of generating SAX2 
streams from DOM documents, and capable of generating DOM documents from 
SAX2 streams.
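
The round trip itself is not complicated.  Here is a toy sketch of both
directions, using a hypothetical Node struct in place of a real DOM and
plain strings as text nodes.  It is not how MSXML or Xerces implement
it, only the general shape.

```ruby
# Hypothetical minimal tree node standing in for a DOM element.
# Children are either Node instances or plain Strings (text).
Node = Struct.new(:name, :attributes, :children)

# DOM -> SAX2: walk the tree depth-first and emit events to a handler.
def emit_events(node, handler)
  if node.is_a?(String)
    handler.characters(node)
    return
  end

  handler.start_element('', node.name, node.name, node.attributes)
  node.children.each { |child| emit_events(child, handler) }
  handler.end_element('', node.name, node.name)
end

# SAX2 -> DOM: consume events and rebuild a tree using a stack of
# currently open elements.
class TreeBuilder
  attr_reader :root

  def initialize
    @stack = []
    @root = nil
  end

  def start_element(_uri, local_name, _qname, attributes)
    node = Node.new(local_name, attributes, [])
    if @stack.empty?
      @root = node
    else
      @stack.last.children << node
    end
    @stack.push(node)
  end

  def end_element(_uri, _local_name, _qname)
    @stack.pop
  end

  def characters(text)
    @stack.last.children << text unless @stack.empty?
  end
end

original = Node.new('doc', { 'id' => '1' },
                    [Node.new('title', {}, ['Hello']), 'tail text'])

builder = TreeBuilder.new
emit_events(original, builder)
copy = builder.root
```

Because both sides speak the same events, the tree never has to be
turned into a string and parsed again.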

Another conceptual example would be a File stream.  File streams are 
nice, because you can chain them together, for example:

contents = ZipReader.new(File.new('filename.zip')).read()

This is possible because everybody agrees on what a File IO stream looks 
like.  Nobody in the Ruby community seems to agree on what an XML stream 
looks like.
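
The ZipReader line above is conceptual.  With Ruby's standard library
the same kind of chaining can be sketched as follows, using
Zlib::GzipReader (gzip data, not zip archives) and a StringIO in place
of a file so the example is self-contained.

```ruby
require 'zlib'
require 'stringio'

# Write gzip-compressed data into an in-memory IO object.
buffer = StringIO.new
writer = Zlib::GzipWriter.new(buffer)
writer.write('hello, stream')
writer.finish # completes the gzip stream but leaves the StringIO open

# Chain a decompressing reader on top of the same IO object.
buffer.rewind
contents = Zlib::GzipReader.new(buffer).read

puts contents
```

GzipReader does not care whether it is handed a File or a StringIO,
only that the object behaves like an IO.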

> AFAICS; please correct me if I'm wrong:
> If the last element needs to be output first, nothing will be output 
> before the whole doc is parsed (?)
>
> Tobi

I'm not quite sure what you are asking here, but I think it stems from 
the misunderstanding about what I was saying in the first place! :)

Anyway, as for me... I'm soon going to write an extension module for my
framework that implements interfaces on top of existing Ruby XML
technologies that fake this interconnected standard (even if they're
not the most efficient implementations).  Hopefully that will give me an
example to show you guys.  Also, I won't do this without talking
directly with some of the library authors.  This thread doesn't seem to
be catching on as much as I'd hoped, so I may soon have to contact the
various authors.

Thanks for your ideas,
Bryan