Bryan Murphy wrote:


> I wrote a similar framework (which started before I discovered Cocoon2) 
> in Visual Basic for my employer that used the DOM.  This worked, but 
> this is a VERY inefficient way to process an XML document, especially 
> when you start to chain a lot of components together. 

[...]

> each component has to scan the ENTIRE DOM tree just to make 
> one change.  This is clearly a bad idea when you start to chain things 
> together.


[...]

> The biggest problem is in dealing with namespaces.  A lot of the Ruby 
> XML libraries don't seem to handle XML namespaces very well (if at all). 


Did you submit bug reports to the maintainers, or suggest the missing 
features or possible solutions?


> My framework streams SAX2 events into a Transformer (an object that 
> makes changes to the SAX2 stream) that uses the Sablotron interface to 
> apply a stylesheet to the stream.


Most XSLT processors seem to build a whole DOM(-like) tree. If I 
understood you correctly, you want to avoid that and use streams 
exclusively; so why XSLT and Sablotron?

> To accomplish this, my component has 
> to collect the SAX2 stream into a ruby string.  Load the stylesheet, 
> pass the Ruby string and the stylesheet into the sablotron transformer 
> and let it do its mojo.  Sablotron returns a ruby string, which I then 
> have to *reparse* to generate the new SAX2 stream and send that off to 
> the next component.


In some cases, it should be possible to avoid redundant rounds of 
parsing. For example, when one document is transformed multiple 
times with various XSLTs and parameters, all documents *and* XSLTs 
could be stored as preparsed objects in a fast database. Parse once, 
marshal, load when needed.
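
A rough sketch of "parse once, marshal, load when needed", using only 
Ruby's built-in Marshal and plain files instead of a real database. 
The names fetch_parsed and fake_parse are made up for illustration, and 
the "parsed" object is just a nested array standing in for whatever 
tree or compiled stylesheet a real parser would return (which would 
have to be marshalable for this to work):

```ruby
require 'tmpdir'

# Hypothetical helper: returns the object cached under +key+ in +dir+,
# or runs the block (the expensive parse) once and stores its result.
def fetch_parsed(dir, key)
  path = File.join(dir, "#{key}.marshal")
  return Marshal.load(File.binread(path)) if File.exist?(path)

  parsed = yield
  File.binwrite(path, Marshal.dump(parsed))
  parsed
end

# Stand-in for a real parser: returns nested arrays, not a DOM,
# and counts how often it was actually called.
def fake_parse(source, counter)
  counter[:parses] += 1
  [:doc, source.scan(/\w+/)]
end

counter = { parses: 0 }
results = Dir.mktmpdir do |dir|
  # Three requests for the same document; only the first one parses.
  3.times.map do
    fetch_parsed(dir, "article") { fake_parse("<a>hello world</a>", counter) }
  end
end
```

Marshal.load should only ever see files the application wrote itself, 
and the cache has to be invalidated when the source document or 
stylesheet changes; neither is handled here.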


> That's the ideal.  Imagine if XSLT4R used REXML as its internal 
> representation, REXML was capable of generating its tree directly from 
> SAX2 events and then capable of generating SAX2 events directly from 
> its tree. 


AFAICS; please correct me if I'm wrong:
If the last element needs to be output first, nothing will be output 
before the whole doc is parsed (?)
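
To illustrate what I mean, here is a toy sketch in plain Ruby. The 
events are simple arrays, not real SAX2 callbacks, and all method names 
are invented for the example. One filter can forward each event as it 
arrives; the other reverses the order of the text nodes, so it cannot 
emit anything for them until it has seen the closing tag:

```ruby
# Toy event list standing in for a SAX2 stream.
def sample_events
  [[:start, "list"], [:text, "a"], [:text, "b"], [:text, "c"], [:end, "list"]]
end

# Streaming filter: forwards every event as soon as it arrives.
def upcase_text(events)
  events.each do |type, value|
    yield(type == :text ? [type, value.upcase] : [type, value])
  end
end

# Order-dependent filter: the last text node must come out first,
# so text events are buffered until the end tag is seen.
def reverse_text(events)
  buffered = []
  events.each do |type, value|
    case type
    when :text
      buffered << [type, value]
    when :end
      buffered.reverse_each { |event| yield event }
      buffered.clear
      yield [type, value]
    else
      yield [type, value]
    end
  end
end

# Records the order in which events enter and leave a filter.
def trace(filter)
  log = []
  source = Enumerator.new do |y|
    sample_events.each do |event|
      log << "in:#{event[1]}"
      y << event
    end
  end
  send(filter, source) { |event| log << "out:#{event[1]}" }
  log
end

streaming = trace(:upcase_text)
buffering = trace(:reverse_text)
```

In the first trace, input and output alternate; in the second, all 
text events are read before the first one is written. A stylesheet 
that sorts or reorders would put any processor in the second 
situation, whatever its internal representation is.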

Tobi


-- 
http://www.pinkjuice.com/