> >>It's quite fast. I use it for huge XML documents where
> >>rexml and nqxml are way too slow.
> >>
> > Just out of interest, how large was your "huge".  Some of my documents
> > are (literally) hundreds of megabytes.
>
> I just checked and found it to be about 10 megabytes. So actually not that
> much data. But it was already enough to let rexml run for hours :-)

There seems to be a problem/bug/whatever with the current version of
REXML that makes large files take extra long to process. It reads the
entire file in before it starts processing, which kills performance. Try
adding this code to your program:


  module REXML
    class IOSource
      # Keep a handle on the original constructor.
      alias_method :_initialize, :initialize

      def initialize(arg, block_size=500)
        @er_source = @source = arg
        @to_utf = false
        @line_break = '>'
        # Read only up to the first '>' instead of the whole stream.
        super @source.readline(@line_break)
        @line_break = encode( '>' )
      end
    end
  end

That seems to fix the problem for other people.
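
To see why reading up to a '>' helps, here is a small standalone sketch
of what readline does when given a separator. It does not touch REXML at
all; each_chunk is a made-up helper name and the sample strings are
invented for illustration.

```ruby
require 'stringio'

# Hypothetical helper (not part of REXML): hands the caller one chunk at
# a time, each ending at the next '>', so the whole input is never read
# in a single call.
def each_chunk(io, sep = '>')
  yield io.readline(sep) until io.eof?
end

io = StringIO.new("<doc><item>one</item></doc>")
first = io.readline('>')   # consumes only the opening tag
puts first
puts io.pos                # bytes consumed so far; the rest is still unread

each_chunk(StringIO.new("<a><b/>text</a>")) { |chunk| puts chunk }
```

The same call works on a File, so a parser fed this way only ever holds
one small chunk at a time.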

--
Zachary P. Landau <kapheine / hypa.net>
GPG: gpg --recv-key 0x24E5AD99 | http://kapheine.hypa.net/kapheine.asc
