On Nov 20, 2006, at 11:35 AM, Paul Lutus wrote:

> James Edward Gray II wrote:
>
>> I've seen valid XHTML that wouldn't be much fun to parse. You still
>> need to worry about whitespace, namespaces, the kind of quoting used,
>> CDATA sections, ...
>
> These are all relatively easy to parse. Even the CDATA sections are
> clearly and consistently delimited, so can be reliably skipped over and
> encapsulated. That was the design goal of XHTML -- to be easy to parse,
> to be consistent -- assuming the syntax is followed.

But if you use an already developed parser, you gain all of its authors' work on edge cases, all their testing efforts, all their optimization work, etc.

I see what you are saying about knowing you can count on the data, but your messages are filled with a lot of "as long as you are sure" conditions. Dropping a bunch of those conditions is just one more advantage of using a library.

You say you are always surprised when people build up all this hefty library code when a simple regex will do, but I'm always shocked when I can replace hundreds of lines of code by loading and making use of a library. If we have to err on one side of that, I would prefer it be on the library-using side.

That said, I guess we'll just have to agree to disagree. Thanks for the intelligent and civil debate.

James Edward Gray II