David Vallner wrote:
> James Britt wrote:
>> (Offhand, I don't see how static or explicit typing would help track
>> these sorts of issues. Unit tests might.)
>
> Hrm. Mechanize or htmltools optionally passing HTML input through tidy
> perhaps? I've no idea what the scope of htmltools markup error recovery
> capabilities is, that just might help.

Minimal, in my experience. There are some very, very broken pages out
there.

My current method for doing this sort of thing involves sniffing the
character set, normalising to utf-8, chucking the output through tidy to
get xml, ripping off the xml processing instruction and passing what's
left through REXML. You have to take the processing instruction off
because if the page actually includes text in more than one character
set (you'd be surprised how often this happens), the normalising won't
be complete and tidy will get it wrong half the time, which barfs REXML.
I can show code if you want.

In any case, this is tangential - the fundamental issue is that static
and explicit typing can't catch semantic errors. The original paper on
Hungarian notation (which Joel Spolsky goes on about at
http://www.joelonsoftware.com/articles/Wrong.html) nails this problem,
but I've never seen a language whose interpreter/compiler enforces
matched variable and function naming conventions.

--
Alex
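
For anyone wanting to try the normalise / tidy / strip / REXML pipeline described above, here is a rough sketch of how it might look. It is not the code offered in the post. The helper names (normalise_to_utf8, tidy_to_xml, strip_xml_declaration, parse_page) are made up for illustration, the charset is assumed to have been sniffed already and passed in by name, and a tidy binary is assumed to be on the PATH.

```ruby
require 'open3'
require 'rexml/document'

# Assumption: the source charset has already been sniffed elsewhere and is
# passed in by name. Bytes that cannot be converted become '?'.
def normalise_to_utf8(raw, source_encoding)
  raw.dup.force_encoding(source_encoding)
     .encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')
end

# Assumption: a `tidy` binary is on the PATH. -asxml asks for well-formed
# XML, -utf8 sets the input and output encoding, -numeric avoids named
# entities the XML parser would not know, -quiet trims the chatter.
# tidy exits 0 (clean), 1 (warnings) or 2 (errors).
def tidy_to_xml(html)
  out, _err, status = Open3.capture3('tidy', '-asxml', '-utf8', '-numeric',
                                     '-quiet', stdin_data: html)
  code = status.exitstatus
  raise 'tidy could not repair the page' if code.nil? || code > 1
  out
end

# Drop a leading <?xml ... ?> line so that a wrong encoding label in it
# cannot upset the parser. The text is already UTF-8 at this point.
def strip_xml_declaration(xml)
  xml.sub(/\A\s*<\?xml\s[^>]*\?>\s*/, '')
end

# The whole pipeline: raw bytes in, REXML::Document out.
def parse_page(raw, source_encoding)
  xml = tidy_to_xml(normalise_to_utf8(raw, source_encoding))
  REXML::Document.new(strip_xml_declaration(xml))
end
```

Calling parse_page(File.binread('page.html'), 'ISO-8859-1') would be the typical use, with 'page.html' standing in for whatever you fetched. Pages that mix character sets will still come out with '?' where bytes did not convert, which is about the best this approach can do.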