On Sun, 24 Oct 2004 17:35:46 GMT, Martin Pfeffer <udlduz / chello.at> wrote:
> hi
> my problem is i need a file with german words and so i try to create a
> file parsing html sites and write extracted words to a database so my
> question is what is the easiest way to extract text from html pages?
> thx
> Martin

there's a /usr/share/dict/ngerman on my Debian box

> wc ngerman
 308860  308860 3998536 ngerman

which tells me that the average word length is about 13 (!) letters.
Unvorstellbar! (Unimaginable!)

s.
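
As a rough check on that figure, here is a minimal sketch in Python (not part of the original post) that redoes the arithmetic from the wc output. It assumes one word per line, a one-byte newline after each word, and a single-byte encoding such as ISO-8859-1, so that bytes and letters coincide. The function name average_word_length is made up for illustration.

```python
# Counts copied from the `wc ngerman` output: lines, words, bytes.
lines, words, size_bytes = 308860, 308860, 3998536


def average_word_length(line_count: int, byte_count: int) -> float:
    """Mean bytes per word, assuming one word per line and a
    one-byte newline terminating every line (an assumption)."""
    return (byte_count - line_count) / line_count


# Bytes per line, newline included: this is the "about 13" figure.
avg_with_newline = size_bytes / lines

# Bytes per word once the newline is subtracted.
avg = average_word_length(lines, size_bytes)

print(f"{avg_with_newline:.2f} bytes per line, {avg:.2f} bytes per word")
```

Under those assumptions the byte count divided by the line count includes one newline per word, so the letters-only average comes out roughly one lower than the per-line figure. If the file were UTF-8 instead, umlauts and ß would take two bytes each and the letter count would be slightly lower still.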