On 11/29/05, basi <basi_lio / hotmail.com> wrote:
> Yes, I learned this convention when I took a keyboarding (i.e., typing)
> lesson in high school. Some time ago, a style manual for word processing
> appeared, and one piece of its advice was to use only one space to
> separate sentences. The reason given is that in a justified format, those
> two spaces can become four spaces, or even more. Anyway, a lot of text now
> has one or two spaces between sentences, and this wouldn't be a
> reliable indicator of sentence boundary.

I too learned the two-spaces-after-a-period convention years ago, and
only recently learned that with modern fonts and word processors it is
not necessary. It was tricky to retrain myself, but I did, and I have
been using just one space ever since. So, as you say, spacing isn't a
reliable way to discern sentences.

I would recommend following the advice to first filter out false
positives (possibly even replacing them with temporary markers, so that
"Mr." becomes $MISTER$ or similar), then split on punctuation. If you
then test on various sample texts, you should be able to find more false
positives that you missed.

Ryan
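
A rough sketch of the filter-then-split idea described above, written in Python purely for illustration. The abbreviation list, the $ABBR0$-style marker format, and the function name split_sentences are assumptions made up for this example; nothing here comes from the thread itself, and a real text will need a longer list.

```python
import re

# Hypothetical starter list; extend it as testing turns up more false positives.
ABBREVIATIONS = ["Mr.", "Mrs.", "Dr.", "e.g.", "i.e."]


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split text into sentences, protecting known abbreviations first."""
    # Step 1: swap each abbreviation for a temporary marker that has no period.
    markers = {}
    for i, abbr in enumerate(abbreviations):
        marker = f"$ABBR{i}$"
        markers[marker] = abbr
        text = text.replace(abbr, marker)

    # Step 2: split on whitespace that follows ., ! or ?
    # Using \s+ means one space and two spaces are treated the same.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())

    # Step 3: put the original abbreviations back into each piece.
    sentences = []
    for part in parts:
        for marker, abbr in markers.items():
            part = part.replace(marker, abbr)
        if part:
            sentences.append(part)
    return sentences


if __name__ == "__main__":
    sample = "Mr. Smith went home.  He was tired. Was Dr. Jones there? Yes!"
    for sentence in split_sentences(sample):
        print(sentence)
```

This is only a starting point: plain string replacement does not check word boundaries, and an abbreviation that happens to end a sentence will hide a real boundary. Running it over varied sample texts is how the remaining false positives would show up.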