-------- Original Message --------
> Date: Sun, 25 May 2008 00:27:48 +0900
> From: "Robert Dober" <robert.dober / gmail.com>
> To: ruby-talk / ruby-lang.org
> Subject: Re: #plural? or #singular?

> On Sat, May 24, 2008 at 5:08 PM, Dave Bass <davebass / musician.org> wrote:
> Even worse sometimes it is undefined I guess, or caption may play a role.
>
> I can see data.
>
> Maybe some native speakers will tell me that this is not a correct
> sentence, I do not know, but than there is
>
> I can see Data.
>
> Languages (plural) are just a big mess (singular) ;)
>
> Robert
> --
> http://ruby-smalltalk.blogspot.com/
>
> ---
> Whereof one cannot speak, thereof one must be silent.
> Ludwig Wittgenstein

Dear Robert and Dave,

well, this is what tree-tagger says (see the tagger output below; for the
tagset, see my previous post):

I can see data. (noun plural)
I can see Data. (proper noun singular)
England is a country. (proper noun singular)
England are bound to lose the match. (proper noun singular; nobody is perfect)
English is a language. (proper noun singular)
The English are eccentric. (noun plural)
Languages (noun plural) are just a big mess (noun singular).

Part-of-speech tagging uses a probabilistic decision model, which requires
training on a set of human-tagged text. Large amounts of text, such as
newspaper articles, are available for many languages. The authors of
tree-tagger claim about 96% correct tagging somewhere in the docs (I can't
find it right now). It's also fast: you can tag an entire novel in just a few
seconds. And it's available for several major languages, not just English.

Best regards,

Axel

-----------------------------------
I          PP    I
can        MD    can
see        VV    see
data       NNS   datum
.          SENT  .
I          PP    I
can        MD    can
see        VV    see
Data       NP    Data
.          SENT  .
England    NP    England
is         VBZ   be
a          DT    a
country    NN    country
.          SENT  .
England    NP    England
are        VBP   be
bound      VVN   bind
to         TO    to
lose       VV    lose
the        DT    the
match      NN    match
.          SENT  .
English    NP    English
is         VBZ   be
a          DT    a
language   NN    language
.          SENT  .
The        DT    the
English    NNS   English
are        VBP   be
eccentric  JJ    eccentric
.          SENT  .
Languages  NNS   language
are        VBP   be
just       RB    just
a          DT    a
big        JJ    big
mess       NN    mess
.          SENT  .
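
The tagger output above is a list of word / tag / lemma triples, so answering #plural? or #singular? for a noun comes down to looking at its tag. Here is a minimal Ruby sketch of that idea. It assumes the output has already been captured as text with one triple per line; the names `Token` and `parse_tagged` are made up for illustration, and only the noun tags that actually appear in the output above (NN, NNS, NP) are handled.

```ruby
# Hypothetical helper: one tagged token, with number predicates derived
# from the tag. Only the noun tags seen in the output above are covered.
Token = Struct.new(:word, :tag, :lemma) do
  def plural?
    tag == "NNS"               # noun plural
  end

  def singular?
    %w[NN NP].include?(tag)    # noun singular, proper noun singular
  end
end

# Parse tagger output that has one "word TAG lemma" triple per line.
# Lines that do not split into exactly three columns are skipped.
def parse_tagged(text)
  text.each_line
      .map(&:split)
      .select { |cols| cols.size == 3 }
      .map { |cols| Token.new(*cols) }
end

# Sample input copied from the first two sentences of the output above.
SAMPLE = <<~TAGGED
  I PP I
  can MD can
  see VV see
  data NNS datum
  . SENT .
  I PP I
  can MD can
  see VV see
  Data NP Data
  . SENT .
TAGGED

tokens = parse_tagged(SAMPLE)

# Report grammatical number for every token tagged as a noun.
tokens.select { |t| t.plural? || t.singular? }.each do |t|
  puts "#{t.word} (lemma: #{t.lemma}): #{t.plural? ? 'plural' : 'singular'}"
end
```

Note that this only reports what the tagger decided; for a sentence like "England are bound to lose the match" it would still say singular, because that is the tag the tagger assigned.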