On Monday 02 January 2006 01:28, Wilson Bilkovich wrote:
> This reminds me of U.S. street addresses. I (on and off) do work that
> interfaces with a huge IBM mainframe application. That system has
> many, many separate fields for street addresses:
> Number, Direction, Street/Route, Quad, Suffix, Apartment, Line2, etc.
> I laughed at the way the original engineers had overbuilt. Ha ha ha.
> ...
> ...
> Then I had to write code that took a single address string and split
> it into its component parts, and I stopped laughing.
> e.g. 123 N. NESTOR LANE RD. SE #10B
>
> Life is complicated, it turns out.

That's nothing. Addresses here are based on landmarks:

De donde fue Texaco Viejo 1/2 C al E, 2 c al S
City Name, Department name (sometimes), Nicaragua

Break that one up. (literal translation is: From where the old texaco USED TO
BE (it's a petronic now), 1/2 a block to the east and 2 blocks to the south)

That is just how addresses are here.

> On 1/1/06, Gerardo Santana Gómez Garrido <gerardo.santana / gmail.com> wrote:
> > We had a similar problem at work.
> >
> > In the Spanish speaking world we use two last names: one from the
> > father's family (apellido paterno) and another from the mother's
> > family (apellido materno). For the "first name" there's no limit in
> > the number of names.
> >
> > Fortunately for us, the names were stored in the database as:
> >
> > <apellido paterno> <apellido materno> <nombres>
> >
> > But there was a difficulty.
> > In Spanish we have last names composed of
> > more than one word like "de la Vega", "y Cruz", "de las Casas"
> >
> > Examples:
> >
> > Cruz y Cruz María del Rosario
> > de la Vega Domínguez Jorge
> > Ponce de León Ernesto Zedillo
> >
> > We couldn't avoid regular expressions:
> > http://santanatechnotes.blogspot.com/2005/12/matching-iso-8859-1-strings-with-ruby.html
> >
> > ---------- Forwarded message ----------
> > From: mathew <meta / pobox.com>
> > Date: 21-dic-2005 16:17
> > Subject: Re: The "ruby way" to break apart a name?
> > To: ruby-talk ML <ruby-talk / ruby-lang.org>
> >
> > Jeff Cohen wrote:
> > > Assume for simplicity that the first name is the text up to the
> > > first space, and the last name is the text after the last space.
> >
> > [...]
> >
> > > But something about split_name still feels a bit "wrong",
> >
> > Well, I think the bigger issue is that your assumptions are wrong. :-)
> >
> > In some countries, the surname is written first, then the 'first' name.
> > Japan is an example. Some Japanese write their names in reverse when
> > writing them transliterated to English, and some don't. (...which makes
> > me wonder which is the case for Matz...)
> >
> > Also, the number of words in the full name can vary between 1 and a
> > fairly large integer. (I knew a guy with 6.) The number of name words
> > required to actually route mail to a unique person can vary between 1
> > and (at least) 3, and compound names are not always hyphenated. Then
> > there are things like "Jr", and salutations that go after the name
> > rather than in front.
> >
> > There are quite a few postings in comp.risks about this kind of thing.
> > In general it's very hard to do it right, and if (for example) you want
> > to produce a "Dear <salutation goes here>" header for a letter, it's
> > best to store the salutation as a separate field, rather than try to
> > guess what it might be from the name.
> >
> > Of course, if you're working with a badly structured database someone
> > else has given you, you may not have the choice...
> >
> >
> > mathew
> > --
> > <URL:http://www.pobox.com/~meta/>
> > My parents went to the lost kingdom of Hyrule
> > and all I got was this lousy triforce.
> >
> >
> >
> > --
> > Gerardo Santana
> > "Between individuals, as between nations, respect for the rights of
> > others is peace" - Don Benito Juárez
> > http://santanatechnotes.blogspot.com/
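
For what it's worth, the stored format Gerardo describes
(<apellido paterno> <apellido materno> <nombres>) can be approximated with
one regular expression. This is only a minimal sketch, not the code from
his blog post: the method name split_spanish_name and the particle list are
hypothetical, and it assumes UTF-8 strings on a current Ruby rather than the
ISO-8859-1 setup the post deals with.

```ruby
# Sketch: split "<apellido paterno> <apellido materno> <nombres>".
# Assumption: a surname is one word, optionally preceded by a particle
# from the (hypothetical, incomplete) list below.
def split_spanish_name(full)
  # Longer particles come first so "de las" is tried before "de la" and "de".
  particles = ["de las", "de los", "de la", "del", "de", "y"]
  surname = /(?:(?:#{particles.join('|')}) )?\S+/
  pattern = /\A(#{surname}) (#{surname}) (.+)\z/

  m = pattern.match(full.strip)
  return nil unless m # fewer than three parts: give up instead of guessing

  { paterno: m[1], materno: m[2], nombres: m[3] }
end

p split_spanish_name("Cruz y Cruz María del Rosario")
p split_spanish_name("de la Vega Domínguez Jorge")
```

The sketch only recognises a particle at the start of a surname, so a
surname with the particle in the middle, such as "Ponce de León", would be
split in the wrong place. That is more or less mathew's point: if you can,
store the parts in separate fields rather than guessing.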