On Monday 02 January 2006 01:28, Wilson Bilkovich wrote:
> This reminds me of U.S. street addresses.  I (on and off) do work that
> interfaces with a huge IBM mainframe application. That system has
> many, many separate fields for street addresses:
> Number, Direction, Street/Route, Quad, Suffix, Apartment, Line2, etc.
> I laughed at the way the original engineers had overbuilt. Ha ha ha.
> ...
> ...
> Then I had to write code that took a single address string and split
> it into its component parts, and I stopped laughing.
> e.g. 123 N. NESTOR LANE RD. SE #10B
>
> Life is complicated, it turns out.

That's nothing.  Addresses here are based on landmarks:

De donde fue Texaco Viejo 1/2 C al E, 2 c al S
City Name, Department name (sometimes), Nicaragua

Break that one up.

(The literal translation is: from where the old Texaco USED TO BE (it's a
Petronic now), half a block to the east and 2 blocks to the south.) That is
just how addresses are here.
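
To make the contrast concrete, here is a rough Ruby sketch of splitting the U.S. example quoted above into parts. The field names, the suffix list, and the method name split_us_address are my own assumptions for illustration, not the mainframe's actual layout. It handles that one address shape and nothing more.

```ruby
# Hypothetical pattern for one common U.S. address shape:
#   number, optional direction, street name, suffix, optional quadrant, optional unit.
# The suffix list is a small assumed sample, not a complete one.
US_ADDRESS = /\A
  (?<number>\d+)\s+
  (?:(?<direction>[NSEW])\.?\s+)?
  (?<street>.+?)\s+
  (?<suffix>RD|ST|AVE|BLVD|LN|DR)\.?
  (?:\s+(?<quad>NE|NW|SE|SW))?
  (?:\s+\#(?<apartment>\S+))?
\z/xi

# Returns a Hash of the named parts (String keys), or nil if the pattern does not fit.
def split_us_address(line)
  m = US_ADDRESS.match(line.strip)
  m && m.named_captures
end

parts = split_us_address("123 N. NESTOR LANE RD. SE #10B")
puts parts["street"]     # NESTOR LANE
puts parts["apartment"]  # 10B

# The landmark-style address does not fit the pattern at all.
p split_us_address("De donde fue Texaco Viejo 1/2 C al E, 2 c al S")  # nil
```

The landmark address simply comes back as nil, since it has no house number, street, or suffix for the pattern to find.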

> On 1/1/06, Gerardo Santana Gómez Garrido <gerardo.santana / gmail.com> wrote:
> > We had a similar problem at work.
> >
> > In the Spanish speaking world we use two last names: one from the
> > father's family (apellido paterno) and another from the mother's
> > family (apellido materno). For the "first name" there's no limit in
> > the number of names.
> >
> > Fortunately for us, the names were stored in the database as:
> >
> > <apellido paterno> <apellido materno> <nombres>
> >
> > But there was a difficulty. In Spanish we have last names composed of
> > more than one word like "de la Vega", "y Cruz", "de las Casas"
> >
> > Examples:
> >
> > Cruz y Cruz María del Rosario
> > de la Vega Domínguez Jorge
> > Ponce de León Ernesto Zedillo
> >
> > We couldn't avoid regular expressions:
> > http://santanatechnotes.blogspot.com/2005/12/matching-iso-8859-1-strings-with-ruby.html
> >
> > ---------- Forwarded message ----------
> > From: mathew <meta / pobox.com>
> > Date: 21-Dec-2005 16:17
> > Subject: Re: The "ruby way" to break apart a name?
> > To: ruby-talk ML <ruby-talk / ruby-lang.org>
> >
> > Jeff Cohen wrote:
> > > Assume for simplicity that the first name is the text up to the
> > > first space, and the last name is the text after the last space.
> >
> > [...]
> >
> > > But something about split_name still feels a bit "wrong",
> >
> > Well, I think the bigger issue is that your assumptions are wrong. :-)
> >
> > In some countries, the surname is written first, then the 'first' name.
> > Japan is an example. Some Japanese write their names in reverse when
> > writing them transliterated to English, and some don't. (...which makes
> > me wonder which is the case for Matz...)
> >
> > Also, the number of words in the full name can vary between 1 and a
> > fairly large integer. (I knew a guy with 6.)  The number of name words
> > required to actually route mail to a unique person can vary between 1
> > and (at least) 3, and compound names are not always hyphenated. Then
> > there are things like "Jr", and salutations that go after the name
> > rather than in front.
> >
> > There are quite a few postings in comp.risks about this kind of thing.
> > In general it's very hard to do it right, and if (for example) you want
> > to produce a "Dear <salutation goes here>" header for a letter, it's
> > best to store the salutation as a separate field, rather than try to
> > guess what it might be from the name.
> >
> > Of course, if you're working with a badly structured database someone
> > else has given you, you may not have the choice...
> >
> >
> > mathew
> > --
> >       <URL:http://www.pobox.com/~meta/>
> > My parents went to the lost kingdom of Hyrule
> >      and all I got was this lousy triforce.
> >
> >
> >
> > --
> > Gerardo Santana
> > "Between individuals, as between nations, respect for the rights of
> > others is peace" - Don Benito Juárez
> > http://santanatechnotes.blogspot.com/
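
For the "<apellido paterno> <apellido materno> <nombres>" layout quoted above, a minimal Ruby sketch of the regular-expression approach could look like the following. This is not the code from the linked post. The particle list and the method name split_spanish_name are assumptions made for illustration, and it only recognises particles at the start of a surname ("de la Vega", "y Cruz"). A surname with the particle in the middle, such as "Ponce de León", would be split wrongly.

```ruby
# Assumed (incomplete) list of particles that can open a compound surname.
PARTICLE = /(?:de\s+las|de\s+los|de\s+la|del|de|y)/i

# One surname: any number of leading particles, then a single word.
SURNAME = /(?:#{PARTICLE}\s+)*\S+/

# Two surnames followed by one or more given names.
FULL_NAME = /\A\s*(?<paterno>#{SURNAME})\s+(?<materno>#{SURNAME})\s+(?<nombres>.+?)\s*\z/

# Returns a Hash with :paterno, :materno and :nombres, or nil if there are too few words.
def split_spanish_name(full)
  m = FULL_NAME.match(full)
  return nil unless m
  { paterno: m[:paterno], materno: m[:materno], nombres: m[:nombres] }
end

name = split_spanish_name("de la Vega Domínguez Jorge")
puts name[:paterno]  # de la Vega
puts name[:materno]  # Domínguez
puts name[:nombres]  # Jorge
```

Anything after the second surname is treated as given names, which matches the "no limit in the number of names" remark but also means the sketch cannot tell a given name from a third surname.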