Matthew,

Excellent comments. You described exactly the effect I was going for, 
and also perfectly described the kind of conundrums that can take place 
when doing something like this. Since I want quotes handled first, then
commas, and spaces only when no commas are present, I have modified the
code Ezra so kindly provided to look like this:

   def parse_tags(input)
     # work on a copy so the caller's string is not modified by gsub!
     input = input.dup
     tags = []
     # pull out the quoted tags
     input.gsub!(/"(.*?)"/) { tags << $1; "" }
     if input.include?(",")
       # commas remain outside the quotes, so the rest is comma-delimited
       # ex: tag1, second tag, tag3 ==> "tag1", "second tag", "tag3"
       tags.concat input.split(",")
     else
       # no commas, so the rest is space-delimited
       tags.concat input.split(/\s+/)
     end
     # strip whitespace from the names
     tags.map! { |t| t.strip }
     # delete any blank tag names
     tags.delete_if { |t| t.empty? }
     tags
   end

   # below is to test the function
   puts parse_tags('"jay" hello mary, goodbye stranger, its been long, "hello again"')


You'll notice the if/else section: it checks whether any commas remain
once the quoted tags have been pulled out, and splits the rest on commas
if so, or on spaces otherwise.

Thanks

- Jason

Matthew Smillie wrote:
> On Jun 1, 2006, at 21:48, web mail wrote:
> 
>> Jeff, Ezra,
>>
>> Thanks tons for the help. You guys both helped a lot. My only question
>> so far, for Ezra, is this: My second tag is "second tag", not in quotes,
>> but with a space between the two words. The tag IS meant to be two words
>> long. So I'm thinking that if you replace all the commas with spaces,
>> won't that split my second tag into two tags, when it was meant to be
>> just one tag?
> 
> You're right, that's what would happen.  The problem is that in the
> case where a non-quoted multi-word tag (second tag) occurs in the
> same input as two space-delimited single-word tags (third_tag
> fourth_tag), you're out of luck.  This isn't a problem that's
> solvable without some sort of semantic knowledge, potentially quite a
> bit of it.  For example, how would you distinguish between ("first
> tag", second tag, dog pile) and ("first tag", second tag, dog pile)
> where (dog pile) is meant as one and two tags, respectively?
> 
>> The tags can appear in any order, and they can either be quoted,
>> separated by commas, or separated by spaces. I've decided that in the
>> case of all three, I would like quotes to have precedence, followed by
>> commas, followed by spaces. So, for an example, the following string:
> 
> I think that you're using 'precedence' in a way that's not quite
> formally correct.  Commas don't have precedence over spaces in the
> same way that * has precedence over +.  From your examples, a more
> accurate description might be: if commas occur outside of a quoted
> tag, assume that tags are comma-delimited; otherwise, assume that
> tags are space-delimited.

EXACTLY what I meant.
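
Stated as code, that rule might look like the minimal sketch below. The
helper name split_by_rule is made up for illustration, and it assumes
quoted tags are removed before looking for commas, so a comma inside
quotes does not count.

```ruby
# Hypothetical helper (the name is illustrative, not from the thread).
def split_by_rule(input)
  rest = input.dup   # copy, so the caller's string is left untouched
  tags = []
  # Pull quoted tags out first; commas inside quotes are ignored.
  rest.gsub!(/"(.*?)"/) { tags << $1; " " }
  # Commas outside quotes => comma-delimited; otherwise space-delimited.
  delimiter = rest.include?(",") ? "," : /\s+/
  tags.concat(rest.split(delimiter))
  # Trim each tag and drop the blanks left behind by the splitting.
  tags.map(&:strip).reject(&:empty?)
end

p split_by_rule('"first tag", second tag, dog pile')
# => ["first tag", "second tag", "dog pile"]
p split_by_rule('"first tag" second tag dog pile')
# => ["first tag", "second", "tag", "dog", "pile"]
```

The second call shows the cost of the rule: without a comma anywhere,
"dog pile" can only ever come out as two tags.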

> 
> Even that, though, leaves you with a bit of a problem making the dog-
> pile decision, if it's the only input.  Is it meant as a single tag
> in a comma-delimited list (convention being to leave off the final
> delimiter in lists)?  Or is it meant as two tags in a space-delimited
> list?  It's still ambiguous as to what the user's intention might
> have been without delving into some sort of semantics.
> 
> Another poster (can't remember whom) interpreted your specification
> as "after the last comma, assume things are space-delimited", which
> might also be an option.
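
For comparison, here is a rough sketch of that other reading, where
everything up to the last comma is comma-delimited and whatever follows
it is space-delimited. The helper name split_after_last_comma is
hypothetical, and quoted tags are again assumed to be pulled out first.

```ruby
# Hypothetical helper (the name is illustrative, not from the thread).
def split_after_last_comma(input)
  rest = input.dup   # copy, so the caller's string is left untouched
  tags = []
  # Pull quoted tags out first.
  rest.gsub!(/"(.*?)"/) { tags << $1; " " }
  # Split at the LAST comma. With no comma at all, rpartition returns
  # ["", "", rest], so the whole input is treated as space-delimited.
  head, _sep, tail = rest.rpartition(",")
  tags.concat(head.split(","))    # before the last comma: comma-delimited
  tags.concat(tail.split(/\s+/))  # after the last comma: space-delimited
  tags.map(&:strip).reject(&:empty?)
end

p split_after_last_comma('second tag, third_tag fourth_tag')
# => ["second tag", "third_tag", "fourth_tag"]
p split_after_last_comma('"first tag", second tag, dog pile')
# => ["first tag", "second tag", "dog", "pile"]
```

Under this reading a multi-word tag survives only if a comma follows it,
so the last tag in a comma-separated list still gets split on spaces.
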
> 
> For reasons of simplicity (for the user - you're free to make your
> job as hard as you please!), though, I'd suggest that it would be
> best to stick to a single type of delimiter.  I prefer spaces,
> myself, since tags tend to be single words, and people are used to
> that sort of input (not only from the canonical examples of flickr or
> del.icio.us, but because it mimics search engines, for instance).
> 
> Matthew Smillie
> 
> [1] "Dog pile" in this sense: http://en.wikipedia.org/wiki/Pile-on
> which I thought of because, being the oldest and largest of all my
> cousins and siblings, it was always my misfortune to be on the bottom.


-- 
Posted via http://www.ruby-forum.com/.