On 1/6/06, Shot - Piotr Szotkowski <shot / shot.pl> wrote:
> Hello.
>
> Jacob Fugal:
>
> > Be careful with email validation via regex, it's harder than you might
> > think[1][2]:
> >
> > /^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D
> > -\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\
> > x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!
> > |#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.
> > ([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[
> > \x00-\x7F])*\]))*$/
>
> It does match
> " spaces! @s! \"escaped quotes!\" "@shot.pl
> and it's the first one doing this that I know of, kudos!
Not the first; I've been preceded by others that are even more correct (and complex) :). Particularly:
  http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html
> Unfortunately, it does not match 'international' domains, so
> it wouldn't pass addresses in the domain of, say, ggka.pl
Good point. When I wrote this expression, I was only considering ASCII characters in the 0x00-0x7F range (0-127 decimal, which doesn't include extended characters). Looking back at RFC822, it looks like that RFC is likewise limited. It has no support for extended ASCII or UNICODE. This is reasonable, based on the age of the RFC (1982).
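To make that limitation concrete, here is a rough Ruby sketch. It is the same expression as quoted above, only split into named pieces so it fits on the screen. The constant names, the valid_email? helper, and the "bücher.example" test domain are made up for illustration; they are not from the RFC or from anyone's library.

```ruby
# Building blocks of the expression quoted above. The names are
# illustrative only; the character classes are copied unchanged.
ATOM           = /[a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+/
QUOTED_STRING  = /"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"/
DOMAIN_LITERAL = /\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]/

WORD       = /#{ATOM}|#{QUOTED_STRING}/    # one piece of the local part
SUB_DOMAIN = /#{ATOM}|#{DOMAIN_LITERAL}/   # one piece of the domain

# Same overall shape as the original: word(.word)*@sub-domain(.sub-domain)*
EMAIL = /^(#{WORD})(\.(#{WORD}))*@(#{SUB_DOMAIN})(\.(#{SUB_DOMAIN}))*$/

# Hypothetical helper: true if the whole address matches EMAIL.
def valid_email?(address)
  !!(address =~ EMAIL)
end

# The quoted local part from Shot's message is accepted. Single quotes
# keep the backslashes literal.
puts valid_email?('" spaces! @s! \"escaped quotes!\" "@shot.pl')  # => true

# A domain with a character above 0x7F (here U+00FC) is rejected, because
# every character class in the expression stops at \x7F.
puts valid_email?("user@b\u00FCcher.example")                     # => false

# The ASCII-only (Punycode) spelling of the same domain is accepted.
puts valid_email?("user@xn--bcher-kva.example")                   # => true
```

So, with this expression as it stands, an 'international' domain only gets through if it has already been converted to its ASCII form before validation.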
As I understand from Yohanes' post in this thread, RFC2822 (2001) supersedes RFC822, so I assume RFC2822 probably takes extended ASCII -- and hopefully UNICODE, as well -- into account. Time to update the regex! I'll leave it to someone else, however. ;)
Jacob Fugal