There were two bugs in my first program:
1. it's __END__, not _END_
2. it's "elsif", not "elseif"
I changed my program to:
#the filename is: /home/pt/htmlscan_test.rb
HTMLRegexp =/(<!--.*?--\s*>)|
              (<(?:[^"'>]*|"[^"]*"|'[^']*')+>)|
              ([^<]*)/xm

data =DATA.read

data.scan(HTMLRegexp){|match|
  comment,tag,tdata=match[0..2]
  if comment
    p ["Comment",comment]
  elsif tag
    p ["Tag",tag]
  elsif tdata
    tdata.gsub!(/\s+/," ")   # collapse each run of whitespace to one space
    tdata.sub!(/ $/,"")      # then drop a single trailing space
    p ["TextData",tdata] unless tdata.empty?
  end
}
__END__
<!DOCTYPE HTML>
<HTML>
 <BODY>
    < A name="FOO"  href="foo"  attr  >foo</A>
    < A name="BAR"  href="bar"  attr  >bar</A>
    < A name=BAZ  href=baz attr  >baz</A>
    <!--
       dummy
       -->
  <BODY>
</HTML>

There is another problem, too.
It runs in NetBeans IDE 6.8 and gives the correct answer:
["Tag", "<!DOCTYPE HTML>"]
["Tag", "<HTML>"]
["Tag", "<BODY>"]
["Tag", "< A name=\"FOO\"  href=\"foo\"  attr  >"]
["TextData", "foo"]
["Tag", "</A>"]
["Tag", "< A name=\"BAR\"  href=\"bar\"  attr  >"]
["TextData", "bar"]
["Tag", "</A>"]
["Tag", "< A name=BAZ  href=baz attr  >"]
["TextData", "baz"]
["Tag", "</A>"]
["Comment", "<!--\n       <A href=\"dummy\">dummy</A>\n       -->"]
["Tag", "<BODY>"]
["Tag", "</HTML>"]

But when I run it in a terminal, I get a syntax error:
pt@pt-laptop:~$ ruby /home/pt/htmlscan_test.rb
/home/pt/htmlscan_test.rb:20: syntax error, unexpected '<', expecting $end
<!DOCTYPE HTML>
 ^
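
Since the parser is complaining about <!DOCTYPE HTML>, the terminal run seems to be reading the data section as Ruby code. That suggests the __END__ line in the file on disk is not being recognized as the marker (as far as I know, Ruby only accepts it when it is alone on its line, with no leading or trailing whitespace). Here is a rough sketch of a check for that. end_marker_report is a made-up helper name, not part of the program above, and it only assumes the file can be read as plain text:

```ruby
# Hypothetical helper: list every line of a Ruby source string that mentions
# _END_, and flag whether that line is exactly "__END__".
# Returns an array of [line_number, line_text, exact_marker?] entries.
def end_marker_report(source)
  source.lines.each_with_index.map do |line, index|
    text = line.chomp                      # strip only the line ending
    next unless text =~ /_END_/            # matches both _END_ and __END__
    [index + 1, text, text == "__END__"]
  end.compact
end

# Usage, with the path from the terminal session above:
# end_marker_report(File.read("/home/pt/htmlscan_test.rb")).each { |row| p row }
```

If the last element of a row is false, that line has a misspelled marker or extra whitespace around it. It would also show whether the file the terminal runs still has the old _END_ line.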

What's the matter?
Can you try it on your computer?
Please help me.
-- 
Posted via http://www.ruby-forum.com/.