Dave Thomas <Dave / PragmaticProgrammer.com> wrote:

> str = File.open("x.html") {|f| f.read}
> str =~ /.../m
> 
> > This very much goes against my sense of aesthetics. There's no need
> > to read in the file beyond a successful match, and there's no need
> > to read further when an orphaned </title> or a </head> tag is
> > encountered.
> 
> All true, but at the same time, if you can do it in two lines rather
> than writing a full parser, isn't there some compensating gain to be
> had?

The best solution would be to have someone else write the parser... ;-)


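>    def findTitle(file)
>      str = ''
>      # Read in 2 KB chunks until the closing </title> tag shows up.
>      loop do
>        begin
>          str << file.sysread(2048)
>        rescue EOFError
>          raise "</title> not found in file"
>        end
>        # %r{...} builds a regexp; a bare %{...} would be a string.
>        break if str =~ %r{</title>}
>      end
> 
>      # </head> may not have been read yet, so don't require it here.
>      return $1 if str =~ %r{<head.*?>.*?<title.*?>(.*?)</title>}m
> 
>      raise "Couldn't find title in file"
>    end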
> 
>    title = findTitle(File.open("test.html"))
>    puts title
> 
> Can't say as I've tested this, but it _might_ work ;-)

I'll see whether I can use this snippet, then.
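
For reference, here is a minimal sketch of the same chunked-read idea
that also stops at a </head> tag, as described above. The name
find_title and the chunk_size parameter are made up for illustration,
and it assumes the tags do not also appear inside comments or scripts.
It returns nil instead of raising when no title is found.

```ruby
# Hypothetical helper: read io in chunks and return the text of the
# <title> element, or nil if none is found inside the head.
def find_title(io, chunk_size = 2048)
  buf = String.new
  # IO#read(n) returns nil at end of file, which ends the loop.
  while (chunk = io.read(chunk_size))
    buf << chunk
    # Stop reading as soon as the title or the head has been closed.
    break if buf =~ %r{</title\s*>|</head\s*>}i
  end
  # Ignore anything from </head> onwards, then look for the title.
  head = buf.sub(%r{</head\s*>.*}im, '')
  head[%r{<title[^>]*>(.*?)</title\s*>}im, 1]
end
```

Called as File.open("test.html") {|f| find_title(f)}, the block form
also takes care of closing the file.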

Thanks!
Michael

-- 
Michael Schuerig
mailto:schuerig / acm.org
http://www.schuerig.de/michael/