On 20/nov/06, at 03:50, Paul Lutus wrote:

> array = page_content.scan(%r{<p>(.*?)</p>}m).flatten

Please note that the P end tag isn't required in HTML 4.01:
http://www.w3.org/TR/html4/struct/text.html#h-9.3.1

-- 
Gabriele Marrone
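To illustrate the point about the optional end tag, here is a minimal sketch. The sample string and the variable names strict and lenient are made up for the example. The second pattern is only one possible workaround, assuming a paragraph ends at an explicit </p>, at the next <p> start tag, or at the end of the input. HTML 4.01 also lets a following block-level element close a P implicitly, which this pattern does not handle, so a real HTML parser is likely the safer choice for arbitrary pages.

```ruby
# Hypothetical sample input: the first two paragraphs omit the optional </p>.
page_content = "<p>First<p>Second\n<p>Third</p>"

# The quoted pattern requires an explicit end tag, so the lazy match runs
# from the first <p> all the way to the only </p> in the input.
strict = page_content.scan(%r{<p>(.*?)</p>}m).flatten
p strict   # => ["First<p>Second\n<p>Third"]

# Sketch of a more tolerant pattern (an assumption, not a full HTML parser):
# stop at </p>, at the next <p ...> start tag, or at the end of the string.
# The \b keeps <pre> and similar tags from being mistaken for <p>.
lenient = page_content.scan(%r{<p\b[^>]*>(.*?)(?=</p>|<p\b|\z)}mi).flatten.map(&:strip)
p lenient  # => ["First", "Second", "Third"]
```

The lookahead leaves the terminator unconsumed, so the next scan step can start at the following <p> tag.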