On Fri, Oct 8, 2010 at 4:10 PM, Paul <tester.paul / gmail.com> wrote:

> On Oct 5, 9:33 pm, Steel Steel <angel_st... / ymail.com> wrote:
> > > I can find the section I want with a regex but I don't know how to
> > > iterate through the string looking for particular elements. I was
> > > thinking about taking the section I'm interested in and saving it as
> > > an array and then iterating through each array element (html line)
> > > that way, but I thought there might be a quicker way to do it.
> >
> > $html.scan(%r{<div.*first section.*</div>}m).to_s.scan(/<li>/).size
>
> Thanks Steel. This worked fine. I just needed to make it a lazy
> search with .*?
>
> I've got nothing against Nokogiri or the other solutions but I was
> hoping for a solution like this that just uses the core libraries for
> portability.
>
> Cheers! Paul.
>

I would try REXML, then. It's an XML parser in the standard library.
http://ruby-doc.org/stdlib/libdoc/rexml/rdoc/index.html

I'd reserve regex parsing of XML for very informal situations where I
just need a quick, non-rigorous solution (i.e. a one-time solution that
I plan to verify personally). I am pretty sure that it is not possible
to correctly parse XML with regex.
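
For what it's worth, here is a rough sketch of the REXML approach. It assumes the page is well-formed XHTML (REXML will raise a parse error on sloppy HTML) and that the section you want can be picked out by an id attribute. The sample markup, the id "first" and the variable names are all made up for illustration, since the real page isn't shown in this thread. Also note that on Ruby 3.0 and later REXML ships as a bundled gem rather than as part of the standard library proper, so a Bundler project may need it listed in the Gemfile.

```ruby
require "rexml/document"

# Hypothetical, well-formed sample markup standing in for the real page.
html = <<~HTML
  <html>
    <body>
      <div id="first">
        <h2>first section</h2>
        <ul>
          <li>one</li>
          <li>two</li>
          <li>three</li>
        </ul>
      </div>
      <div id="second">
        <h2>second section</h2>
        <ul>
          <li>four</li>
        </ul>
      </div>
    </body>
  </html>
HTML

doc = REXML::Document.new(html)

# Select the <li> children of the <ul> inside the <div> whose id is "first".
items = REXML::XPath.match(doc, "//div[@id='first']/ul/li")

li_count = items.size          # number of list items in that section
li_texts = items.map(&:text)   # text content of each item

puts li_count
puts li_texts.inspect
```

Unlike the regex version, this only counts the list items that actually sit inside the chosen div, and you get real element objects to iterate over instead of raw lines of text.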