On Mar 31, 2:59 pm, Nicolas Pioupiou <nicolas.e... / gmail.com> wrote:
> Hi everybody,
>
> I'm searching for a way to write beautiful code that parses an HTML
> table.
>
> In fact, the table is dynamic.
> It always has three columns but a variable number of rows.
>
> In each row (<tr></tr>) I want to extract the information inside the
> columns (<td></td>), and then create a new object from that
> information.
>
> I did it by splitting my HTML source with split("<tr>") and using
> regexps to extract what I want. But this solution does not satisfy
> me. It's unmaintainable.
>
> However, I'm pretty sure that I could write cleverer code...
>
> Does anyone have an idea, a clue, a thought?

Use a real HTML parser such as Nokogiri. Example:

#---
require 'nokogiri'

html = <<eohtml
<html>
<body>
<table>
  <tr>
    <td>One</td><td>Two</td><td>Three</td>
  </tr>
</table>
</body>
</html>
eohtml

doc = Nokogiri::HTML(html)

# Visit every <tr> and print the text node of each of its <td> children.
doc.search('//tr').each do |line|
  puts line.search('td/text()')
end

#---
Output:
One
Two
Three