On Mar 28, 6:11 pm, Adam Akhtar <adamtempor... / gmail.com> wrote:
> Hi, I'm starting to use Hpricot and I'm having problems extracting
> descriptions of various concert events from a page. Here is a sample of
> the HTML:
>
> <p>
> <a name="concerts"/>
> <span class="heading">Concerts</span>
> <br/>
> <span class="subheading">POPULAR</span>
> <br/>
> <br/>
> <span class="textbold">Middle Field! Vol.4</span>
> <br/>
> Featuring electric-pop band The Stealth, Mac and Masaru, and others. Mar
> 28, 7pm, ,500 (adv)/ ¥3,000 (door). Shibuya O-Nest. Tel: 03-3498-9999.
> <br/>
> <br/>
> <span class="textbold">Philip Woo featuring Brenda Vaughn</span>
> <br/>
> Japanese pianist and soul singer performing with Andy Wulf and Kaori
> Kobayashi. Mar 28 & 29, 7 & 9:30pm, ¥3,150. Cotton Club, Marunouchi.
> Tel: 03-3215-1555.
> <br/>
> ..
> ..
> ..
> etc
>
> I can get the artist/band names fine using
> names = doc.search("//span[@class='textbold']")
>
> but I can't get the descriptions. In fact the descriptions aren't
> individually wrapped in any tags but rather just clumped together
> under the paragraph tag with line breaks (<br/>).
>
> So I thought I'd just try
> descriptions =
> doc.search("/html/body/div/table/tbody/tr[4]/td/table/tbody/tr/td[2]/table/tbody/tr/td/span/p")
> but when I try to puts descriptions nothing is printed to the screen.
>
> How would I go about getting this info? Any tips or ideas?
>
> Thanks
> --
> Posted via http://www.ruby-forum.com/.

Once you have the 'name' node you can use next_node to step through the
nodes that follow it in the document. The description is the first text
node after the name that actually contains words, so you only need to
skip the <br/> elements and whitespace in between.

This method should work for your example:

  def print_names_and_descriptions(hpricot_doc)
    names = hpricot_doc.search("//span[@class='textbold']")
    names.each do |name|
      # Walk forward past <br/> elements and whitespace-only text
      # until we reach a text node containing word characters.
      node = name.next_node
      node = node.next_node until node.nil? || (node.text? && node.inner_text =~ /\w+/)
      next if node.nil?  # nothing follows this name

      puts name.inner_text
      puts node.to_s.strip
      puts
    end
  end
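
If you want to try the same sibling-walking idea without Hpricot installed, here is a rough sketch using REXML instead. This is a swapped-in parser, not the Hpricot API: REXML calls the method next_sibling_node rather than next_node, and it only accepts well-formed XML, so the SAMPLE string below is a hypothetical cut-down copy of your page with the bare "&" characters left out. The names SAMPLE and names_and_descriptions are made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical, well-formed excerpt of the page from the question.
SAMPLE = <<~XML
  <p>
    <span class="textbold">Middle Field! Vol.4</span>
    <br/>
    Featuring electric-pop band The Stealth, Mac and Masaru, and others.
    <br/>
    <br/>
    <span class="textbold">Philip Woo featuring Brenda Vaughn</span>
    <br/>
    Japanese pianist and soul singer performing with Andy Wulf and Kaori Kobayashi.
    <br/>
  </p>
XML

# Returns an array of [name, description] pairs.
# The description is nil when no wordy text node follows the name.
def names_and_descriptions(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//span[@class='textbold']").map do |name|
    node = name.next_sibling_node
    # Skip <br/> elements and whitespace-only text nodes.
    until node.nil? || (node.is_a?(REXML::Text) && node.value =~ /\w/)
      node = node.next_sibling_node
    end
    [name.text, node && node.value.strip]
  end
end

names_and_descriptions(SAMPLE).each do |name, description|
  puts name
  puts description
  puts
end
```

Returning pairs instead of printing inside the loop makes it easier to reuse the result (for example to build a hash or write CSV). Note that, as with the Hpricot version, a name with no description of its own would pick up the next description further down the paragraph.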