Gonzalo Rubio wrote:
> I created a Ruby proxy for a FoxPro app that needs to fetch data from a 
> WebService (which returns it in XML) and read it in CSV format (for 
> which I use the REXML parser and write the CSV by hand).
> To do this, the WebService returns a Base64-encoded XML document that I 
> then decode and process.
> 
> Everything works OK until there are non-ASCII characters in the XML 
> data (non-7-bit characters, e.g. Western European accented characters), 
> because the REXML parser dies complaining about a closing tag not found.
> I looked for an entity processor or a character encoding converter in 
> the standard library and I couldn't find one.
> 
> I ended up doing an ugly hack: filling a Hash with the accented character 
> as the key and the entity as the value, and then replacing back and 
> forth in the returned data.
> My function looks like this:
> 
> def iso2entities(str, inverse)
>   rep = Hash.new
>   rep['á'] = '&aacute;'
>   # ... snipped code ...
>   rep['©'] = '&copy;'
> 
>   unless inverse
>     rep.each{|code, entity| str.gsub!(code, entity) }
>   else
>     rep.each{|code, entity| str.gsub!(entity, code) }
>   end
>   return str
> end
> 
> It works, but filling the Hash by hand is time-consuming and the code 
> obviously looks like an ugly workaround... Is there a "Ruby standard" 
> way to do it?

Hey Gonzalo.  I was having the same problem.  I don't have a final 
solution yet, but part of what worked for me was changing the character 
encoding in the XML declaration (the first line of your XML file) from 
UTF-8 to ISO-8859-1.  For some reason REXML can then parse characters 
outside the 7-bit ASCII range (like é).  Hope that helps.
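
One possible explanation, and this is an assumption rather than something confirmed in this thread: the service sends ISO-8859-1 bytes while the XML declaration claims UTF-8, so the parser runs into invalid byte sequences. If that is the case, the payload can be transcoded before it reaches REXML. Below is a minimal sketch, assuming Ruby 1.9 or later (for String#encode). The helper name parse_latin1_payload and the sample document are made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical helper (not from the original posts): decode the Base64
# payload, treat the bytes as ISO-8859-1, and hand REXML a UTF-8 string.
def parse_latin1_payload(base64_payload)
  # unpack1('m') decodes Base64 and returns a binary (ASCII-8BIT) string.
  raw = base64_payload.unpack1('m')

  # Assumption: the bytes are really ISO-8859-1, whatever the XML
  # declaration says. Relabel them, then transcode to UTF-8.
  xml = raw.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

  # Make the XML declaration agree with the transcoded string so the
  # parser does not try to convert it a second time.
  xml = xml.sub(/\A(<\?xml[^>]*encoding=["'])[^"']+/i) { "#{$1}UTF-8" }

  REXML::Document.new(xml)
end

# Made-up sample standing in for the web service response: the declaration
# claims UTF-8, but the accented character is a single Latin-1 byte (0xE9).
latin1_xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Jos\xE9</name>".b
payload = [latin1_xml].pack('m0') # Base64-encode, as the service would

doc = parse_latin1_payload(payload)
puts doc.root.text
```

If the data turns out to be in a different encoding (Windows-1252 is another common one), only the force_encoding argument should need to change. REXML hands text back as UTF-8, so the CSV output may need an encode call in the other direction before FoxPro reads it.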


Russ

-- 
Posted via http://www.ruby-forum.com/.