Hi,

On 2010-08-24, at 9:49 AM, Michel Demazure wrote:

> Michel Demazure wrote:
>> Michel Demazure wrote:
>
>> 2. but when parsing "<foo>deuxième</foo>", I get "ème" (this was the
>> initial bug I discovered in my app).
>>
>> This is not the first time I see the "grave accented e" giving trouble
>> when scanning or parsing in ruby, whatever tool is used...
>>
> Sorry for posting again. Actually, in this last example, 'characters' is
> called twice, the first call giving "deuxi", the second one "ème".
> Strange feature, still a bug (?), but one can do with...

Actually this is allowed by the XML spec, annoying as it is. Many parsers do this when they encounter an entity (e.g. &apos;) in the input stream: you get three strings (the text before, the entity character, the text after). Some XML parsers have a parameter that tells them to join adjacent strings and report them as a single string. I don't know whether Nokogiri provides this functionality, but it might be worth a quick peek.

Cheers,
Bob

>
> _md
>
>
> --
> Posted via http://www.ruby-forum.com/.
>

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so
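If the parser offers no such option, the joining can be done in the handler itself. What follows is only a minimal sketch, not anything taken from Nokogiri: `JoiningHandler` and its `texts` accessor are hypothetical names, and the parser is simulated by calling the callbacks by hand so the example runs without any gem. The callback names (`start_element`, `characters`, `end_element`) are assumed to match the SAX document interface being used.

```ruby
# Hypothetical handler (not part of any library): collects every
# `characters` chunk for an element and joins the chunks when the
# element closes, so a split like "deuxi" + "ème" is reported once.
class JoiningHandler
  attr_reader :texts

  def initialize
    @texts = []    # [element name, joined text] for each closed element
    @buffers = []  # stack of chunk lists, one per currently open element
  end

  def start_element(name, attrs = [])
    @buffers.push([])
  end

  def characters(string)
    # Text arriving outside any element is ignored in this sketch.
    @buffers.last << string unless @buffers.empty?
  end

  def end_element(name)
    chunks = @buffers.pop
    @texts << [name, chunks.join]
  end
end

# Simulate the two-call delivery described in the thread.
handler = JoiningHandler.new
handler.start_element("foo")
handler.characters("deuxi")
handler.characters("ème")
handler.end_element("foo")

handler.texts.each { |name, text| puts "#{name}: #{text}" }
# prints "foo: deuxième"
```

With Nokogiri, the same class would presumably subclass `Nokogiri::XML::SAX::Document` and be handed to `Nokogiri::XML::SAX::Parser.new(handler).parse(xml)`. The sketch keeps one buffer per open element, so text inside a child element is reported with the child rather than merged into the parent.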