On Fri, Jul 15, 2011 at 1:46 AM, Michael Edgar <adgar / carboni.ca> wrote:
> On Jul 15, 2011, at 12:45 AM, Austin Ziegler wrote:
>> I've had folks asking me for a release of text-hyphen that works with
>> Ruby 1.9, and while I've got something that passes the tests that I've
>> created and added for MRI 1.9, it *loses* compatibility with Ruby
>> 1.8.7 (and does so loudly in the tests) and JRuby (in either 1.8 or
>> 1.9 mode, it appears). I need some help to get the last bits ready,
>> because I'm not ready to drop Ruby 1.8 entirely (at least one more
>> version).
>
> Running with the debugger on for 1.8.7 brings up this discrepancy:
>
> The "letters" array for 1.8.7 is this:
> ["d", "a", "m", "p", "f", "s", "c", "h", "i", "f", "f", "f", "a", "h",
>  "r", "t", "s", "k", "a", "p", "i", "t", "\303", "\244", "n", "s", "m",
>  "\303", "\274", "t", "z", "e", "n", "h", "a", "l", "t", "e", "r", "h",
>  "e", "r", "s", "t", "e", "l", "l", "e", "r"]
>
> Now, "\303", "\244" is a UTF-8 encoding of umlauts-over-a (ä). In your 1.8 german
> hyphenation file, you encode the ä in it with the latin-1 encoding \344.
>
> Your input text is UTF-8, but the library searches for the latin1 encoding. Changing
> the input to \344 for ä and \374 for ü made the test pass for me on 1.8.7.

I think you're right. Now to figure out how to fix it properly in this case.

-a
-- 
Austin Ziegler
halostatue / gmail.com
austin / halostatue.ca
http://www.halostatue.ca/
http://twitter.com/halostatue
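
For reference, the encoding mismatch described in the quoted message can be reproduced in a few lines. This is only a sketch, assuming Ruby 1.9 or later (it relies on String#encode and "\u" escapes, which 1.8.7 does not have). The `octal` helper is a hypothetical name used for illustration and is not part of text-hyphen.

```ruby
# Sketch only: shows why UTF-8 input does not match Latin-1 pattern data.
# `octal` is a hypothetical helper, not part of text-hyphen.
def octal(str)
  # Render each byte as a backslash-octal escape, the way 1.8.7 inspects it.
  str.bytes.map { |b| format("\\%o", b) }
end

utf8_a = "\u00E4"                        # a-umlaut, encoded as UTF-8
utf8_u = "\u00FC"                        # u-umlaut, encoded as UTF-8
latin1_a = utf8_a.encode("ISO-8859-1")   # same character, one Latin-1 byte
latin1_u = utf8_u.encode("ISO-8859-1")

puts octal(utf8_a).join(" ")    # \303 \244
puts octal(latin1_a).join(" ")  # \344
puts octal(utf8_u).join(" ")    # \303 \274
puts octal(latin1_u).join(" ")  # \374

# A byte-wise comparison of the two encodings fails, so a lookup keyed on
# Latin-1 bytes cannot find UTF-8 input.
puts utf8_a.bytes.to_a == latin1_a.bytes.to_a  # false
```

Transcoding the input to the encoding of the pattern data (or the pattern data to the encoding of the input) before matching is one possible way to make the two sides agree. Whether that is the right fix for text-hyphen is left open in the thread.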