> Running with the debugger on for 1.8.7 brings up this discrepancy:
>
> The "letters" array for 1.8.7 is this:
> ["d", "a", "m", "p", "f", "s", "c", "h", "i", "f", "f", "f", "a", "h", "r", "t", "s", "k", "a", "p", "i", "t", "\303", "\244", "n", "s", "m", "\303", "\274", "t", "z", "e", "n", "h", "a", "l", "t", "e", "r", "h", "e", "r", "s", "t", "e", "l", "l", "e", "r"]
>
> Now, "\303", "\244" is a UTF-8 encoding of umlauts-over-a (ä). In your 1.8 german
> hyphenation file, you encode the ä in it with the latin-1 encoding \344.
>
> Your input text is UTF-8, but the library searches for the latin1 encoding. Changing
> the input to \344 for ä and \374 for ü made the test pass for me on 1.8.7.

I second that analysis. It seems that to use text-hyphen in Ruby 1.8 with 
languages other than English (any language that uses characters outside 
ASCII), you have to make sure that your input is in the same character 
encoding as the language file. In the case of German, this is Latin-1. So 
opening and changing the file in your text editor has probably converted 
it to UTF-8, Austin.
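
To make the mismatch concrete, here is a minimal sketch of the two byte 
sequences. It assumes Ruby 1.9 or later, where String#encode is available; 
on 1.8 the same conversion would have to go through Iconv from the standard 
library. The helper name octal_bytes is made up for illustration and is not 
part of text-hyphen.

```ruby
# Sketch only: assumes Ruby 1.9+ (String#encode, "\u" escapes).
# octal_bytes is a hypothetical helper, not part of text-hyphen.
def octal_bytes(str)
  # Render each byte the way Ruby 1.8 shows it in inspect output.
  str.bytes.map { |b| "\\%o" % b }
end

a_umlaut = "\u00E4"   # a-umlaut, stored as UTF-8
u_umlaut = "\u00FC"   # u-umlaut, stored as UTF-8

utf8_a   = octal_bytes(a_umlaut)                       # \303 \244 (two bytes)
latin1_a = octal_bytes(a_umlaut.encode("ISO-8859-1"))  # \344 (one byte)
utf8_u   = octal_bytes(u_umlaut)                       # \303 \274
latin1_u = octal_bytes(u_umlaut.encode("ISO-8859-1"))  # \374

puts "a-umlaut: UTF-8 #{utf8_a.join(' ')}, Latin-1 #{latin1_a.join(' ')}"
puts "u-umlaut: UTF-8 #{utf8_u.join(' ')}, Latin-1 #{latin1_u.join(' ')}"
```

A byte-oriented lookup that expects \344 never matches the two-byte 
sequence \303 \244, which is what the debugger output above shows.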

Fixing the 1.8 version in the general case (any input, any language file 
encoding) will be hard... and useless, since you would program towards a 
use case that should go extinct.

More than one solution offers itself ;)

a) convert the file test_bugs.rb back to latin1 (-> bad, will break soon 
again)

b) Dig back through the old version history (I am sure you have it 
;)) and see whether [1] was specifically about German umlauts or whether 
it was just the German and the size of the word that tripped the bug. If 
it was one of the latter, then remove those damn umlauts from the word 
(ä -> ae, ü -> ue) and use the new test expectations that derive from 
that. This would make the file ASCII again, and less sensitive to editor 
conversion.

c) The solution you say you don't want: dropping 1.8 support from newer 
gems. With Bundler and RVM this is increasingly simple to manage: I'll 
just limit my old projects to an old version of text-hyphen.
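
For c), the cut-off could be declared in the gemspec, so that RubyGems 
refuses to install newer releases on 1.8. A minimal sketch; the version 
number, the summary and the exact lower bound are placeholders of mine, not 
anything decided in this thread:

```ruby
require "rubygems"

# Sketch only: version, summary and the ">= 1.9" bound are placeholder
# assumptions, not the real text-hyphen gemspec.
spec = Gem::Specification.new do |s|
  s.name    = "text-hyphen"
  s.version = "0.0.0"                 # placeholder, not a real release number
  s.summary = "placeholder"
  s.required_ruby_version = ">= 1.9"  # newer gems stop installing on 1.8
end

# Check which interpreters the requirement would admit.
on_187 = spec.required_ruby_version.satisfied_by?(Gem::Version.new("1.8.7"))
on_193 = spec.required_ruby_version.satisfied_by?(Gem::Version.new("1.9.3"))

puts "installs on 1.8.7: #{on_187}"
puts "installs on 1.9.3: #{on_193}"
```

Old projects would then pin text-hyphen to the last 1.8-compatible release 
in their Gemfile.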

Considering the impossible (aka: very laborious and rather beside the 
point) nature of the bug in 1.8, I would choose c) or (if it must be) b).
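
If it does come down to b), the rewritten test word could be produced like 
this. Again only a sketch, assuming Ruby 1.9 or later; the method name 
transliterate_umlauts is hypothetical, and the word is the one from the 
"letters" array quoted above.

```ruby
# Sketch only: assumes Ruby 1.9+ (gsub with a hash, "\u" escapes).
# transliterate_umlauts is a hypothetical helper, not part of text-hyphen.
def transliterate_umlauts(word)
  # Lowercase German umlauts and sharp s only; uppercase is not handled.
  map = {
    "\u00E4" => "ae",
    "\u00F6" => "oe",
    "\u00FC" => "ue",
    "\u00DF" => "ss"
  }
  word.gsub(Regexp.union(map.keys), map)
end

word  = "dampfschifffahrtskapit\u00E4nsm\u00FCtzenhalterhersteller"
ascii = transliterate_umlauts(word)

puts ascii              # dampfschifffahrtskapitaensmuetzenhalterhersteller
puts ascii.ascii_only?  # true
```

The hyphenation points expected by the test would have to be regenerated 
for the new spelling, since "ae" and "ue" are two letters each.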

best regards,
kaspar

[1] 
http://rubyforge.org/tracker/index.php?func=detail&aid=9807&group_id=294&atid=1195