On Jun 18, 2006, at 11:51 AM, Julian 'Julik' Tarkhanov wrote:

>> Since there's been a lot of talk about Unicode lately, I thought
>> I'd throw out a Ruby library I've been working on to support
>> Unicode characters and strings based on the 4.1.0 standard and key
>> specifications from the Unicode Consortium.
>
> Holy wow. But the tables are just _huge_.

I should point out that I'm not presently using most of these tables; Unihan.txt alone is 27M. They're included purely for completeness as I've been developing the library. No doubt the actual data storage requirements can be reduced considerably.

-- 
Rob Leslie
rob / mars.org