On Jun 18, 2006, at 11:51 AM, Julian 'Julik' Tarkhanov wrote:
>> Since there's been a lot of talk about Unicode lately, I thought  
>> I'd throw out a Ruby library I've been working on to support  
>> Unicode characters and strings based on the 4.1.0 standard and key  
>> specifications from the Unicode Consortium.
>
> Holy wow. But the tables are just _huge_.

I should point out that I'm not presently using most of these tables;
Unihan.txt alone is 27 MB. They're included purely for completeness
while I develop the library.

No doubt the actual data storage requirements can be reduced  
considerably.
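
As a rough illustration of one way such a reduction might work (a
sketch only, not what the library currently does): many Unicode
properties are constant over long stretches of consecutive code
points, so a per-code-point table can be collapsed into runs and
searched with a binary search. The names compress_runs and lookup are
hypothetical, the sample data is made up for the example, and
Array#bsearch assumes Ruby 2.0 or later.

```ruby
# Hypothetical helper, not part of the library: collapse a list of
# [code_point, value] pairs, sorted by code point, into
# [first, last, value] runs.
def compress_runs(pairs)
  pairs.each_with_object([]) do |(cp, value), runs|
    last = runs.last
    if last && last[2] == value && last[1] + 1 == cp
      last[1] = cp             # extend the current run
    else
      runs << [cp, cp, value]  # start a new run
    end
  end
end

# Hypothetical helper: find the value for a code point in the
# compressed table, or nil if no run covers it.
def lookup(runs, cp)
  # First run whose last code point is >= cp (runs are sorted).
  run = runs.bsearch { |_first, last, _value| cp <= last }
  run && run[0] <= cp ? run[2] : nil
end

# Made-up sample: general categories for a few Latin letters.
pairs = [[0x41, "Lu"], [0x42, "Lu"], [0x43, "Lu"],
         [0x61, "Ll"], [0x62, "Ll"]]
runs = compress_runs(pairs)
```

Five entries become two runs here; how much a real table shrinks
depends entirely on how the property is distributed.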

-- 
Rob Leslie
rob / mars.org