ashbb shoeser wrote in post #1061553:
> Hi Sebastjan,
>
>> Now the process runs through
> Good!
>
>> the stuff that is written in the file is mixed with
>> the legacy content
> Me too.
> But your code re-open the file with 'a+' mode.
> So, I think this is a normal behavior.
>
>> the new content is again jibberish.
> Ah,... what does that mean?

Actually, the new content is not even written to the file, but the file is still encoded as Unicode, so some special characters in my language ( and ) are not displayed correctly. For example, "" is printed out like ڵڵط

> I got the file mixed the following:
>
>    Here are the unused characters:
>    "&a", "&b", "&c", "&d", "&e", .....

If this is at the end of your file, then this is correct. I don't get any of the added content written anywhere in the file.

> Do you mean that this is jibberish?

Jibberish: ڵط

> Sorry, I don't understand what you want to do correctly.

1. Input file: two column tab delimited and Unicode encoded
2. Replace the first column with ""
3. Run the rest of the code (finding duplicates, used and unused characters)
4. Write the unused characters to the input file
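
To make steps 2 and 3 concrete, here is a minimal sketch that works on lines already read into memory. The helper names are hypothetical and not taken from dup_app.rb, and it assumes a character counts as "used" when it appears anywhere in the second column, which may differ from what the attached code does. The replacement string is passed in as a parameter.

```ruby
# Hypothetical helpers; the real logic lives in dup_app.rb.

# Step 2: replace the first tab-separated column of every line.
def replace_first_column(lines, replacement)
  lines.map do |line|
    _first, rest = line.chomp.split("\t", 2)
    [replacement, rest].compact.join("\t")
  end
end

# Step 3: second-column values that occur more than once.
def duplicates(lines)
  values = lines.map { |line| line.chomp.split("\t", 2)[1] }.compact
  values.group_by { |v| v }.select { |_, group| group.size > 1 }.keys
end

# Step 3: candidates that never appear in the second column.
# Assumption: "used" means "appears somewhere in column two".
def unused_characters(lines, candidates)
  text = lines.map { |line| line.chomp.split("\t", 2)[1].to_s }.join
  candidates.reject { |c| text.include?(c) }
end
```

Keeping these steps free of file access makes it easier to check them separately from the encoding problem.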

I've attached the code, which is compiled as a *shy app again.

I know the main issue is that my input file is Unicode encoded, but I get it from another program that supports only Unicode.
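
For reference, a sketch of how the file could be read and appended to without mixing encodings. It assumes that "Unicode" here means UTF-16LE with a byte order mark, which is what many Windows programs write under that name; I have not confirmed this for my file. The method names are hypothetical. If the app appends UTF-8 text to a UTF-16 file, that could be one cause of the garbled characters.

```ruby
# Assumption: the input file is UTF-16LE, optionally starting with a BOM.

# Read the file as raw bytes, decode to UTF-8, and return its lines.
def read_unicode_lines(path)
  raw  = File.binread(path)
  text = raw.force_encoding("UTF-16LE").encode("UTF-8")
  text = text.sub(/\A\uFEFF/, "")   # drop the byte order mark if present
  text.split(/\r?\n/)
end

# Append one line in the file's own encoding.
# Binary mode ("ab") keeps Ruby from translating the bytes on Windows.
def append_unicode_line(path, line)
  File.open(path, "ab") do |f|
    f.write("#{line}\r\n".encode("UTF-16LE"))
  end
end
```

With this approach the strings inside the program are UTF-8, and the conversion happens only at the two points where the file is touched.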

Thank you for your patience:)

Two more notes:
- the *shy app is about 420 MB in size. Is that normal?
- the *shy app takes quite some time to load. Is that normal?

regards,
seba

Attachments:
http://www.ruby-forum.com/attachment/7415/dup_app.rb