On 19.06.2007 09:33, Alin Popa wrote:
> Alin Popa wrote:
>> Alex Young wrote:
>>> Alin Popa wrote:
>>>> Hi guys,
>>>>
>>>> After some research I still cannot find a way to tell whether a file is
>>>> plain text or binary. In fact, I want to check whether a file is plain
>>>> text no matter what characters are in it.
>>>> Is this possible with Ruby?
>>> I think so, but it's a little unclear exactly what you're trying to
>>> achieve.  Do you have an example?
>> I'm trying to do a search-and-replace on some text in files, but I don't
>> want to touch archives or other binary files.
> 
> Of course, when I'm on Windows I can go by the file extension and
> ignore some specific ones (e.g. .exe, .zip, .jar, .rar, .anything_i_want),
> but I don't know how to do it on Linux/Unix, where a file extension is
> not mandatory.

You could read the file (or a portion of it), create a histogram of
byte (or byte group) occurrences and compare that to what you
expect for text files (e.g. most chars are "0-9a-zA-Z" and punctuation).
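
A minimal sketch of that histogram idea in Ruby, in case it helps. The method name looks_like_text?, the 512-byte sample size and the 30% threshold are all assumptions picked for illustration, not established values. Treating any NUL byte as a sign of binary data is an extra heuristic that is not part of the suggestion above.

```ruby
# Hypothetical helper, not from any library: guesses whether a file is plain
# text by building a byte histogram of its first bytes.
# Assumptions: 512-byte sample, at most 30% "unusual" bytes, NUL means binary.
def looks_like_text?(path, sample_size = 512, max_odd_ratio = 0.30)
  sample = File.open(path, "rb") { |f| f.read(sample_size) }
  return true if sample.nil? || sample.empty? # an empty file counts as text here

  # Histogram: byte value => number of occurrences in the sample.
  histogram = Hash.new(0)
  sample.each_byte { |byte| histogram[byte] += 1 }

  # NUL bytes are typical for archives and executables, rare in text.
  return false if histogram[0] > 0

  # Printable ASCII plus tab, LF and CR count as "expected" text bytes.
  expected = (32..126).to_a + [9, 10, 13]
  odd = sample.bytesize - expected.sum { |byte| histogram[byte] }
  odd.to_f / sample.bytesize <= max_odd_ratio
end
```

Note that this is tuned for mostly-ASCII text: UTF-16 files contain NUL bytes and would be reported as binary, and text made up largely of non-ASCII characters may exceed the threshold, so the numbers would need adjusting for such input.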

You could also run the "file" command and parse its output.
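
A rough sketch of the "file" approach, assuming the command is installed and supports the --brief and --mime-encoding options (the widely used implementation on Linux does; other systems may differ). The method names text_encoding? and text_file? are made up for this example. With those options "file" prints a single token such as "us-ascii", "utf-8" or "binary", which is easier to parse than the default description.

```ruby
require "open3"

# Hypothetical helper: interprets the output of `file --brief --mime-encoding`.
# Anything other than a single encoding token (e.g. an error message) or the
# literal "binary" is treated as "not text".
def text_encoding?(file_output)
  encoding = file_output.strip
  return false if encoding == "binary"
  encoding.match?(/\A[\w.-]+\z/)
end

# Hypothetical helper: asks the external `file` command about a path.
# Raises Errno::ENOENT if `file` is not installed.
def text_file?(path)
  return false unless File.file?(path)
  output, status = Open3.capture2("file", "--brief", "--mime-encoding", path)
  status.success? && text_encoding?(output)
end
```

Passing the arguments to Open3.capture2 as separate strings avoids going through a shell, so file names with spaces or quotes need no escaping. How edge cases such as empty files are reported depends on the "file" implementation, so it is worth checking on the target system.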

Kind regards

	robert