In article <5.1.0.14.2.20021002165913.0305ae10 / zcard04k.ca.nortel.com>,
Mark Probert wrote:

> At 05:43 AM 10/3/2002 +0900, Mike wrote:
>>In article <5.1.0.14.2.20021002161707.030aecd8 / zcard04k.ca.nortel.com>,
>>Mark Probert wrote:
>> >
>> > Hi.
>> >
>> > A simple one, again.  How do I strip out all chars
>> > except valid ASCII from a file?
>>
>>What do you mean by "valid ASCII" here?  Are you really after printable
>>characters < chr(128)? e.g.
>>
>>   line.gsub!(/[^ -~]/, ' ')
> 
> My apologies, I wasn't clear.
> 
> The file is plain text that has curses detritus all the way
> through it.  A sample is:
> 
> [23DOPC Save and Restore  [0m[m
> [23DOPC Shutdown
> [19DOPC Date               [0;7m[m
> [23DPort Configuration    [0;7m[m
> [23DIP Routing Admin
> [C    Unix Shell
> [10DOPC PM Coll. Filter
> [19DTL1 Configuration
> [20DLog Archive
> 
> Your suggestion is excellent at getting rid of the \C-?
> characters.  Any ideas on the escape strings?

Ick ;-)  Here you aren't dealing with characters, but sequences.  For
example the (implied Esc)[23D is the ANSI escape sequence to move the
cursor back (left) 23 columns.  There are a "metric buttload" of escape
sequences.

If this were a Perl newsgroup then I would concoct one complex regexp,
but you might be better off doing something like this and then refining
it once it works.  First deal with the escape sequences:

    line.gsub!(/\e\[\d+[ABCD]/, ' ')      # cursor up, down, forward, back
    line.gsub!(/\e\[2J/, ' ')             # clear screen (could combine
                                          # with previous regex)

    line.gsub!(/\e\[\d+;\d+[Hf]/, ' ')    # cursor positioning
    line.gsub!(/\e\[[suK]/, ' ')          # save / restore / erase line

    line.gsub!(/\e\[\d+(?:;\d+)*m/, ' ')  # set graphics mode
    line.gsub!(/\e\[=\d+[hl]/, ' ')       # set mode

    # etc.

Then see what you have left in the line.
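
One way to "see what you have left" is to collect whatever still starts
with Esc-[ after the substitutions have run.  This is only a rough
sketch: KNOWN_SEQUENCES, strip_known and leftovers are names made up for
illustration, and the leftovers pattern is a loose guess at what an
unhandled sequence looks like, not a complete grammar.

```ruby
# The substitutions from above, gathered in one place (hypothetical name).
KNOWN_SEQUENCES = [
  /\e\[\d+[ABCD]/,         # cursor up, down, forward, back
  /\e\[2J/,                # clear screen
  /\e\[\d+;\d+[Hf]/,       # cursor positioning
  /\e\[[suK]/,             # save / restore / erase line
  /\e\[\d+(?:;\d+)*m/,     # set graphics mode
  /\e\[=\d+[hl]/           # set mode
]

# Replace every known sequence with a space; returns a new string.
def strip_known(line)
  KNOWN_SEQUENCES.inject(line) { |s, re| s.gsub(re, ' ') }
end

# List anything that still looks like Esc-[ plus parameters and a letter.
def leftovers(line)
  line.scan(/\e\[[\d;=?]*[A-Za-z]?/)
end
```

Running leftovers(strip_known(line)) over each line of the file and
printing the non-empty results should show which sequences still need a
pattern of their own.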

Some of the quantifiers might need to be * rather than + if zero or more
digits are allowed (e.g. there seem to be some \e[m and a bare \e[C in
your example, so /\e\[\d+(?:;\d+)*m/ might need to be /\e\[\d*(?:;\d+)*m/).
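
A quick way to check the difference between the two quantifiers, using a
sample string I have put together from the kind of line in your file
(the variable names are only for illustration):

```ruby
strict  = /\e\[\d+(?:;\d+)*m/   # needs at least one digit before the m
relaxed = /\e\[\d*(?:;\d+)*m/   # also accepts a bare \e[m

sample = "OPC Date \e[0;7m\e[m"

with_strict  = sample.gsub(strict, ' ')    # \e[0;7m goes, \e[m survives
with_relaxed = sample.gsub(relaxed, ' ')   # both sequences are replaced

p with_strict
p with_relaxed
```

The same change from + to * would apply to the cursor movement pattern
if you want it to catch \e[C.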

There are more codes than this; they are easy to find on the web via
Google.
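
If listing the codes one by one gets tedious, a blunter alternative (not
what I did above, and possibly more aggressive than you want) is to match
the general shape of a control sequence: Esc-[, then any parameter bytes
(0x30 to 0x3F), any intermediate bytes (0x20 to 0x2F), and one final byte
(0x40 to 0x7E).  strip_csi is a made-up name, and this sketch assumes
every sequence in the file starts with Esc-[.

```ruby
# General control-sequence shape: ESC [ parameters intermediates final.
CSI = /\e\[[0-?]*[ -\/]*[@-~]/

# Replace every control sequence in line; returns a new string.
def strip_csi(line, replacement = ' ')
  line.gsub(CSI, replacement)
end
```

For example, strip_csi(line, '') drops the sequences outright instead of
leaving a space behind for each one.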

Hope this helps,

Mike

-- 
mike / stok.co.uk                    |           The "`Stok' disclaimers" apply.
http://www.stok.co.uk/~mike/       | GPG PGP Key      1024D/059913DA 
mike / exegenix.com                  | Fingerprint      0570 71CD 6790 7C28 3D60
http://www.exegenix.com/           |                  75D2 9EC4 C1C0 0599 13DA