"Michael T. Richter" <ttmrichter / gmail.com> writes:
> On Wed, 2007-22-08 at 08:13 +0900, Bertram Scharpf wrote:
>
>> The distinction between "text" and "binary" is the archetype
>> misdesign in DOS and Windows. 
>
>
> And this explains the distinction between opening binary vs. opening
> text in UNIX APIs since *LONG* before MS-DOS how?

Hmm.  So which Unix system call do you use to open a file in text
mode?
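
To make the question concrete, here is a minimal Python sketch (the
helper name and scratch file name are made up for illustration).  It
goes through the os module's thin wrappers around open(2), write(2)
and read(2).  None of the flags passed to open(2) selects a text mode,
and on a Unix system the bytes should come back exactly as written:

```python
import os
import tempfile

def roundtrip(data: bytes) -> bytes:
    """Write data via os.open/os.write and read it back via os.read."""
    # Hypothetical scratch file, used only for this demonstration.
    path = os.path.join(tempfile.mkdtemp(), "newline-demo.txt")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Ask for a little more than was written, to show nothing was added.
        return os.read(fd, len(data) + 16)
    finally:
        os.close(fd)

if __name__ == "__main__":
    sample = b"one\ntwo\r\nthree\x1a\n"
    # True means no newline or ^Z translation happened on the way.
    print(roundtrip(sample) == sample)
    # os.O_BINARY is only defined by Python on Windows.
    print(hasattr(os, "O_BINARY"))
```

The text/binary split in C's fopen() is a library-level affair; POSIX
says the "b" in the mode string has no effect.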

>> It means nothing more than
>> that in "text" mode line ends are translated from "\n" to
>> "\r\n" what is of no use but to disturb file positions and
>> string lengths. The only purpose of this is to detain
>> programmers from doing anything in a non-Microsoft way.
>> Anywhere else you don't need to care.
>> 
>> Sorry for the flame but that's the way it is.
>
> It would help if you actually said things the way they were.  This "text
> mode" vs. "binary mode" thing is a UNIX "innovation" (one of many which
> has plagued the computing world since UNIX's misdesign).  Let me
> introduce to you what "the way it is" really is....
>
> Way back in the bad old days, people talked to computers on teletype
> machines: combination printer/keyboard.  We didn't have these fancy,
> schmancy glass-screened terminals all over the place.  On these
> terminals "carriage return" meant "move the printer head to the far
> left".  "Line feed" meant "scroll the paper down one line".  These were
> completely separate actions requiring completely separate control codes.
> ("\n" is the "line feed" or "newline".  "\r" is the "carriage return".)
>
> Most systems of the day wrote everything in a single format.  There was
> no binary/text distinction.  Each line was ended by a carriage return
> and a line feed.  (I still have some of these systems up and running on
> my laptop thanks to good old SIMH.)  When you printed these files,
> whatever their contents were was run straight to the teletype and
> printed out verbatim.  That meant each line ended with "\r\n".
>
> UNIX, of course, being the half-bastard-child of real operating systems
> (MULTICS and ITS) that it was, had to do things differently.  To save on
> space (!) its creators, in their nigh-infinite wisdom and judgement,
> plagued the world with the notion of only using "\n" to terminate text
> lines in text files.  (Apparently saving one byte out of every line was

In fact, Multics used LF as newline and Unix copied the convention
from it.

> important!  Never mind that OSes on smaller machines than ever ran UNIX
> had no problem with that "wasted" carriage return....)  Of course this
> meant that you couldn't just copy the bits of a document directly to the
> teletype.  Oh, no.  You had to open the file in a special text mode so
> the OS would convert things behind the scenes for you, switching every
> "\n" into a "\r\n" before sending it off to the teletype.  This was
> perceived (incorrectly) as a Great Innovation.

You seem to be claiming that open(2) and read(2) had additional
functionality that was later removed (?).  That's interesting.  Do you
have a reference?

> Later, as the UNIX infection set in, "smart" terminals (teletypes and
> glass screen) started to, if set appropriately, automatically convert
> line feeds into carriage return/line feed combinations.  This was a
> feature added to make up for a misfeature in UNIX systems, though, not
> something that was really necessary.  (Indeed it breaks the definition
> of a line feed according to the ASCII definition thereof.)

Translation in Unix is done by the tty driver in the kernel, not the
tty device itself...
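
A rough way to watch the driver do this, assuming a Unix-like system
with pseudo-terminals: the sketch below (function name is my own)
opens a pty, sets or clears the ONLCR output flag with termios, writes
to the slave side and reads what the kernel hands to the master side.
No terminal hardware is involved at all.

```python
import os
import pty
import termios

def write_through_tty(data: bytes, translate: bool) -> bytes:
    """Write data to a pty slave; return what arrives on the master side."""
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        # attrs[1] is c_oflag.  OPOST enables output processing and
        # ONLCR asks the tty driver to map NL to CR-NL on output.
        if translate:
            attrs[1] |= termios.OPOST | termios.ONLCR
        else:
            attrs[1] &= ~termios.ONLCR
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        os.write(slave, data)
        return os.read(master, 1024)
    finally:
        os.close(slave)
        os.close(master)

if __name__ == "__main__":
    print(write_through_tty(b"hello\n", translate=True))
    print(write_through_tty(b"hello\n", translate=False))
```

With ONLCR set I would expect the "\n" to come out as "\r\n", and with
it cleared to come out untouched.  This is the same switch that
"stty onlcr" and "stty -onlcr" flip.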

> MS-DOS arrived on the scene from a different direction.  It came from
> the CP/M side of things which was itself heavily influenced by IBM's
> operating systems (scaled down, of course, to the teensy CPU that ran
> it).  CP/M?  Used the more traditional (at the time) CR/LF combinations
> found in pretty much every operating system of the day other than UNIX.

You seem to be forgetting various systems that used other newline
conventions, e.g. classic Mac OS with its bare CR.  And that EBCDIC
has a newline character (NL) of its own.

> MS-DOS was a hack off of a CP/M clone for the new 8086 processor and, as
> such, inherited CP/M's approach to text files (and command line
> switches) which itself was inherited from IBM's (and others') various
> operating systems.

...And that MS-DOS also copied from CP/M the practice of marking the
end of a text file with ^Z (CP/M recorded file lengths only in whole
128-byte records, so the real end fell somewhere inside the last one)
which,

  1. required distinguishing between text and binary files independent
     of newline conventions, and
  2. indeed breaks the definition of ^Z (SUB, "substitute") according
     to the ASCII definition thereof.
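
A toy model of point 1, not real CP/M or MS-DOS code: CP/M kept file
lengths in whole 128-byte records, so a text file's true end was
marked with ^Z.  The helper names are invented, and padding the whole
remainder with ^Z is an assumption (one common convention; some
programs wrote a single ^Z and left the rest alone).

```python
RECORD_SIZE = 128   # CP/M counted file length in 128-byte records
CTRL_Z = b"\x1a"    # ASCII SUB, used by CP/M as the text end marker

def pad_to_records(text: bytes) -> bytes:
    """Pad text with ^Z up to a whole number of records (assumed convention)."""
    remainder = len(text) % RECORD_SIZE
    if remainder == 0:
        return text
    return text + CTRL_Z * (RECORD_SIZE - remainder)

def read_as_text(raw: bytes) -> bytes:
    """Text-mode read: everything before the first ^Z, or all of it if none."""
    end = raw.find(CTRL_Z)
    return raw if end == -1 else raw[:end]

if __name__ == "__main__":
    stored = pad_to_records(b"hello\r\n")
    print(len(stored), read_as_text(stored))
    # Binary data that happens to contain 0x1A is cut short in text mode.
    print(read_as_text(b"\x00\x01\x1a\xff\xfe"))
```

The second call is the whole problem: a binary file may legitimately
contain 0x1A, so the reader has to be told which kind of file it has,
whatever the newline convention is.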

Steve