Description
@rmkaplan reported:
Windows may still be the outlier; Mac OS X and Unix/Linux use LF.
It’s really a question of what the default EOL convention should be when output streams are created. It shouldn’t matter for input streams, at least for the operations that read characters rather than bytes. The character-reading functions should all recognize any of the EOL conventions in files and map them into the internal CR (the value of (CHARCODE EOL)). I remember setting it up that way; it would be silly to have to know where or how a file was created before you could read it.
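As a rough illustration of that input-side behavior (not the actual Medley reader code), here is a minimal Python sketch that maps CR, LF, and CRLF all to a single internal EOL character. The names `INTERNAL_EOL`, `read_chars`, and `normalize` are hypothetical, and `INTERNAL_EOL` is only assumed to correspond to the value of (CHARCODE EOL).

```python
# Assumption: stands in for the internal EOL, i.e. CR, the value of (CHARCODE EOL).
INTERNAL_EOL = "\r"


def read_chars(text):
    """Yield characters of text, mapping CR, LF, and CRLF each to INTERNAL_EOL."""
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\r":
            # CRLF is one line ending, so consume the LF that follows the CR.
            if i + 1 < n and text[i + 1] == "\n":
                i += 1
            yield INTERNAL_EOL
        elif ch == "\n":
            yield INTERNAL_EOL
        else:
            yield ch
        i += 1


def normalize(text):
    """Return text with every EOL convention replaced by INTERNAL_EOL."""
    return "".join(read_chars(text))
```

With a reader like this, the caller never needs to know which platform wrote the file; only the output side has to pick a convention.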
I thought that Unicode might have something to say about this, but they aren’t very helpful. They point out (in the Unicode 3.0 book that I have) that the CR/LF/CRLF conventions are confused… and then they add to the confusion: they define a new code, U+2028, as the unambiguous “line separator” (along with an unambiguous “paragraph separator”, U+2029).
My Xerox XCCS book doesn’t say anything about this, so I’m not sure what the representation is in XCCS-compliant files (which would have run-codes for mixed character-set strings, but control characters are unique in any run).
My temptation is to change the default, so that we are more compatible with Unix/Mac files. We were never compatible with Windows/CRLF. If not for all files, then at least for UTF8 and UTF16 files.
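To make the narrower version of that proposal concrete, here is a small Python sketch of choosing the default output EOL from the file's external format. It is only an assumption about how such a default could be expressed; the function name `default_output_eol` and the format names passed to it are hypothetical, not existing Medley identifiers.

```python
def default_output_eol(external_format):
    """Return a hypothetical default EOL string for a newly created output stream."""
    # Treat spellings such as "utf-8" and "UTF8" as the same format name.
    name = external_format.upper().replace("-", "")
    if name in ("UTF8", "UTF16"):
        # Proposed change: LF, for compatibility with Unix/Mac files.
        return "\n"
    # Everything else keeps the existing internal convention, CR.
    return "\r"
```

Applying the change to all files instead would amount to returning LF unconditionally.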
In prowling around, I have also discovered that some of the low-level files got corrupted by the Japanese. A substantial part of the LLREAD file, for example, is filled with conversion tables for various Japanese coding systems, and this stuff is mixed into a number of other places. It should have been in separate and later files; it is hard to imagine that these would be needed in the INIT.SYSOUT.