Digest #5 2016-02-03

Contributions

[14] 2016-02-02 15:47:01 AntonErtl wrote:

comment - Dealing with newlines

Up to Gforth 0.4, we used the C approach to text files: let the C library translate between OS-dependent newlines in the file and one newline character (typically LF) in memory on input and on output. That approach turned out to cause problems when dealing with CRLF-containing files in combination with READ-FILE and REPOSITION-FILE (among other cases), because READ-FILE referred to the in-memory length, while REPOSITION-FILE referred to the in-file length.
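To make the two lengths concrete, here is a minimal sketch in standard Forth (File-Access words plus D. from the Double-Number word set). It does not reproduce the old text-mode behaviour; it only shows, on a binary-mode file, that the length of a line in memory and the number of bytes consumed from the file are different quantities. The file name crlf-demo.tmp and the words crlf-bytes, line-buf, make-crlf-file, first-line-sizes and show-sizes are made up for illustration.

```forth
\ Sketch only: all names below are illustrative, not part of any system.
create crlf-bytes               \ the 8 bytes  a b CR LF c d CR LF
  char a c, char b c, 13 c, 10 c,
  char c c, char d c, 13 c, 10 c,

create line-buf 82 allot        \ READ-LINE needs room for u1+2 characters

: make-crlf-file ( -- )         \ write the 8 bytes unchanged
  s" crlf-demo.tmp" w/o bin create-file throw >r
  crlf-bytes 8 r@ write-file throw
  r> close-file throw ;

: first-line-sizes ( -- u-chars ud-bytes )
  s" crlf-demo.tmp" r/o bin open-file throw >r
  line-buf 80 r@ read-line throw drop   \ length of the first line in memory
  r@ file-position throw                \ file offset where the next line starts
  r> close-file throw ;

: show-sizes ( -- )
  first-line-sizes
  ." file offset after first line: " d. cr
  ." length of first line in memory: " . cr ;

make-crlf-file show-sizes
```

On a system whose READ-LINE treats CRLF as one newline, one would expect a line length of 2 and a file offset of 4. That gap is why an offset passed to REPOSITION-FILE is better taken from FILE-POSITION than computed from the number of characters read.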

So, in Gforth 0.5 we switched to opening all files as binary files (whether BIN is used in fam or not); READ-LINE recognizes all three kinds of newlines (LF, CR, and CRLF), and CR and WRITE-LINE output the standard newline of the platform (LF on Unix, CRLF on Windows). If the user reads text files with READ-FILE or writes them with WRITE-FILE, they have to worry about newlines themselves.
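As an illustration of how a program can stay ignorant of the newline representation, here is a minimal sketch of a line-by-line file copy that uses only READ-LINE and WRITE-LINE. The names max-line, copy-buf and copy-lines are made up for this example, and the 256-character limit is an arbitrary assumption.

```forth
\ Sketch only: MAX-LINE, COPY-BUF and COPY-LINES are illustrative names.
256 constant max-line
create copy-buf  max-line 2 + allot    \ READ-LINE needs two extra characters

: copy-lines ( fid-in fid-out -- )
  >r >r                                \ R: fid-out fid-in
  begin
    copy-buf max-line r@ read-line throw   ( u flag )
  while                                ( u )
    copy-buf swap  2r@ drop            \ c-addr u fid-out
    write-line throw                   \ appends the system's newline
  repeat
  drop  2r> 2drop ;                    \ drop the final u=0, clean up R
```

The word never sees a newline character: whatever READ-LINE accepts as a line terminator on input, the output gets whatever WRITE-LINE produces on the current platform. Two limitations of the sketch: a line longer than max-line is split into several output lines, and a last line without a terminator gains one.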

The experience with this new (well, by now, 16-year-old) approach is positive; no problems have been reported, and the problems we had with the previous approach were solved.

This approach works so well because Forth has tended to avoid dealing with newlines as characters or strings: we have CR and WRITE-LINE for outputting a newline, and READ-LINE and ACCEPT for inputting lines. In all these places the actual value of the newline is abstracted away. The C approach, OTOH, is due to the fact that in the Unix roots of C, newline was visible as a single character, and they wanted to make programs written for that model run on OSs that have CRLF newlines.

Replies

[r16] 2016-01-15 20:43:41 enoch replies:

comment - Proposal: end-case

You meant, no doubt, dup endcase. There is no point in rehashing the old arguments for and against endcase's current behavior. What is needed, IMHO, is for the committee to invite user code statistics. In my applications endcase and end-case find equal use.