Digest #202 2022-12-22
Contributions
These documentation requirements seem burdensome and, in many cases, of little benefit to end users. I assume they are a holdover from ANS94. Is there any intent to relax this section in the future?
Replies
u2 should always be set to the number of characters read, excluding line terminators, not just when the line has fewer than u1 characters. (A strict reading of the current wording leaves u2 undefined when the line has u1 or more characters, excluding terminators.)
If a line, excluding line terminators, is exactly u1 characters long, u2 will be set to u1 and it won't be clear if a complete line has been read or not: longer lines will also return u2 := u1.
To reliably distinguish between fully read lines and too-long lines, u1 should be selected to be one greater than the largest line length one is prepared to handle. The different cases can then be disambiguated as follows:
u2 < u1 ... a line was read completely
u2 == u1 ... the first u1 characters of the next line were read
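The convention above (choose u1 one greater than the longest acceptable line) might look roughly like this in Forth. This is only a sketch: MAX-LINE, BUF-SIZE, LINE-BUF and READ-CHECKED are hypothetical names chosen for illustration, and the limit of 80 is arbitrary. The buffer is allotted u1 + 2 characters, as the READ-LINE glossary entry asks for.

```forth
\ Sketch only; all names below are hypothetical.
80 constant max-line                       \ longest line we are prepared to handle
max-line 1+ constant buf-size              \ u1 = max-line + 1
create line-buf buf-size 2 + chars allot   \ READ-LINE wants room for u1 + 2 chars

: read-checked ( fileid -- u2 flag too-long? )
  >r line-buf buf-size r> read-line throw  ( u2 flag )
  over buf-size = ;                        \ u2 == u1: line is longer than max-line
```

With this, too-long? is true exactly when u2 equals u1, so any line of up to MAX-LINE characters is reported as completely read. flag is still false at end of file, as with plain READ-LINE.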
The whole point of the complicated specification of READ-LINE is to support reading arbitrarily long lines using buffers that may be shorter than the line. When reading a line with u1 non-terminator characters (followed by a line terminator), READ-LINE has to return u2=u1, and the next READ-LINE has to return u2=0. That's not entirely clear from the specification (so maybe we should rewrite it for more clarity), but it's the only interpretation that allows knowing that the line actually has u1 characters rather than being the start of a longer line.
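Under that interpretation, a line of any length can be consumed piecewise with a short buffer by calling READ-LINE until it returns u2 < u1. The following is a sketch, not a definitive implementation: CHUNK, CHUNK-BUF, TOTAL and LINE-LENGTH are hypothetical names, and the code assumes that a line of exactly u1 characters is followed by a READ-LINE returning u2=0, as described above. On a system that behaves differently, the loop would run on into the next line.

```forth
\ Sketch only; all names below are hypothetical.
16 constant chunk                          \ u1, deliberately small
create chunk-buf chunk 2 + chars allot     \ room for u1 + 2 characters
variable total                             \ characters seen so far in this line

: line-length ( fileid -- u flag )
  \ u = length of the next line; flag = false if nothing could be read (EOF)
  0 total !  false swap                    ( got? fileid )
  begin
    >r chunk-buf chunk r@ read-line throw  ( got? u2 more? ) ( R: fileid )
    rot over or -rot                       ( got?' u2 more? )
    over total +!                          \ add this piece's length
    swap chunk = and                       ( got?' again? )  \ full buffer, not EOF
    r> swap                                ( got?' fileid again? )
  0= until
  drop total @ swap ;                      ( u got? )
```

The loop continues only while a piece fills the whole buffer and READ-LINE's flag is true. A piece shorter than CHUNK, including the zero-length piece that follows a line of exactly CHUNK characters, marks the end of the line.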