Digest #7 2016-03-22

Contributions

[16] 2016-03-21 02:27:27 BerndPaysan wrote:

comment - 3.3.3.5 Input buffer: "A program shall not write into the input buffer."

There are a number of reasons why this might have undesirable side-effects or not work.

In EVALUATE, the input buffer is the actual string, likely in the dictionary. Writing to it may modify that memory permanently, so a second call will use the modified version. If the input string is in flash, writing might not be possible, or might result in unexpected behavior (writing to flash only flips bits in one direction, usually from 1 to 0; erasing is only possible in larger blocks).
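
The EVALUATE case can be sketched as follows. This is deliberately a non-standard program, since it writes into the input buffer, and it assumes a system that evaluates the string in place. The names ZAP, CMD-BUF, INIT-CMD and CMD are hypothetical; the text is kept in a CREATEd buffer so that the write itself goes to ordinary writable data space.

```forth
\ ZAP overwrites the first character of the current input buffer with "9".
: ZAP  ( -- )  [CHAR] 9 SOURCE DROP C! ;

\ Keep the source text in writable data space (5 characters: "1 ZAP").
CREATE CMD-BUF  5 CHARS ALLOT
: INIT-CMD  ( -- )  S" 1 ZAP" CMD-BUF SWAP CHARS MOVE ;
: CMD  ( -- c-addr u )  CMD-BUF 5 ;

INIT-CMD
CMD EVALUATE .   \ first run: the text is still "1 ZAP"
CMD EVALUATE .   \ second run: on an in-place system the text is now "9 ZAP"
```

On a system that copied the string before interpreting it, both runs would see the original text; that difference is exactly the portability problem.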

On blocks, the input buffer is the actual block buffer; modifying it can result in the changes being written back to disk (if you UPDATE the block later, or if the block is memory-mapped, so that no UPDATE is necessary).

On files, the input buffer may also be mapped into memory, as loading the whole file and then only scanning for newlines may be faster than reading the file line-by-line. Here, the file may be mapped read-only or copy-on-write, which either results in a memory access exception when writing, or in no permanent change on disk.

For terminal input, which is obviously writable use-once memory, it is likely that the change is predictable.

Therefore, it is not a good idea in general to write into the input buffer: the possible reactions are non-portable and the side effects are unlikely to be desirable.


[17] 2016-03-21 02:32:54 BerndPaysan wrote:

testcase - Check that during EVALUATE, SOURCE is the string itself, not a copy

: GS1 S" SOURCE" ;
T{ GS1 EVALUATE -> GS1 }T
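
For readers without the test harness, the same check can be sketched in plain Forth. SAME-STRING? is a hypothetical helper that compares two address/length pairs for identity (same address and same length), not for equal contents.

```forth
: GS1  ( -- c-addr u )  S" SOURCE" ;

\ True only if both pairs denote the very same memory region.
: SAME-STRING?  ( c-addr1 u1 c-addr2 u2 -- flag )
  ROT = >R  = R> AND ;

\ Evaluating "SOURCE" leaves the address and length of the input buffer,
\ which should be the string passed to EVALUATE.
GS1 EVALUATE GS1 SAME-STRING? .   \ displays a true flag if evaluated in place
```

A system that copies the string before interpreting it would report a different address and therefore a false flag.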

Replies

[r18] 2016-03-21 13:26:08 AntonErtl replies:

comment - 3.3.3.5 Input buffer: "A program shall not write into the input buffer."

If a file is mapped copy-on-write (and writable), then writing there is relatively harmless. One harmful (if unintended, otherwise just unportable) case is if the file is mapped shared (and writable); then the write will eventually change the source file. Alternatively, the file could be mapped read-only; then the write would cause an exception, which is probably not intended.

However, if we wanted to tighten the standard in this respect, it would be easy to require and implement that the input buffers from files are private and writable; but the other reasons still exist. In particular, the Forth-94 TC discussed this at length in RFI 6, which lists some more reasons, among them:

Storing into 'input buffers' is disallowed because we permit input sources to nest indefinitely and it is not practical for systems that conserve resources to guarantee unique concurrent addressability of all nested input sources, nor is it practical to create separate save areas for all current input buffers just in case someone stored into one of them. The TC specifically intends that, when input is coming from refreshable sources, implementations may refresh their buffers on un-nesting to conserve resources, and that when logically possible implementations may use transient, shared buffers (as is common practice with LOAD on multiprogrammed systems.)

[...]

The TC expects all Systems to process buffers provided by EVALUATE in place. This is logically necessary, in our view, since there are no upper limits on the lengths of these buffers. Since it is semantically permissible to describe more than half of addressable memory in an EVALUATE string it is not in general possible to copy such a string elsewhere and address it consistently with the definition of SOURCE .

The Forth-94 committee also discusses a possible tightening with respect to EVALUATE:

Given these conditions, it is deterministic for an application to store (with great care) into EVALUATE buffers that it knows to be active, although such methods pertain exclusively to EVALUATE and certainly not to any other input stream source.