Proposal: Obsolescence for SAVE-INPUT and RESTORE-INPUT

Formal

ruv [291] Obsolescence for SAVE-INPUT and RESTORE-INPUT Proposal 2023-03-02 19:04:49

Author

Ruv

Change Log

2023-03-02 Initial version

Problem

The words SAVE-INPUT and RESTORE-INPUT are of almost no use to programs, and they only burden implementers.

These words have the following problems:

Too few guarantees to programs: RESTORE-INPUT may work or fail depending on the input source kind (file, pipe, string, keyboard).
The returned flag of RESTORE-INPUT is inconsistent with other words: true means fail, false means success. A better variant could be a throwable ior.
In some systems RESTORE-INPUT works incorrectly in some cases: it restores the position in the input buffer and returns success, but doesn't restore the content of the input buffer.
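
The inverted flag can be illustrated with a small wrapper. This is only a sketch: the names restore-input-or-abort and save-restore-demo are hypothetical, not standard words, and whether the restore succeeds at all still depends on the system and on the input source kind.

```forth
\ Hypothetical helper (not a standard word).
\ RESTORE-INPUT returns true on FAILURE, so its flag can be passed
\ directly to ABORT" which aborts when the flag is true.
: restore-input-or-abort ( xn ... x1 n -- )
  restore-input abort" RESTORE-INPUT failed" ;

\ Save and immediately restore the current input position.
\ On success the parse position is unchanged and nothing is left
\ on the stack.
: save-restore-demo ( -- )
  save-input restore-input-or-abort ;
```

A program that wants an exception instead of a flag has to add such a wrapper itself, which is the inconsistency described above.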

These words are hardly used in programs. I asked in comp.lang.forth and ForthHub. Only one program was mentioned: "Lambda Expressions in ANS Forth" by Gerry Jackson (sources), and in this case the problem can also be solved without these words.

What is sometimes required in programs is the ability to parse (extract) a text fragment from the input stream and later translate (evaluate) this fragment in the current context, regardless of whether the input stream was switched or not. Such an API should be designed separately. The words SAVE-INPUT and RESTORE-INPUT cannot help with that.

Solution

Declare the words SAVE-INPUT and RESTORE-INPUT as obsolescent, in order to destandardize them (remove them from the standard) in the next iteration.

Consequences

This change doesn't affect the standard systems. The new (or updated) standard systems can be made slightly simpler by not providing an implementation and documentation for these words (if they are not used internally).

The standard programs that employ these words gain a new environmental dependency, and later they become non-compliant with the new versions of the standard.

Proposal

In the section 1.4.2 Obsolescent features, after the phrase "This standard designates the following word as obsolescent:", add:

6.2.2182 SAVE-INPUT

6.2.2148 RESTORE-INPUT

In each of the glossary entries 6.2.2182 SAVE-INPUT and 6.2.2148 RESTORE-INPUT add the following note:

Note:

This word is obsolescent and is included as a concession to existing implementations.

JohanKotlinski [r999] 2023-03-03 07:04:18

SAVE-INPUT/RESTORE-INPUT are used in some user-space code, for example in Gforth assemblers + cross-compilers. Example:

: .times{  ( n -- input n )
  dup >r 1 > IF  save-input  THEN  r> ;
: .}times  ( input n -- input n-1 / 1 / )
  1- dup 0>
  IF  >r restore-input throw r@ 1 >
      IF  save-input  THEN  r>
  THEN ;
:D

I can only imagine one workaround, which is to copy input to a buffer, and EXECUTE it from there. Obviously, such a workaround increases CPU+RAM usage.

ruv [r1000] 2023-03-04 20:37:49

SAVE-INPUT/RESTORE-INPUT are used in some user-space code, for example in Gforth assemblers + cross-compilers.

It's a good finding. The words .times{ and .}times are formally defined in standard Forth, although, as far as I can see, they are not used anywhere.

One problem with this construct is that it can work or fail without any explanation, since the standard does not guarantee any conditions under which restore-input shall work. So the construct .times{ ... .}times is actually a system-specific means (or it has an environmental dependency concerning the behavior of restore-input).

Another problem with this construct is that the body is performed at least once regardless of n. To correctly implement the case of n=0, we need to parse the input stream till the corresponding closing .}times (with possible nesting). But if we parse the input stream in this way, we can save the extracted text into a buffer and just translate (evaluate) it the given number of times, without employing save-input/restore-input.
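
A minimal sketch of this buffer-based approach, assuming a Forth-2012 system with PARSE-NAME, REFILL and COMPARE. All names here (times{, }times as a terminator token, body, #body, +body, next-token) are hypothetical. The sketch does not handle nesting, and it collects the body token by token, so runs of spaces inside string literals are collapsed and a line comment inside the body would swallow the rest of it.

```forth
1024 constant /body              \ assumed maximum body size
create body /body chars allot    \ buffer for the collected loop body
variable #body                   \ number of characters collected

\ Append one token and a separating space to the buffer.
: +body ( c-addr u -- )
  dup #body @ + 1+ /body u> abort" loop body too long"
  tuck  body #body @ chars +  swap move
  #body +!
  bl  body #body @ chars + c!  1 #body +! ;

\ Return the next name, refilling the input buffer across lines.
: next-token ( -- c-addr u )
  begin  parse-name dup 0= while
    2drop refill 0= abort" missing }times"
  repeat ;

\ Collect everything up to the token }times , then evaluate it n times.
\ With n=0 the body is parsed but never performed.
: times{ ( n "body }times" -- )
  0 #body !
  begin  next-token 2dup s" }times" compare  while  +body  repeat
  2drop
  0 ?do  body #body @ evaluate  loop ;
```

For example, `3 times{ 1 . }times` should display 1 three times, and `0 times{ 1 . }times` should display nothing, without any use of save-input or restore-input.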

I believe we should have an API that helps in such tasks. Under the hood it can save the content or the position (depending on the input source kind) and provide a uniform interface. Actually, such an API can even be implemented in a portable way if we standardize two things:

When the input stream is a string, the line comment shall skip up to the nearest line terminator or to the end of the input buffer, whichever is encountered first (see a comment).
A program should not depend on the input source kind from which it's translated. That is, it should not rely on the input buffer containing a single line, or on refill reading a single line (see Portable line-oriented parsing). Probably, some helper words should be standardized for that.

Obviously, such a workaround increases CPU+RAM usage.

It depends on how save-input is implemented. A reliable implementation may also save a fragment of the input stream into memory. OTOH, repeatedly reading from the file can take more CPU than reading from memory.

GeraldWodni [r1077] 2023-09-14 15:33:36

The committee considers this proposal formal and asks the author to change its status to "CfV - Call for Votes" whenever he deems it ready.

Note: The committee would like to point out that these words cannot be made informal, as they are used to implement interpreted loops.

Formal

albert [r1128] 2023-11-07 09:16:20

There is a technique to reuse input that doesn't disturb the current input stream. That is saving and restoring >IN. This may be restricted to the current input buffer, but that may cover a substantial part of the cases.
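
A minimal sketch of the >IN technique, assuming a Forth-2012 system that provides PARSE-NAME. The name peek-name is hypothetical. It only works while the text of interest stays in the current input buffer.

```forth
\ Hypothetical word: return the next name in the parse area
\ without consuming it, by saving and restoring >IN .
: peek-name ( "name" -- c-addr u )
  >in @ >r        \ remember the current parse position
  parse-name      \ c-addr u : the next name in the input buffer
  r> >in ! ;      \ rewind, so the name will be parsed again
```

After `peek-name foo`, the string foo is on the stack and the text interpreter still goes on to interpret foo as usual. The string points into the input buffer, so it should be used or copied before the buffer is refilled.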

Clearly the original intent was a possibility to be a factor of INCLUDE , interrupting the current input stream. However as the proposal points out, this is not going to be a useful system word, rather a later burden. I would call this "meddling in matters that should be up to the implementer". This kind of words should be exterminated from the standard. They are almost never needed, and hard to implement. So obsolete these words!

albert [r1135] 2023-11-13 20:26:44

I don't understand what this proposal has to do with "interpreted loops" . All noforths and ciforths (Dutch Forths) have interpreted loops with using SAVE-INPUT and RESTORE-INPUT.

albert [r1136] 2023-11-13 20:27:07

I don't understand what this proposal has to do with "interpreted loops" . All noforths and ciforths (Dutch Forths) have interpreted loops without using SAVE-INPUT and RESTORE-INPUT.

ruv [r1237] 2024-06-17 19:42:29

what this proposal has to do with "interpreted loops"

As far as I know, "interpreted loops" are not used in standard programs (i.e., Forth programs that are independent of any particular Forth system). And the systems that provide save-input and restore-input may continue to provide these words.

An alternative way to implement an interpreted loop is to parse the input stream (the loop body) into a buffer and evaluate the buffer.

Concerning possible problems, see #217 New Line characters in a string passed to EVALUATE.


albert [350] SAVE-INPUT Comment 2024-07-10 09:40:03

If you have a Forth that always slurps the file for including and locks blocks that are interpreted in memory, life becomes much easier. I remember testing the proper handling of exceptions coming from a string evaluated from a block that was loaded from a file. That was in transputer forth, and Marcel Hendrix was able to pull that off. I am not sure I could do that flawlessly in ciforth, and honestly I'm not sure tforth was defect-free despite the elaborate testing.

If everything is in memory you can get away with SAVE and RESTORE. They remember the start, the end and the current parse pointer, and restoring them is a breeze. E.g.

: EXECUTE-PARSING ROT ROT SAVE SET-SRC CATCH RESTORE THROW ;

A simple example, counting the words in a string,

SAVE SET-SRC 0 BEGIN NAME NIP WHILE 1+ REPEAT RESTORE

shows that you don't have to manipulate execution tokens, as long as SET-SRC can make a given string the input buffer.

I can't propose to replace SAVE-INPUT and RESTORE-INPUT by SAVE and RESTORE, because not having REFILL and line by line compilation is too revolutionary. Most modern Forths slurp files for include nowadays, however.

ruv [r1266] 2024-07-11 12:53:42

 : EXECUTE-PARSING ROT ROT SAVE SET-SRC CATCH RESTORE THROW ;

The standard does not allow save-input and restore-input to be used this way, because the identity of the input source on restore-input will not be the same as on save-input (but it shall be the same).

not having REFILL and line by line compilation is too revolutionary.

Not quite. When the input source is a buffer, refill loads the next buffer (not the next line). When the input source is a multiline string, refill does almost nothing (see also my proposal for "\").

So, not having "line by line compilation" is not something revolutionary.
