6.1.2050 QUIT CORE

( -- ) ( R: i * x -- )

Empty the return stack, store zero in SOURCE-ID if it is present, make the user input device the input source, and enter interpretation state. Do not display a message. Repeat the following:

  • Accept a line from the input source into the input buffer, set >IN to zero, and interpret.
  • Display the implementation-defined system prompt if in interpretation state, all processing has been completed, and no ambiguous condition exists.

See:

Implementation:

: QUIT
   ( empty the return stack and set the input source to the user input device )
   POSTPONE [
     REFILL
   WHILE
     ['] INTERPRET CATCH
     CASE
     0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
     -1 OF ( Aborted ) ENDOF
     -2 OF ( display message from ABORT" ) ENDOF
     ( default ) DUP ." Exception # " .
     ENDCASE
   REPEAT BYE
;

This assumes the existence of a system-implementation word INTERPRET that embodies the text interpreter semantics described in 3.4 The Forth text interpreter. Further discussion of the interpret loop can be found in A.6.2.0945 COMPILE,.

Contributions

alextangent [76] The reference implementation is incorrect (Suggested reference implementation, 2019-03-16 14:59:29)

: QUIT
   ( empty the return stack and )
   ( set the input source to the user input device )
   POSTPONE [
   BEGIN ( missing in the reference implementation)
     REFILL
   WHILE
     ['] INTERPRET CATCH
     CASE
     0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
     -1 OF ( Aborted) ENDOF
     -2 OF ( display message from ABORT" ) ENDOF
     ( default ) DUP ." Exception # " .
     ENDCASE
   REPEAT BYE
;

Reported by marek.brunda at gmail.com

BerndPaysan

Almost. I would put the BEGIN in before the POSTPONE [.

: QUIT
   ( empty the return stack and set the input source to the user input device )
   BEGIN
     POSTPONE [
     REFILL
   WHILE
     ['] INTERPRET CATCH
     CASE
     0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
     -1 OF ( Aborted ) ENDOF
     -2 OF ( display message from ABORT" ) ENDOF
     ( default ) DUP ." Exception # " .
     ENDCASE
   REPEAT BYE
;

BerndPaysan

Ok, looking at it once more: Only in the error exit POSTPONE [ is necessary. So:

: QUIT
   ( empty the return stack and set the input source to the user input device )
   POSTPONE [
   BEGIN
     REFILL
   WHILE
     ['] INTERPRET CATCH
     CASE
     0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
     POSTPONE [
     -1 OF ( Aborted )  ENDOF
     -2 OF ( display message from ABORT" ) ENDOF
     ( default ) DUP ." Exception # " .
     ENDCASE
   REPEAT BYE
;

AntonErtl

[r327] still misses the clearing of the data stack as discussed in [r1154]. The present version also makes the clearing of the exception stack explicit (the parenthesis could be misunderstood as describing the following code).

One other thing to note here is that handler (from the reference implementation of throw) is not reset, because there is never a throw that goes below the exception frame pushed by the catch here; because in the reference implementation the return stack is used as the exception stack, no additional work is necessary to clear the exception stack (partially addressing [259]; however, the specification of quit also needs to be updated).

: QUIT
   RP0 @ RP! ( empty the return stack )
   ... ( set the input source to the user input device )
   POSTPONE [
   BEGIN
     REFILL
   WHILE
     ['] INTERPRET CATCH
     CASE
     0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
     POSTPONE [
     -1 OF ( Aborted )  ENDOF
     -2 OF ( display message from ABORT" ) ENDOF
     ( default ) DUP ." Exception # " .
     ENDCASE
     SP0 @ SP! ( perform the "clear the data stack" action of ABORT )
   REPEAT BYE
;

where sp0 and rp0 are (user) variables containing the addresses that point to the bottom of the data and return stacks, respectively.

Look at this reference implementation together with the one for catch and throw.


ruv [259] Should QUIT propagate exceptions? (Request for clarification, 2022-08-18 12:09:37)

After a recent ForthHub discussion, "Is QUIT expected to THROW?", the following question arises.

In the presence of the following facts:

  1. QUIT is not updated in the Exception word set
  2. QUIT empties the return stack, but doesn't empty the exception stack.
  3. THROW restores depth of the return stack when there is an exception frame on the exception stack.

How should the code :noname ['] quit catch . ." test-passed " ; execute work if the user passes abort from the user input device to the Forth text interpreter?

Does the Standard imply that the return stack is emptied, then its depth is restored, and its content is not garbled? Or that QUIT should not empty the return stack in the presence of an exception frame? And should the system eventually print "-1 test-passed"?

ruv

AntonErtl

Of course in Forth-94 and Forth-2012 QUIT does not empty the exception stack because there is no exception stack in CORE. Now, with the exception wordset being mandatory, emptying the exception stack can be specified.

I tried your example (but with "3 throw" instead of "abort" to make it easier to see what happens) on Gforth, iForth, SwiftForth, and VFX64, and none of them printed "test-passed"; all of them behaved just like when you do "3 throw" on the command line without going through your code; i.e., in all of them QUIT empties the exception stack. So we should specify that in the standard.

AntonErtl

Concerning the throw code -56 that you mention on github, I don't think it has anything to do with the Forth word QUIT. My guess is that it's for the Unix signal SIGQUIT (however, none of Gforth, iforth, lxf, SwiftForth, VFX converts it into a Forth exception; instead, SIGQUIT terminates them all).

ruv

because there is no exception stack in CORE.

Actually no, since the behavior of QUIT could be updated in the Exception word set. It looks like just an omission.


Concerning the throw code -56 that you mention on github, I don't think it has anything to do with the Forth word QUIT. My guess is that it's for the Unix signal SIGQUIT

  1. Then it should be named "SIGQUIT". It would be confusing to call it "QUIT" when you have a word named "QUIT".

  2. Then some hint to POSIX signals should be given.

  3. Then throw codes should be assigned to other signals too.

So the SIGQUIT explanation doesn't look quite convincing to me.

AntonErtl

Some other throw codes corresponding to signals:

signal   throw code
SIGINT   -28
SIGBUS   -23
SIGFPE   -55
SIGSEGV  -9

lmr [324] control stack clearing? (Request for clarification, 2023-12-24 03:10:07)

QUIT must clear the return stack. But is there any part of the spec that mentions clearing the control stack? I don't see any mention of that in QUIT, THROW, ABORT or :.

AntonErtl

The control-flow stack may be (and usually is) on the data stack, so the obvious intent is that QUIT preserves it, and that THROW and ABORT restore it to the depth it had at the CATCH that catches the ball. However, the exception wordset chapter mentions the control-flow stack only in the names of exceptions, not as something that THROW should restore. And I don't find any such requirement elsewhere. I think we should fix this in the document.

AntonErtl

BTW, if you are thinking about "system-execution types" like nest-sys and loop-sys, they are specified to be conceptually on the return stack. If you place them elsewhere, they have to follow the changes on the return stack. In particular, they have to be restored on THROW, CATCH and eliminated at QUIT.

Recently a bug was found that hit two Forth systems that independently decided to keep the loop-control parameters in registers, and failed to restore them on THROW.
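A minimal sketch of the kind of test that would expose such a bug, using only standard words. The names inner and loop-test are hypothetical and do not come from the systems mentioned above. The THROW happens inside a loop of its own, so a system that keeps loop-control parameters in registers has to restore the outer loop's parameters before CATCH returns.

```forth
\ Hypothetical test words; only DO LOOP I CATCH THROW are standard.
: inner ( -- )
   100 90 do
      i 95 = if 1 throw then    \ throw from inside a running loop
   loop ;

: loop-test ( -- n )
   0                            \ accumulator
   3 0 do
      ['] inner catch drop      \ swallow the throw code
      i +                       \ I must still be the outer loop's index
   loop ;
```

On a system that restores the loop parameters correctly, loop-test should leave 3 (0 + 1 + 2) on the data stack.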

lmr

that QUIT preserves it, and that THROW and ABORT restore it to the depth it had at the CATCH that catches the ball

... and that (to be pedantic), subsequently, THROW and ABORT clear the data stack — that is specified

lmr

control-flow stack may be (and usually is) on the data stack

Actually I've never quite understood the rules for manipulating the data stack while STATE = -1 (via [ ] or immediate words). Do not let pushed items cross control-flow primitive boundaries (IF AHEAD etc)? Is this clearly specified for "standard programs" anywhere?

AntonErtl

THROW and ABORT only clear the data stack if there is no exception frame on the exception stack.

Section "3.1.5.1 System-compilation types" says:

These data types denote zero or more items on the control-flow stack (see 3.2.3.2). The possible presence of such items on the data stack means that any items already there shall be unavailable to a program until the control-flow-stack items are consumed.

And the programmer also has to ensure that no data stack items are left on the data stack above a control-flow stack item when the program tries to access the control-flow stack item. Maybe that should be added to the standard.

These restrictions are valid in any state.

lmr

THROW and ABORT only clear the data stack if there is no exception frame on the exception stack.

an "exception frame" only appears due to CATCH, right? Now I'm confused:

  • in GForth and others, if a no-good user program calls undefined words, or if they call THROW / ABORT explicitly, then I don't see any of their junk with .S when QUIT returns to refill user input. The data stack is cleared after (un-handled?) exceptions.
  • on the other hand, the suggested reference implementations above do use CATCH. So, when is the data stack cleared after a program throws (explicitly or via an error)?

Section "3.1.5.1 System-compilation types" says [...] The possible presence of such items on the data stack means that any items already there shall be unavailable to a program until the control-flow-stack items are consumed ... no data stack items are left on the data stack above a control-flow stack item ...

3.1.5.1 only talks about X-sys structures, but THEN REPEAT etc specs also imply consuming the control stack. OK.

These restrictions are valid in any state

Hm, are there any primitives that need the control stack available in interpreter STATE? What are you alluding to?

AntonErtl

The specification of throw specifies the case when there is no exception frame from the program on the exception stack under the label "there is no exception frame on the exception stack", but systems are free to implement throw and catch in any way that ensures that programs see the specified behaviour. In case of throw an efficient implementation is to handle the "no exception frame" case with a system-supplied exception frame so that throw does not have to distinguish these cases. The suggested reference implementation of throw does not separate out the "no exception frame" case, and the suggested reference implementation of quit provides this system-supplied exception frame. What is missing from the reference implementation of quit is the emptying of the data stack that is specified in throw when there is no exception frame. I'll do another iteration on the reference implementation to fix this.
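To illustrate the other case, where a program-supplied exception frame exists, here is a small sketch in standard Forth (thrower and demo are hypothetical names). THROW does not empty the data stack in this case; it restores the depth recorded by CATCH and pushes the throw code.

```forth
\ Hypothetical demo words; only CATCH and THROW are standard.
: thrower ( -- )  10 20 30  42 throw ;   \ leaves junk, then throws 42

: demo ( -- 1 2 42 )
   1 2                  \ items already on the data stack
   ['] thrower catch    \ depth is restored to that at CATCH, then 42 is pushed
;
```

After demo, the junk values 10 20 30 are gone, while 1 and 2 (which thrower never touched) are still below the throw code 42.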

3.1.5.1 explicitly mentions colon-sys, do-sys, case-sys, of-sys, orig and dest as system-compilation types, and these live on the control-flow stack (usually the data stack). There are also system-execution types (nest-sys and loop-sys) that live on the return stack.

State just selects whether the text interpreter performs interpretation or compilation semantics, and it has no other meaning in the standard (and, e.g., in Gforth). Programs can switch the state with ] and [ (more typically postpone [) around calls to standard words, and if the program switches the state back before the text interpreter is reentered again, such switching should not make a difference except for the result of state @ and the few words (such as to) where it is specified that performing their interpretation or compilation semantics in a certain state is ambiguous.

In particular, there are no separate rules for manipulating the data stack for different states. The data stack has traditionally been used as control-flow stack because programs usually do not use the data stack for anything else while defining a word, so the restrictions imposed by using the data stack as control-flow stack are not particularly burdensome. But they exist, because programs are allowed to use the data stack while compiling words. And certain uses are not allowed in a standard program, e.g.

5 : foo literal ;

because the colon-sys blocks the access to the value 5. Instead, you have to write

: foo [ 5 ] literal ;

(Of course, for a literal value like 5, you would normally compile it directly, but let's assume that the value is the result of a call to a word or somesuch).
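As a sketch of the "result of a call" case, assuming nothing beyond CORE words (the name answer is hypothetical): the value is computed between [ and ], so it sits above the colon-sys, and LITERAL consumes it before ; needs the colon-sys again.

```forth
\ The value is computed at compile time, above the colon-sys,
\ and is consumed by LITERAL before ; accesses the colon-sys.
: answer ( -- n )  [ 6 7 * ] literal ;
```

Executing answer should leave 42 on the data stack.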

BTW, the term "primitive" is used for words implemented in a lower-level language (typically assembler) than Forth, not for all pre-defined words. And in most Forth systems, the words dealing with the control-flow stack are defined in Forth.

lmr

Amusingly, I've tried this in gforth:

: q quit ; immediate
: dd BEGIN q ;
.s

… and surely enough, there's the naked control stack for all to see. As a consequence of the fact that QUIT doesn't clear the control stack (or the data stack, as per the spec).


lmr [330] "make the user input device the input source" (Example, 2024-01-06 15:25:11)

Since QUIT is bound to reset the input source, what's a "clean" way to handle, say, command-line arguments (like -e or file arguments in Gforth)? Of course there are plenty of hacks to come up with.

AntonErtl

OS command-line processing is outside the scope of the standard, so the standard is silent on these issues.

Gforth uses evaluate to process the -e arguments; and actually -e is a shorthand for --evaluate.

Concerning quit, in a system with the exception wordset there is little reason for programs to call quit. quit is generally used for implementing the startup of a system, or the recovery from an uncaught throw, but one does not need to standardize it to use it as an internal system word. It seems like a no-longer useful legacy word to me.

Closed

lmr

there is little reason for programs to call quit

It can be (mis-?) used to avoid displaying the prompt, to exit from a deeply nested include / eval... or when writing an interpreter / DSL in Forth, one can call MY-LANG-WID 1 SET-ORDER QUIT as a final step — then let QUIT handle the new language. But I suppose there is also ABORT, THROW / CATCH etc

Gforth uses evaluate to process the -e arguments; and actually -e is a shorthand for --evaluate.

Maybe I spent too much time thinking about this. Conceptually the problem was that those "virtual" EVALUATE's (and "virtual" INCLUDE's for file CLI args) should also run under QUIT, and QUIT cannot be run multiple times since it doesn't relinquish control (except to the host OS); so how does QUIT know about them? My hack was to start QUIT with a "cooked" input source, above stdin in the logical input stack, that contains all those virtual calls (plus a call to ." blah blah" to print the greeting).

AntonErtl

You can avoid the prompt by using abort. Likewise, you can exit from a deeply nested include/eval with abort (but I have never had the need).

After sealing the search order with MY-LANG-WID 1 SET-ORDER, you do not need to quit, just return to the text interpreter normally (which also allows you to load the DSL and then continue with the program written in the DSL in an included file, without the user having to use some including word interactively after loading the DSL).

Conceptually the problem was that those "virtual" EVALUATE's (and "virtual" INCLUDE's for file CLI args) should also run under QUIT

Why? They don't run under quit in Gforth, but before it, if quit is run at all (in development Gforth it isn't). It seems to me like you are thinking that there is a requirement to use quit somewhere; there isn't. If the system is easier to implement without calling quit, don't call quit.

lmr

You can avoid the prompt by using abort. Likewise, you can exit from a deeply nested include/eval with abort (but I have never had the need).

That's actually ("quit(e)" ?) interesting and perhaps merits a separate discussion. So we could try

: "ABORT" S" ABORT" ;
: "QUIT"  S" QUIT" ;
: weird EVALUATE ." done" CR ;
: tst ['] weird CATCH ." caught: " . CR ;
"ABORT" tst
"QUIT" tst

The first tst should print "caught", -1, and leave [ -1 <garbage> ] on the stack.

I think that under the current letter of the standard, the second tst should hand control back to the user device while leaving an exception frame on the stack (I don't see anything in the spec about QUIT clearing the exception stack). The text interpreter then behaves normally, until the user causes a throw (for example by typing XXX; there is no other way to exit QUIT, although that would also be interesting). Then the CATCH would come into effect.

However the implementations I've tried seem to clear the exception stack upon QUIT, so maybe I'm misreading something, or maybe the standard isn't clear.

AntonErtl

As described in 9.3.3, the exception stack may be on the return stack. I am sure that the intent of the committee was that anything (including quit) that removes stuff from the return stack also removes the stuff from the exception stack and correspondingly sets the pointer to the current exception frame. This should be fixed in the document (see also r1148).

Open

lmr

OK. Maybe there should be an explicit paragraph or something about which stacks have return-stack semantics and which ones have data-stack semantics (i.e. not cleared on QUIT). As another point I've just realized that TRAVERSE-WORDLIST can't use the control stack to save a working copy of the list of name-tokens in the wid as a higher-level data structure (the control stack already presupposes structured items, so it's tempting to abuse it, depending on implementation details). So far I've seen:

  • control stack: data stack semantics
  • N>R stack: return stack semantics
  • exception stack: return stack semantics
  • input source stack: return stack semantics (?)

AntonErtl

Sounds right. The input source specification is explicitly mentioned in throw, which binds it to the exception stack and thus the return stack. It would be better to specify this in load, included etc., because that's where the implementations actually ensure this requirement (by catching, restoring, and rethrowing), and the current specification has led to confusion among some readers. Another possible fix to the document.

lmr

by catching, restoring, and rethrowing

I've been thinking about this. It seems that for both the data and the return stack, there is a need for a "structured" version (i.e. one that can somehow accommodate blobs / arrays / higher-level data structures, whatever you want to call them) as opposed to just integers.

For the data stack, the structured version is the control stack. What is the structured version of the return stack? I think it is the exception stack. This would make catching and re-throwing unnecessary, and abiding by the spec would become automatic.

Specifically, pushing a new N>R item, pushing a new input source, and pushing an exception frame would all push to the "enhanced" exception stack (call it estk).

There are two ways estk can be popped. THROW has to search from the top of estk downward to the most recent item that is an exception frame, discard everything above it, and reset the data stack / control stack / return stack pointers as specified in the exception frame thus located. The action to take when discarding an item from estk varies (for a N>R item it means freeing up a buffer; for an input source it means closing the file etc).

Now let's see what happens in non-throw situations. NR> needs to pop off a N>R item from estk, INCLUDE/EVALUATE need to pop off an input source from estk when REFILL returns 0, CATCH needs to pop off an exception frame from estk when the XT returns. If the item popped does not match the expected tag / type, we throw an error. The question is whether these actions could happen out of order in a standard program, and it seems to me they couldn't.

There are some other details to take care of (e.g. when popping off an input source from estk, the next most-recent one has to be searched for and set as stdin; when popping off a N>R item, the "dumb" return stack also has to be popped).

This seems tempting to implement … provided I haven't missed things.

AntonErtl

The usual implementation is to have control-flow stack items as a bunch of cells on the data stack and exception stack items as a bunch of cells on the return stack; the words that deal with them know the structure. If you instead use several stacks, you have to save all their depths on catch and restore these depths on throw.

The usual way to find the current exception frame is by having a (user) variable that points to the frame; when the exception frame is dropped, the pointer to the previous frame is taken from the dropped one and stored in the variable. No need for a search.

Concerning the input sources, what you have to restore when include etc. are left depends on the implementation. E.g., usually a file has to be closed. This does not happen automatically when throwing, unless you put the restoring action into throw, which would slow down throw and catch even for cases where the input source does not change. That's why the common approach is catch, restore, rethrow in included etc.

Concerning the order of popping, yes, they are popped in the reverse order of pushing and there cannot be any reorderings. That's because catch puts the stuff on the return/exception stack before executing the xt, and restores it afterwards. Likewise, included etc. put the input stream stuff on the return stack before interpreting the contents of the file and restore them afterwards; so this is all fully nested. The restrictions on n>r nr> also mean that no reordering can happen.

Your idea reminds me of my EuroForth 2022 paper, where I sketch having separate stacks to avoid subverting memory safety. The paper does not discuss these things in depth, though.

lmr

[ … ] control-flow stack items as a bunch of cells [ … ] use several stacks, you have to save all their depths on catch and restore these depths on throw

Of course — and also clear the ones with return-stack semantics on QUIT. I've also thought about adding a word to create custom stacks, but then the cleanup problem multiplies (I'll have to look a little closer at what GForth does).

when the exception frame is dropped, the pointer to the previous frame is taken from the dropped one

Ah, an implicit linked-list of exception frames; the same can be done for input-source frames.

This does not happen automatically when throwing, unless you put the restoring action into throw, which would slow down throw and catch even for cases where the input source does not change

This I'm not sure I understand. Items placed on this "enhanced" return stack estk absolutely need "destructors". If a N>R frame was placed on estk and gets discarded by THROW, the memory needs to be free()'d somewhere. If an input-source frame gets discarded by THROW, someone needs to close the file (except for EVALUATE) and restore stdin.

If so, what standard programs would get slowed down by moving the destructor invocations into THROW? For a program that doesn't leave behind any INCLUDE, EVALUATE or N>R frames between (1) CATCH (which pushes the exception frame onto estk) and (2) whatever causes the THROW, there would be no destructors to invoke (or actually at all) on estk — so no slowdown...?

Also, how does an "allocating" N>R ensure memory is free()d in a single "dumb" return stack implementation? By also catching and rethrowing...?

reminds me of my EuroForth 2022 paper

Thanks, that's quite relevant. Is it still just a "paper design"?

AntonErtl

Gforth's throw just checks whether the ball is non-zero, and if so, mainly restores a few virtual-machine registers. It does not need to check whether there is any destructor and invoke it; instead, destructors are invoked through the idiom

( xt ) catch destruct throw

which means that you have an exception frame for every destructor, but given that exception frames are cheap, that's fine.
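A sketch of this idiom in standard Forth. The words acquire, release, with-resource, bad-body, try-bad and the counter open-count are hypothetical stand-ins for a real resource such as an open file; this is not Gforth's actual code.

```forth
variable open-count  0 open-count !      \ hypothetical: number of live resources
: acquire ( -- )   1 open-count +! ;
: release ( -- )  -1 open-count +! ;     \ plays the role of the destructor

\ Run xt with the resource held; release it on normal and exceptional exit.
: with-resource ( i*x xt -- j*x )
   acquire  catch  release  throw ;      \ 0 THROW is a no-op, nonzero rethrows

: bad-body ( -- )  7 throw ;
: try-bad  ( -- )  ['] bad-body with-resource ;
```

Executing ' try-bad catch should leave the throw code 7, and open-count should be back at 0 afterwards, because release ran before the rethrow.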

For n>r you cannot use the idiom above, because n>r nr> are a pair of words rather than one word that wraps around some inner execution mechanism (whether it's the invocation of an xt or the text interpretation of some source code). But then I don't know of any n>r implementation that allocates.

Safe Forth is still a paper design and likely to stay one for at least several more years. So many ideas, so little time:-).


lmr [334] INTERPRET (Suggested reference implementation, 2024-01-30 22:30:13)

This is a simple way to fulfill the INTERPRET part of the reference implementation, plus ]][[ via a special STATE of -2. FWIW, it relies on as few predefined words as possible. It computes "actions" to undertake for each token/state, with a single EXECUTE in the main loop. I think it might be useful; there are other sections of the spec that are relevant but (1) I haven't seen a self-contained version of INTERPRET and (2) those sections are rather large, so I thought I'd leave this here. I've tested it in GForth (with a suitable prelude). Prerequisites:

  • compile(ish) words: : ; [ ] PARSE-NAME NAME>COMPILE NAME>INTERPRET COMPILE,
  • : FIND-NAME ( c-addr u -- nt ) ( find a name in search order ) ;
  • : LIT, ( n -- ) ( ... ) ; \ non-immediate, non-STATE-bound LITERAL
  • : PARSE-NUM?#-13 ( c-addr u -- n ) ( parse num or #-13 THROW ) ;
: LITERAL ( n -- ) LIT, ; IMMEDIATE RESTRICT

: FIND-NAME?#-13   ( c-addr u -- nt )      \ FIND-NAME or #-13 THROW
  FIND-NAME  DUP 0= #-13 AND THROW
;
: PARSE-NAME?#-16  ( "name" -- c-addr u )  \ PARSE-NAME or #-16 THROW
  PARSE-NAME DUP 0= #-16 AND THROW
;

: NAME'      ( "name"   -- nt  ) PARSE-NAME?#-16 FIND-NAME?#-13 ;
: '          ( "name"   -- ixt ) NAME' NAME>INTERPRET ;
: NAME>COMPILED ( nt    -- cxt ) NAME>COMPILE DROP ;
: COMPILE'   ( "name"   -- cxt ) NAME' NAME>COMPILED ;
: [NAME']    ( "name"   -- nt  ) NAME' LIT, ; IMMEDIATE RESTRICT
: [']        ( "name"   -- ixt ) ' LIT, ; IMMEDIATE RESTRICT
: [COMPILE'] ( "name"   -- xt  ) COMPILE' LIT, ; IMMEDIATE RESTRICT
: [COMPILE]  ( "name"   -- xt  ) COMPILE' COMPILE, ; IMMEDIATE RESTRICT

: SWAP&LIT&COMP  ( x xt -- )
  DUP ['] EXECUTE =
  IF DROP  \ optimization
  ELSE SWAP LIT,
  THEN COMPILE,
;
: POSTPONE-NAME ( nt -- )     NAME>COMPILE SWAP&LIT&COMP ;
: POSTPONE      ( "name" -- ) NAME' POSTPONE-NAME ; IMMEDIATE RESTRICT

: NAME>STATE?ACTION  ( nt -- x xt )
  STATE @ IF
    NAME>COMPILE
  ELSE
    NAME>INTERPRET DUP 0= #-14 AND THROW
    ['] EXECUTE
  THEN
;
: NUMBER>STATE?ACTION  ( n -- n xt )
  STATE @ IF ['] LIT,
  ELSE       ['] NOOP
  THEN
;

: ]] #-2  STATE ! ; IMMEDIATE RESTRICT
: [[ TRUE STATE ! ; IMMEDIATE RESTRICT
: _XT_[[ [COMPILE'] [[ ;  \ CONSTANT ...
: POSTPONE?ACTION  ( x xt -- x xt pxt )
  \ takes an <x xt> pair as produced by NAME>COMPILE
  \ adds SWAP&LIT&COMP if STATE = -2 and word <> [[
    OVER _XT_[[ <>
    STATE @ #-2 =
  AND IF ['] SWAP&LIT&COMP
  ELSE   ['] EXECUTE
  THEN
;

: STRING>STATE?ACTION  ( c-addr u -- x xt pxt )
  2DUP FIND-NAME
  ?DUP 0= IF
    PARSE-NUM?#-13 NUMBER>STATE?ACTION
  ELSE NIP NIP NAME>STATE?ACTION
  THEN
  POSTPONE?ACTION
;

: INTERPRET
  BEGIN
    PARSE-NAME DUP 0= IF 2DROP EXIT THEN
    STRING>STATE?ACTION EXECUTE
  AGAIN
;

ruv

The following test case

: x]] postpone ]] ; immediate
: to, [ 123 . x]] to [[ 456 . ] ;

is expected to print "123 456" during compilation, but it prints only "123" in your implementation.

Thus, state should not be used, but another variable should be used to hold "postponing" mode.

Anyway, I prefer to implement ]] as a parsing word.

lmr

: x]] postpone ]] ; immediate : to, [ 123 . x]] to [[ 456 . ] ; is expected to print "123 456" during compilation, but it prints only "123" in your implementation.

GForth and VFX print the same. Are you assuming a ]] ...[[ that saves & pops the previous STATE, as opposed to just switching STATE to -2 / -1? In any case this is shaky ground: ]] and [[ are not actually in the spec. Also, postponing TO is flagged as an ambiguous condition in this standard.

GForth only says "Postpone state ends when [[ is recognized" (not how it ends) and I can't completely follow what ]] does using SEE. In VFX SEE seems to show a simple MOV just like for ]. I made ]] compile-only specifically to skim over this issue, but as your example shows this is easily bypassed.

I also used a parsing definition before in my implementation, but Anton's evil smartness paper made me think about possibilities.

ruv

postponing TO is flagged as an ambiguous condition in this standard.

I think this ambiguous condition will be removed as being outdated, since postpone can be implemented via find-name and name>compile, or via classic find, so that it works correctly when applied to to, regardless of how to is implemented (see links below).

In my test case we can use postpone if — it does not matter.

GForth and VFX print the same. [...] In any case this is shaky ground: ]] and [[ are not actually in the spec.

Yes. So I'm not appealing to the spec, but to reasonable expectations.
Anyway, in the absence of a specification, it's better not to provide a reference implementation for these words.

I also used a parsing definition before in my implementation, but Anton's evil smartness paper made me think about possibilities.

Anton implemented ]] as a parsing word: https://theforth.net/package/compat/current-view/macros.fs

Regarding "state-smartness is evil": it is only evil when you try to optimize by implementing incorrect execution semantics, or when you rely on an incorrect "postpone".

See my posts "How POSTPONE should work" and "About POSTPONE semantics in edge cases".

lmr

not appealing to the spec, but to reasonable expectations.

Just so we're on the same page, you expect ]] to push the current STATE and [[ to pop it (e.g. from the control stack), which is not what GForth or VFX currently do? This is easy to implement, but it seems it's not current practice (i.e., you get just 123 from [ ... 123 . ]] IF/TO/whatever [[ 456 . ... in GForth and VFX)

in the absence of a specification, it's better not to

I agree, and POSTPONE?ACTION (and references to it) can be dropped. I thought it would be interesting to see what this "action-preparing + EXECUTE" implementation strategy could achieve.

Anton implemented ]] as a parsing word

That is of course possible, and it's basically what I've used before (except string COMPARE can be replaced with comparing name-tokens after FIND-NAME). But if you try SEE ]] in "current" GForth you will see a recognizer-stack-based implementation, and if you try it in VFX you will see something that is probably #-2 STATE !. I think both strategies re-use the text-interpreter PARSE-NAME WHILE loop and QUIT's REFILL WHILE loop (as opposed to having ]] loop over PARSE-NAME and maybe REFILL, depending on whether ]] is multi-line or not).

I was referring to the part of the "evil smartness paper" (5.4) that talks about having a separate postpone-xt in addition to interpret-xt and compile-xt (though I haven't actually done that).

lmr

: to, [ 123 . x]] to [[ 456 . ] ;

There's another, more serious divergence from "existing practice" here: with "current" ]] semantics, to, will compile a ] into the current definition. The word using to, will in turn switch STATE when called. You can SEE to, in GForth.

ruv

you expect ]] to push the current STATE and [[ to pop it (e.g. from the control stack),

This is an implementation detail. And I don't expect any implementation details. I expect the behavior shown in the test case.

Regarding implementations. Saving the current STATE and then restoring it will only work if ]] is a parsing word or if postpone is implemented incorrectly. Thus, if ]] is not a parsing word, it probably should not change STATE at all. It should change another variable, e.g. PSTATE.

Regarding the interface for ]]. Why do you prefer the variant without parsing to the variant with parsing?

I was referring to the part of the "evil smartness paper" (5.4) that talks about having a separate postpone-xt in addition to interpret-xt and compile-xt (though I haven't actually done that).

Ah, I see. It is also compatible with parsing ]].
But this approach is not compatible with existing practice of defining immediate parsing words.

I implemented a postponing mode within the framework of compilation. It does not require postpone-xt or compile-xt at all, and the usual immediate parsing words (including comments) work as expected. See c-state in my GitHub repo.


There's another, more serious divergence from "existing practice" here: with "current" ]] semantics, to, will compile a ] into the current definition. The word using to, will in turn switch STATE when called. You can SEE to, in GForth.

Then the word ]] in VFX and Gforth is broken in edge cases. There is no obvious formal reason why it should not work as expected.

A more formal test case:

: lit, postpone literal ;
: x]] postpone ]] ;
t{ : xif, [ 123 lit, x]] if [[ 456 lit, ] ; -> }t
t{ : foo [ xif, ( 123 orig 456 ) lit, ] then [ lit, ] ; -> }t
t{ 0 foo  1 foo -> 123  456 123 }t

This test case passes on my system.

lmr

I expect the behavior shown in the test case

OK so you expect : ... [ ]] [[ to return to STATE 0. Then let me ask you this: do you also expect : ... [ [ ] to return to STATE 0? If (presumably not), why?

Why do you prefer the variant without parsing to the variant with parsing

I'm trying to avoid a parallel INTERPRET (and parallel QUIT). Centralizing all parsing and all execution (as in my code above, by first stacking up XTs and then calling a single EXECUTE) would (vaguely speaking) seem to make it easier to hook up a debugger, profiler, optimizer, doc generator etc.

usual immediate parsing words (including comments) work as expected

I see that you prefer it this way, but it's debatable what (for instance) ]] s" s" [[ should do. Or if comments like ]] \ : [[ should act as comments. I guess I prefer the "postpone each PARSE-NAME'd word separately" behavior (along with "[[ ]] [ ] don't nest" behavior), because otherwise it becomes harder to reason about code.

ruv

so you expect : ... [ ]] [[ to return to STATE 0.

I think, interpretation semantics are not defined for ]], so this code is ambiguous. That's why I defined x]].

If they are defined to postpone each contained word, then yes, the state after [[ should be 0 in this case.

It's similar to quotations [: ... ;] — interpretation semantics are not defined for it, but the reasonable interpretation semantics for this construct are similar to :noname ... ;, and state is 0 after ;].
For example:

: foo [ [: ." test passed ;] execute ] ." foo end" ; \ it should print "test passed"
foo \ it should print "foo end"

do you also expect : ... [ [ ] to return to STATE 0? If (presumably not), why?

Interpretation semantics are not defined for [ (see 6.1.2500 [), but the expected behavior is to do nothing in interpretation state (and this word is usually implemented like this). Thus, the expected behavior is that state is non-zero after ].

: ...
  [   \ leave-compilation
    [ \ do nothing (due to interpretation state)
  ]  \ enter-compilation
  \ state is non-zero
;
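The expectation above can be checked with a small probe. This is only a sketch: it relies on the system treating [ as a no-op when already interpreting (which the standard does not guarantee, as noted), and the names seen-state, probe and demo are made up for the illustration.

```forth
variable seen-state   \ hypothetical helper: holds the STATE sampled by PROBE

\ Immediate word, so it runs while DEMO is still being compiled.
: probe ( -- ) state @ seen-state ! ; immediate

: demo
  [        \ leave compilation state
    [      \ assumed to do nothing: already interpreting
  ]        \ enter compilation state
  probe    \ records STATE at this point of the definition
;

seen-state @ 0<> .   \ a true flag means STATE was non-zero after ]
```

On a system where the inner [ simply stores zero into STATE, the recorded value should be non-zero.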

I guess I prefer the "postpone each PARSE-NAME'd word separately" behavior

Me too, for ]] ... [[. A more advanced construct should use another name.

MarcelHendrix

In iForth64 the following happens:

FORTH> : x]] postpone ]] ; immediate ok
FORTH> : to, [ 123 . x]] to [[ 456 . ] ; 123 456 ok
[2]FORTH> .s Data: 554736352 19651504 --- System: --- Float: --- ok

-marcel
