Digest #199 2022-09-08


[266] 2022-09-07 11:41:37 ruv wrote:

comment - Support several versions of the standard in parallel

At the moment, forth-standard.org contains only Forth-2012. When the next standard is published, it should be on forth-standard.org too, in parallel with Forth-2012, I think. What do you think?

If we want to provide several versions in parallel, the URI space should be developed accordingly, since it currently supports only one version. For example, in the URI https://forth-standard.org/standard/words the path segment standard refers to Forth-2012.

A website example that supports several versions of a document in parallel: https://www.postgresql.org/docs/

[267] 2022-09-07 21:15:16 AntonErtl wrote:

comment - Etymology of SYNONYM

Victor H. Yngve presented SYNONYM (with the syntax shown here) in Forth Dimensions, Vol. VII No. 3, pp. 11-13, September/October 1985. The presented implementation is terrible, however.


[r864] 2022-09-07 09:35:56 ruv replies:

testcase - Incorrect Test Pattern

It's wrong. Where have you tested this code?

[r865] 2022-09-07 09:45:56 ruv replies:

testcase - Bogus Test Case for SAVE-INPUT

Again, where have you tested the code?

RESTORE-INPUT changes the input source state, so after the next REFILL the parse area contains not " NeverExecuted" but " siv @" (which returns 0 this time).

[r866] 2022-09-07 21:40:41 AntonErtl replies:

requestClarification - Should QUIT propagate exceptions?

Some other throw codes corresponding to signals:

signal throw code

[r867] 2022-09-07 23:27:23 BerndPaysan replies:

proposal - minimalistic core API for recognizers


Bernd Paysan

Change Log:

  • 2020-09-06 initial version
  • 2020-09-08 taking ruv's approach and vocabulary at translators
  • 2020-09-08 replace the remaining rectypes with translators
  • 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion


The current recognizer proposal has received a number of criticisms. One is that its API is too big. So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions. The problem this proposal tries to solve is the same as in the original recognizer proposal; this is therefore not a full proposal, but a sketch of some changes to the original one.


Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that. Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

  • Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
  • Make the recognizer sequence executable with the same effect as a recognizer
  • Make the system's forth-recognizer a deferred word to allow plugging in new recognizer sequences

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is. If you have for some reason legacy code that looks like

: rec-nt ( addr u -- translator )
  here place  here find dup IF
      0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
  ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

: word-translator ( xt flag -- )
  0< state @ and  IF  compile,  ELSE  execute  THEN ;
: rec-word ( addr u -- ... translator )
  here place  here find dup IF  [']  word-translator
  ELSE  drop ['] notfound  THEN ;

Typical use



XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a recognized xt and additional data on the stack (no additional data for NOTFOUND):

REC-SOMETYPE ( addr len -- i*x recognized | NOTFOUND )

XY.3 Additional usage requirements

XY.3.1 Recognized

recognized: subtype of xt, and executes with the following stack effect:

RECOGNIZED-THING ( j*x i*x state -- k*x )

A recognized xt acts on the state passed to it on the stack:

  • 0 for interpretation
  • -1 for compilation
  • -2 for POSTPONE

i*x is the additional information provided by the recognizer; j*x and k*x are the stack inputs and outputs of interpreting, compiling or postponing the thing.
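As an illustration of the stack effect above, here is a minimal sketch of a recognized xt for single-cell numbers that dispatches on the state passed on the stack. The names RECOGNIZED-NUM and LIT, are assumptions chosen for this example (LIT, mirrors the helper used in the reference implementation); the sketch is not part of the proposal text.

```forth
\ Sketch only; RECOGNIZED-NUM is a hypothetical name.
: lit, ( n -- )  postpone literal ;

: recognized-num ( n state -- n | )
  case
      0  of  endof                      \ interpretation: leave n on the stack
      -1 of  lit,  endof                \ compilation: compile n as a literal
      -2 of  lit, postpone lit,  endof  \ POSTPONE: compile code that compiles n
  endcase ;                             \ unknown state: state is dropped, n is kept

\ Usage: pass the state explicitly.
42 0 recognized-num .                   \ interpretation, prints 42
: answer  [ 42 -1 recognized-num ] ;    \ compilation into ANSWER
```

Here i*x is the single cell n; for state 0 the output k*x is n itself, and for the other two states nothing is left on the stack.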

XY.6 Glossary

XY.6.1 Recognizer Words

FORTH-RECOGNIZE ( addr len -- i*x recognized-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the recognized xt and additional information if successful, or NOTFOUND if not.


NOTFOUND ( state -- ) RECOGNIZER

Performs -13 THROW. An ambiguous condition exists if the exception word set is not available.

XY.6.2 Recognizer Extension Words


SET-FORTH-RECOGNIZE ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.


FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using IS FORTH-RECOGNIZE.


GET-FORTH-RECOGNIZE ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.


FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using ACTION-OF FORTH-RECOGNIZE.

REC-SEQUENCE: ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries to recognize strings starting with xtn and proceeding towards xt1 until successful.

SET-REC-SEQUENCE ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

GET-REC-SEQUENCE ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.
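One possible way to build an executable recognizer sequence in standard Forth is sketched below. It is not the proposal's reference implementation: the names REC-STRING and TRY-RECOGNIZERS are hypothetical helpers, the sequence is fixed at definition time (no SET-REC-SEQUENCE support), and NOTFOUND is defined locally only to keep the example self-contained.

```forth
\ Sketch only; REC-STRING and TRY-RECOGNIZERS are hypothetical helpers.
: notfound ( -- )  -13 throw ;

2variable rec-string                    \ the lexeme currently being recognized

: try-recognizers ( addr u a-list n -- i*x recognized | notfound )
  2swap rec-string 2!                   \ remember the lexeme
  cells over + swap                     \ ( a-end a-list )
  ?do
      rec-string 2@  i @ execute        \ run one recognizer on the lexeme
      dup ['] notfound <> if  unloop exit  then
      drop                              \ discard NOTFOUND, try the next one
  1 cells +loop
  ['] notfound ;

: rec-sequence: ( xt1 .. xtn n "name" -- )
  create  dup ,  0 ?do  ,  loop         \ stores n, then xtn first, xt1 last
  does> ( addr u a-body -- i*x recognized | notfound )
    dup cell+ swap @  try-recognizers ;
```

Because the xts are stored with xtn first, walking the list in memory order tries xtn first and proceeds towards xt1, as described for REC-SEQUENCE: above. Keeping the lexeme in a single 2VARIABLE is a simplification: a recognizer that itself runs a sequence on a different string would disturb the remaining iterations of the outer sequence.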

RECOGNIZED: ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a recognized word under the name "name", which performs xt-int for state=0, xt-comp for state=-1 and xt-post for state=-2.
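The following sketch shows how RECOGNIZED: might be defined so that the created word takes its state from the stack, as XY.3.1 describes, and how it could be used for literals. This is an assumption about the intended behavior, not the proposal's own code; KEEP, POST-LIT, and RECOGNIZED-LITERAL are hypothetical names.

```forth
\ Sketch only: table-driven dispatch on the state passed on the stack.
: recognized: ( xt-int xt-comp xt-post "name" -- )
  create , , ,                          \ memory order: xt-post, xt-comp, xt-int
  does> ( i*x state a-body -- j*x )
    swap 2 + cells + @ execute ;        \ state -2, -1, 0 selects cell 0, 1, 2

: lit,      ( n -- )  postpone literal ;
: keep      ( n -- n ) ;                \ interpretation: n stays on the stack
: post-lit, ( n -- )  lit, postpone lit, ;

' keep ' lit, ' post-lit, recognized: recognized-literal

\ Usage:
7 0 recognized-literal .                \ interpretation, prints 7
: seven  [ 7 -1 recognized-literal ] ;  \ compiles 7 into SEVEN
```

The table lookup performs no range check, so a state outside -2 .. 0 would index outside the three stored cells; a production version would presumably validate the state first.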

Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system, that handles only words and single numbers without base prefix:

Defer forth-recognizer ( addr u -- i*x translator / notfound )
: interpret ( i*x -- j*x )
  BEGIN
      ?stack parse-name dup  WHILE
      forth-recognizer execute
  REPEAT  2drop ;

: lit,  ( n -- )  postpone literal ;
: notfound ( state -- ) -13 throw ;
: nt-translator ( nt -- )
  case  state @
      0  of  name>interpret execute  endof
      -1 of  name>compile execute  endof
      -2 of  name>compile swap lit, compile,  endof
      nip \ do nothing if state is unknown; possible error handling goes here
  endcase ;
: num-translator ( n -- )
  case  state @
      -1 of   lit,  endof
      -2 of   lit, postpone lit,  endof
  endcase ;

: rec-nt ( addr u -- nt nt-translator / notfound )
  forth-wordlist find-name-in dup IF  ['] nt-translator  ELSE  drop ['] notfound  THEN ;
: rec-num ( addr u -- n num-translator / notfound )
  0. 2swap >number 0= IF  2drop ['] num-translator  ELSE  2drop drop ['] notfound  THEN ;

: minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
  2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

' minimal-recognizer is forth-recognizer

Extensions reference implementation:

: set-forth-recognize ( xt -- )
  is forth-recognize ;
: get-forth-recognize ( -- xt )
  action-of forth-recognize ;
: recognized: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,
  does> state @ 2 + cells + @ execute ;

Stacks TBD.