Digest #199 2022-09-08
At the moment, forth-standard.org contains only Forth-2012. When the next standard is published, it should be on forth-standard too, in parallel with Forth-2012, I think. What do you think?
If we want to provide several versions in parallel, the URI space should be developed accordingly, since it currently supports only one version. For example, in the URI https://forth-standard.org/standard/words the fragment standard refers to Forth-2012.
A website example that supports several versions of a document in parallel: https://www.postgresql.org/docs/
Victor H. Yngve presented SYNONYM (with the syntax shown here) in Forth Dimensions, Vol. VII No. 3, p. 11-13, September/October 1985. The presented implementation is terrible, however.
It's wrong. Where have you tested this code?
Again, where have you tested the code?
RESTORE-INPUT changes the input source state, so after the next REFILL the parse area contains not "NeverExecuted" but "siv @" (which returns 0 this time).
Some other throw codes corresponding to signals:
- 2020-09-06 initial version
- 2020-09-08 taking ruv's approach and vocabulary at translators
- 2020-09-08 replace the remaining rectypes with translators
- 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
The current recognizer proposal has received a number of criticisms. One is that its API is too big. This proposal therefore tries to define a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions. The problem it tries to solve is the same as in the original recognizer proposal; it is therefore not a full proposal, but a sketch of some changes to the original.
Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that. Common extensions go to the RECOGNIZER EXT wordset.
Important changes to the original proposal:
- Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
- Make the recognizer sequence executable with the same effect as a recognizer
- Make the system's forth-recognizer a deferred word to allow plugging in new recognizer sequences
This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.
The core principle is still that the recognizer is not aware of state, and the returned translator is. If you have for some reason legacy code that looks like
: rec-nt ( addr u -- translator )
  here place here find dup IF
    0< state @ and IF compile, ELSE execute THEN ['] drop
  ELSE drop ['] notfound THEN ;
then you should factor the part starting with state @ out and return it as translator:
: word-translator ( xt flag -- )
  0< state @ and IF compile, ELSE execute THEN ;
: rec-word ( addr u -- ... translator )
  here place here find dup IF ['] word-translator
  ELSE drop ['] notfound THEN ;
XY. The optional Recognizer Wordset
A recognizer takes the string of a lexeme and returns a recognized xt and additional data on the stack (no additional data for NOTFOUND):
REC-SOMETYPE ( addr len -- i*x recognized | NOTFOUND )
XY.3 Additional usage requirements
recognized: subtype of xt, and executes with the following stack effect:
RECOGNIZED-THING ( j*x i*x state -- k*x )
A recognized xt acts on the state passed to it on the stack
- 0 for interpretation
- -1 for compilation
- -2 for POSTPONE
i*x is the additional information provided by the recognizer; j*x and k*x are the stack inputs and outputs of interpreting, compiling or postponing the thing.
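As an illustration of the stack effect above, here is a minimal sketch of a recognized xt for a single-cell number that dispatches on the state passed on the stack. The name translate-num is hypothetical and not part of the proposal; lit, is defined as in the reference implementation.

```forth
: lit, ( n -- ) postpone literal ;

\ Hypothetical recognized xt for a single-cell number, following
\ RECOGNIZED-THING ( j*x i*x state -- k*x ) with i*x = n.
: translate-num ( n state -- n | )
  case
    -1 of lit, endof                \ compile: append n as a literal
    -2 of lit, postpone lit, endof  \ postpone: compile code that compiles n later
    \ state 0 (interpretation): ENDCASE drops the state, n stays on the stack
  endcase ;

42 0 translate-num drop             \ interpretation: 42 was left on the stack
: answer [ 42 -1 translate-num ] ;  \ compilation: 42 becomes a literal in ANSWER
```

Because the state arrives as a parameter, the compile branch can be exercised between [ and ] without consulting STATE at all.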
XY.6.1 Recognizer Words
FORTH-RECOGNIZE ( addr len -- i*x recognized-xt | NOTFOUND-xt ) RECOGNIZER
Takes a string and tries to recognize it, returning the recognized xt and additional information if successful, or NOTFOUND if not.
NOTFOUND ( -- ) RECOGNIZER
-13 THROW. An ambiguous condition exists if the exception word set is not available.
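A small sketch of what this means for callers, assuming the exception word set is present: executing NOTFOUND under CATCH yields the throw code -13, and a recognizer result can be tested for failure by comparing xts. The helper recognized? is a hypothetical name used only for illustration.

```forth
: notfound ( -- ) -13 throw ;

\ Hypothetical helper: true if a recognizer returned anything but NOTFOUND.
: recognized? ( xt -- flag ) ['] notfound <> ;

' notfound catch drop   \ CATCH returned -13 here, the "undefined word" code
```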
XY.6.2 Recognizer Extension Words
SET-FORTH-RECOGNIZE ( xt -- ) RECOGNIZER EXT
Assign the recognizer xt to FORTH-RECOGNIZE.
FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using IS FORTH-RECOGNIZE.
GET-FORTH-RECOGNIZE ( -- xt ) RECOGNIZER EXT
Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.
FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the behavior instead of using ACTION-OF FORTH-RECOGNIZE.
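One use of this pair is to swap the recognizer temporarily and restore it afterwards. The following is a minimal sketch, assuming FORTH-RECOGNIZE is a deferred word as in the reference implementation; with-recognizer, rec-one, rec-two and probe are hypothetical names used only for this illustration.

```forth
defer forth-recognize ( addr u -- i*x xt )
: set-forth-recognize ( xt -- ) is forth-recognize ;
: get-forth-recognize ( -- xt ) action-of forth-recognize ;

\ Run xt-body with xt-rec installed, restoring the old recognizer
\ even if xt-body throws.
: with-recognizer ( i*x xt-rec xt-body -- j*x )
  get-forth-recognize >r
  swap set-forth-recognize
  catch
  r> set-forth-recognize
  throw ;

\ Dummy recognizers that only show which one is active.
: rec-one ( addr u -- 1 ) 2drop 1 ;
: rec-two ( addr u -- 2 ) 2drop 2 ;
: probe ( -- x ) s" x" forth-recognize ;

' rec-one set-forth-recognize
' rec-two ' probe with-recognizer drop   \ PROBE saw rec-two here
```

After with-recognizer returns, get-forth-recognize yields the xt of rec-one again.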
REC-SEQUENCE: ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT
Create a named recognizer sequence under the name "name", which, when executed, tries to recognize strings starting with xtn and proceeding towards xt1 until successful.
SET-REC-SEQUENCE ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT
Set the recognizer sequence of xt-seq to xt1 .. xtn.
GET-REC-SEQUENCE ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT
Obtain the recognizer sequence xt-seq as xt1 .. xtn n.
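The proposal text gives no reference implementation for REC-SEQUENCE:, so here is one possible sketch, assuming Forth-2012 locals ({: ... :}) are available. The helper try-recognizers and the demo words tr-a, tr-any, rec-a, rec-any and rec-demo are hypothetical names; SET-REC-SEQUENCE and GET-REC-SEQUENCE are not covered.

```forth
: notfound ( -- ) -13 throw ;

\ Try the n recognizers stored at a, in storage order; stop at the first success.
: try-recognizers {: addr u a n -- i*x xt :}
  n 0 ?do
    addr u  a i cells + @  execute
    dup ['] notfound <> if unloop exit then
    drop
  loop
  ['] notfound ;

: rec-sequence: ( xt1 .. xtn n "name" -- )
  create dup , 0 ?do , loop    \ layout: n, xtn, ..., xt1
  does> ( addr u body -- i*x translator | notfound )
    dup cell+ swap @ try-recognizers ;

\ Demo recognizers: rec-a accepts only "a", rec-any accepts everything.
: tr-a ;
: tr-any ;
: rec-a ( addr u -- 1 xt | notfound )
  s" a" compare 0= if 1 ['] tr-a else ['] notfound then ;
: rec-any ( addr u -- 9 xt ) 2drop 9 ['] tr-any ;

' rec-any ' rec-a 2 rec-sequence: rec-demo   \ rec-a (xtn) is tried first
```

Storing the xts with , in stack order puts xtn first in memory, so a plain ascending loop gives the "xtn towards xt1" search order described above.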
RECOGNIZED: ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT
Create a recognized word under the name "name", which performs xt-int for state=0, xt-comp for state=-1 and xt-post for state=-2.
This is a minimalistic core implementation for a recognizer-enabled system, that handles only words and single numbers without base prefix:
Defer forth-recognizer ( addr u -- i*x translator / notfound )

: interpret ( i*x -- j*x )
  BEGIN
    ?stack parse-name dup WHILE
    forth-recognizer execute
  REPEAT
  2drop ; \ discard the empty string left by parse-name

: lit, ( n -- ) postpone literal ;
: notfound ( state -- ) -13 throw ;

: nt-translator ( nt -- )
  case state @
    0  of name>interpret execute endof
    -1 of name>compile execute endof
    -2 of name>compile swap lit, compile, endof
    nip \ do nothing if state is unknown; possible error handling goes here
  endcase ;

: num-translator ( n -- )
  case state @
    -1 of lit, endof
    -2 of lit, postpone lit, endof
  endcase ;

: rec-nt ( addr u -- nt nt-translator / notfound )
  forth-wordlist find-name-in dup IF ['] nt-translator
  ELSE drop ['] notfound THEN ;

: rec-num ( addr u -- n num-translator / notfound )
  0. 2swap >number 0= IF 2drop ['] num-translator
  ELSE 2drop drop ['] notfound THEN ;

: minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
  2>r 2r@ rec-nt dup ['] notfound = IF drop 2r@ rec-num THEN 2rdrop ;

' minimal-recognizer is forth-recognizer
Extensions reference implementation:
: set-forth-recognize ( xt -- ) is forth-recognize ;
: get-forth-recognize ( -- xt ) action-of forth-recognize ;
: recognized: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,
  does> state @ 2 + cells + @ execute ;
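A possible usage sketch for RECOGNIZED:, repeating its definition so the example stands alone. It assumes that STATE holds exactly -1 while compiling (true for some systems, but the standard only guarantees non-zero); recognized-num, forty-two and answer are hypothetical names. The postpone action (state -2) cannot be reached without system support and is not exercised here.

```forth
: lit, ( n -- ) postpone literal ;

: recognized: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,
  does> state @ 2 + cells + @ execute ;

:noname ( n -- n ) ;                   \ interpret: leave n on the stack
' lit,                                 \ compile: append n as a literal
:noname ( n -- ) lit, postpone lit, ;  \ postpone: compile code that compiles n
recognized: recognized-num

42 recognized-num drop                 \ interpretation state: 42 was left on the stack

\ An immediate word runs while STATE is non-zero, so it reaches the compile action.
: forty-two ( -- ) 42 recognized-num ; immediate
: answer ( -- n ) forty-two ;          \ 42 is compiled into ANSWER as a literal
```

The three xts are stored in reverse order by , , , so that state + 2 indexes them directly: 0 selects the postpone action, 1 the compile action, 2 the interpret action.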