,---------------.
| Contributions |
`---------------´


,------------------------------------------
| 2022-09-08 07:51:27  phisheep  wrote:
| requestClarification - Seemingly contradictory ambiguous condition?
| see: https://forth-standard.org/standard/doc#contribution-268
`------------------------------------------
While trying to document my ambiguous conditions, I came across this gem above:

"access to a deferred word, a word defined by 6.2.1173 DEFER, which was not defined by 6.2.1173 DEFER;"

It doesn't seem to mean anything. Do you know what it is supposed to mean? 


,------------------------------------------
| 2022-11-19 09:25:07  PopovMP  wrote:
| comment - Describe Compile time and Run time behavior
| see: https://forth-standard.org/standard/core/CHAR#contribution-270
`------------------------------------------
I had trouble getting my .f files to work in popular distributions until I found that CHAR is not supported at compile time or run time.

## Compilation:

Compilation semantics for this word are undefined.

## Run-time:

Run-time semantics for this word are undefined.


,---------.
| Replies |
`---------´


,------------------------------------------
| 2022-09-09 11:47:13  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-876
`------------------------------------------
> ```
> : recognized: ( xt-interpret xt-compile xt-postpone "name" -- )
>   create , , ,
>   does> state @ 2 + cells + @ execute ;
> ```

In the reference implementation you keep the mode {0,-1,-2} in the `state` variable.
But it's problematic, regardless of how the mode is passed into token translators (directly via the stack, or indirectly using a dedicated method).

When interpretation state is entered with `[` (so `state` is set to `0`) and compilation state is then re-entered with `]`, the mode should be the same as it was before `[`. If it was `-2`, it should be set back to `-2`. But the information that the mode was `-2` has been lost. So another variable should be used to keep a flag for whether "POSTPONE" mode is active.

Actually, compilation/interpretation mode and "POSTPONE" mode are not mutually exclusive. They can be set independently of each other.

For example, the code:
```
: foo postpone bar [ postpone baz ] ;
```
can conceptually be defined quite clearly (see my [comment](https://forth-standard.org/standard/core/POSTPONE#reply-614)). In this fragment, "POSTPONE" mode is active in compilation state for the lexeme `bar`, and in interpretation state for `baz`.

So, if "POSTPONE" mode is employed, a different variable for it should be used for this reason too.
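
A minimal sketch of the two independent flags, assuming a separate variable next to the standard `STATE`. The names `postpone-mode` and `mode` are hypothetical and not part of any proposal; the point is only that `[` and `]` change `STATE` and leave the second flag alone.

```forth
\ Sketch only: postpone-mode and mode are hypothetical names.
variable postpone-mode          \ true while "POSTPONE" mode is active
false postpone-mode !

\ Report both flags; neither is derived from the other.
: mode ( -- postpone? compiling? )
  postpone-mode @ 0<>  state @ 0<> ;

true postpone-mode !            \ enter "POSTPONE" mode
] [
\ the line above toggled STATE twice; postpone-mode is untouched
postpone-mode @ .               \ prints -1
```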

-----

On the other hand I'm not convinced that we need "POSTPONE" mode at all.
Apart from implementing the word `postpone` itself, where and how can this mode be used? Even for the questionable construct `]] ... [[`, "POSTPONE" mode isn't needed.


,------------------------------------------
| 2022-09-09 11:59:51  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-877
`------------------------------------------
OK, the most convincing argument is that `STATE` can go away as a specified thing. You can use and combine system translators, and you can create table-driven translators, but `STATE` is an implementation detail.


,------------------------------------------
| 2022-09-09 15:19:08  ruv  replies:
| proposal - Directory experiemental proposal
| see: https://forth-standard.org/proposals/directory-experiemental-proposal#reply-878
`------------------------------------------
What was the rationale for [changing](https://forth-standard.org/proposals/directory-experiemental-proposal?showDiff#reply-59) the names from

`path-basename`, `path-dirname`, etc

to

`basename-path`, `dirname-path`, etc

?

It seems to me, these words are similar to `file-status`, `file-position`, etc.,  so the former names are better.


`normalize-path` is OK since it's in the form "{verb}-{noun}" and it modifies the input string.


,------------------------------------------
| 2022-09-09 21:32:21  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-879
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement

## Problem:

The current recognizer proposal has received a fair amount of criticism.  One point is that its API is too big.  This proposal therefore tries to define a very minimalistic API for a core recognizer and leaves fancier features to extensions.  The problem it tries to solve is the same as that of the original recognizer proposal; it is therefore not a full proposal but a sketch of changes to the original one.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-nt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-word ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.
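
As an illustration of the split between a state-unaware recognizer and a state-aware translator, here is a small self-contained sketch. All `demo-*` names are hypothetical, and the recognizer only handles unsigned digit strings in the current `BASE`; it is not the reference implementation.

```forth
\ Sketch only: all demo-* names are hypothetical.
: demo-notfound ( -- )  -13 throw ;

\ Translator: the only place that looks at STATE.
: demo-translate-num ( n -- n | )
  state @ if  postpone literal  then ;

\ Recognizer: converts the lexeme, knows nothing about STATE.
: demo-rec-num ( addr u -- n translator-xt | notfound-xt )
  0. 2swap >number nip 0= if
    drop ['] demo-translate-num       \ drop high cell of ud, keep n
  else
    2drop ['] demo-notfound           \ unconverted characters remain
  then ;

\ Interpreter loop over the rest of the input source.
: demo-interpret ( i*x -- j*x )
  begin  parse-name dup  while  demo-rec-num execute  repeat
  2drop ;

s" demo-interpret 10 20 12" evaluate   \ leaves 10 20 12 on the stack
+ + .                                   \ prints 42
```

In compilation state the same translator compiles the number as a literal instead of leaving it on the stack, so the loop itself never needs to test `STATE`.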

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a recognized xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x recognized | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

    THING-TRANSLATOR ( j*x i*x -- k*x )

A translator xt that performs or compiles the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, j*x and k*x are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  An ambiguous condition exists if the exception word set is not available.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using IS FORTH-RECOGNIZE.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using ACTION-OF FORTH-RECOGNIZE.  The old API has this function under the name FORTH-RECOGNIZER (as a value) and this name is reused.

**REC-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries the recognizers in turn on the string, starting with xtn and proceeding towards xt1, until one succeeds.

**SET-REC-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-REC-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

**TRANSLATE-INT** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in interpretation state

**TRANSLATE-COMP** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in compilation state

**TRANSLATE-POST** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in postpone state

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component you can use to construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component you can use to construct other translators.

## Reference implementation:

This is a minimalistic core implementation of a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  It only takes interpret and compile state into account and uses the STATE variable to distinguish between them.

    Defer forth-recognizer ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognizer execute
      REPEAT  2drop ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( -- ) -13 throw ;
    : translate-nt ( nt -- )
      case state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : translate-num ( n -- )
      case state @
          -1 of   lit,  endof
      endcase ;
    : translate-dnum ( d -- )
      \ example of a composite translator using existing translators
      >r translate-num r> translate-num ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognizer is forth-recognizer

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : get-forth-recognize ( -- xt )
      action-of forth-recognize ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    : translate-int ( translate-xt -- )  >body 2 cells + @ execute ;
    : translate-comp ( translate-xt -- )  >body cell+ @ execute ;
    : translate-post ( translate-xt -- )  >body @ execute ;

Stacks TBD, copy from Trute proposal.

## Testing

TBD


,------------------------------------------
| 2022-09-10 09:59:58  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-880
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation

## Problem:

The current recognizer proposal has received a fair amount of criticism.  One point is that its API is too big.  This proposal therefore tries to define a very minimalistic API for a core recognizer and leaves fancier features to extensions.  The problem it tries to solve is the same as that of the original recognizer proposal; it is therefore not a full proposal but a sketch of changes to the original one.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-nt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-word ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a recognized xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x recognized | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

    THING-TRANSLATOR ( j*x i*x -- k*x )

A translator xt that performs or compiles the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, j*x and k*x are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  An ambiguous condition exists if the exception word set is not available.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using IS FORTH-RECOGNIZE.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using ACTION-OF FORTH-RECOGNIZE.  The old API has this function under the name FORTH-RECOGNIZER (as a value) and this name is reused.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries the recognizers in turn on the string, starting with xtn and proceeding towards xt1, until one succeeds.

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

**TRANSLATE-INT** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in interpretation state

**TRANSLATE-COMP** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in compilation state

**TRANSLATE-POST** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in postpone state

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component you can use to construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component you can use to construct other translators.

## Reference implementation:

This is a minimalistic core implementation of a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  It only takes interpret and compile state into account and uses the STATE variable to distinguish between them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT  2drop ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( -- ) -13 throw ;
    : translate-nt ( nt -- )
      case state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : translate-num ( n -- )
      case state @
          -1 of   lit,  endof
      endcase ;
    : translate-dnum ( d -- )
      \ example of a composite translator using existing translators
      >r translate-num r> translate-num ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognizer is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : get-forth-recognize ( -- xt )
      action-of forth-recognize ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    : translate-int ( translate-xt -- )  >body 2 cells + @ execute ;
    : translate-comp ( translate-xt -- )  >body cell+ @ execute ;
    : translate-post ( translate-xt -- )  >body @ execute ;
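
To make the table dispatch concrete, here is a hedged usage sketch. The `demo-*` words are hypothetical copies of the definitions above (renamed so they do not clash with system words), and like the reference implementation they assume `STATE` only ever holds 0, -1 or -2.

```forth
\ Sketch only: demo-* words mirror the reference implementation above.
: demo-translate: ( xt-int xt-comp xt-post "name" -- )
  create , , ,            \ body holds xt-post, xt-comp, xt-int in that order
  does> ( i*x body -- j*x )  state @ 2 + cells + @ execute ;
: demo-translate-int  ( i*x translator-xt -- j*x )  >body 2 cells + @ execute ;
: demo-translate-comp ( i*x translator-xt -- j*x )  >body cell+ @ execute ;
: demo-translate-post ( i*x translator-xt -- j*x )  >body @ execute ;

\ Three distinguishable actions, so the dispatch can be observed.
: tag-int  ( x -- x 1 )  1 ;
: tag-comp ( x -- x 2 )  2 ;
: tag-post ( x -- x 3 )  3 ;
' tag-int ' tag-comp ' tag-post demo-translate: translate-tag

7 ' translate-tag demo-translate-int  . .   \ prints 1 7
7 ' translate-tag demo-translate-post . .   \ prints 3 7
```

Executing `translate-tag` directly while interpreting selects `tag-int`, because `STATE` is 0 and the index `state @ 2 +` then points at the third cell.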

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , ( empty stack ) CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;

### Recognizer sequences

    : recognize   ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      dup stack: dup 1+ cells negate here + set-stack
      DOES>  recognize ; 
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) >body get-stack ;

## Testing

TBD


,------------------------------------------
| 2022-09-10 15:03:52  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-881
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation
* 2022-09-10 Add use of extended words in reference implementation

## Problem:

The current recognizer proposal has received a fair amount of criticism.  One point is that its API is too big.  This proposal therefore tries to define a very minimalistic API for a core recognizer and leaves fancier features to extensions.  The problem it tries to solve is the same as that of the original recognizer proposal; it is therefore not a full proposal but a sketch of changes to the original one.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-xt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-xt ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translate-xt | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

*TRANSLATE-THING* ( j\*x i\*x -- k\*x )

A translator xt that performs or compiles the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, `j*x` and `k*x` are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  If the exception word set is not present, the system shall use a best effort approach to display an adequate error message.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using `IS FORTH-RECOGNIZE`.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using `ACTION-OF FORTH-RECOGNIZE`.  The old API has this function under the name FORTH-RECOGNIZER (as a value) and this name is reused.  Systems that want to continue to support the old API can support `TO FORTH-RECOGNIZER`, too.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries the recognizers in turn on the string, starting with xtn and proceeding towards xt1, until one succeeds.

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

**TRANSLATE-INT** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in interpretation state

**TRANSLATE-COMP** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in compilation state

**TRANSLATE-POST** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in postpone state

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component you can use to construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component you can use to construct other translators.

## Reference implementation:

This is a minimalistic core implementation of a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  It only takes interpret and compile state into account and uses the STATE variable to distinguish between them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT  2drop ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( -- ) -13 throw ;
    : translate-nt ( nt -- )
      case state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : translate-num ( n -- )
      case state @
          -1 of   lit,  endof
      endcase ;
    : translate-dnum ( d -- )
      \ example of a composite translator using existing translators
      >r translate-num r> translate-num ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognizer is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : get-forth-recognize ( -- xt )
      action-of forth-recognize ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    : translate-int ( translate-xt -- )  >body 2 cells + @ execute ;
    : translate-comp ( translate-xt -- )  >body cell+ @ execute ;
    : translate-post ( translate-xt -- )  >body @ execute ;

### Defining translators

Once you have `TRANSLATE:`, and the associated invocation tools, you shall define the translators using it:

    : lit, ( n -- ) postpone Literal ;
    ' noop ' lit, :noname lit, postpone lit, ; translate: translate-num
    :noname name>interpret execute ;
    :noname name>compile execute ;
    :noname lit, postpone name>compile postpone execute ; translate: translate-nt

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;
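
A short round-trip sketch of the stack layout used above: cell 0 holds the item count and the top item sits at the lowest address. The `demo-*` names and `numbers` are hypothetical. `BOUNDS` and `CELL` are not standard words, so this sketch substitutes `over + swap` and `1 cells`.

```forth
\ Sketch only: demo-* names are hypothetical; same layout as the library above.
: demo-stack: ( size "name" -- )
  create 0 , cells allot ;            \ count cell, then room for size items

: demo-set-stack ( item-n .. item-1 n stack-id -- )
  2dup !                              \ store the item count
  cell+ swap cells over + swap        \ -- item-n .. item-1 limit start
  ?do  i !  1 cells +loop ;           \ item-1 goes to the lowest address

: demo-get-stack ( stack-id -- item-n .. item-1 n )
  dup @ >r  r@ cells +  r@            \ start at the last item, count down
  begin  ?dup  while
    1- over @ rot 1 cells - rot
  repeat
  drop r> ;

4 demo-stack: numbers
10 20 30 3 numbers demo-set-stack
numbers demo-get-stack  . . . .       \ prints 3 30 20 10
```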

### Recognizer sequences

    : recognize ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    #10 Constant min-sequence#
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      min-sequence# stack: min-sequence# 1+ cells negate here + set-stack
      DOES>  recognize ;
    : ?defer@ ( xt1 -- xt2 )
      BEGIN dup is-defer? WHILE  defer@  REPEAT ;
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) ?defer@ >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) ?defer@ >body get-stack ;

Once you have recognizer sequences, you shall define

    ' rec-num ' rec-nt 2 recognizer-sequence: default-recognize
    ' default-recognize is forth-recognize

The recognizer stack looks surprisingly similar to the search order stack, and Gforth uses a recognizer stack to implement the search order.  In order to do so, you define wordlists in a way that a wid is an execution token which searches the wordlist and returns the appropriate translator.

    : find-name-in ( addr u wid -- nt / 0 )
      execute ['] notfound = IF  0  THEN ;
    ' root ' forth ' forth 3 recognizer-sequence: search-order
    : find-name ( addr u -- nt / 0 )
      ['] search-order find-name-in ;
    : get-order ( -- wid1 .. widn n )
      ['] search-order get-recognizer-sequence ;
    : set-order ( wid1 .. widn n -- )
      ['] search-order set-recognizer-sequence ;

## Testing

TBD


,------------------------------------------
| 2022-09-10 15:38:36  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-882
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation
* 2022-09-10 Add use of extended words in reference implementation
* 2022-09-10 Typo fixed

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as with the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-xt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-xt ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translate-xt | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

*TRANSLATE-THING* ( j\*x i\*x -- k\*x )

A translator xt performs or compiles the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, `j*x` and `k*x` are the stack inputs and outputs of interpreting/compiling or postponing the thing.
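
As an illustration of how a recognizer and a translator fit together, here is a minimal sketch of a recognizer for character literals of the form `'a'`. The name `rec-char` is hypothetical, and `notfound` and `translate-num` are simplified stand-ins (assuming a STATE-based system) so that the fragment loads on its own; a real system would use its own versions.

```forth
\ Stand-ins so the sketch loads on its own (assumptions, not proposal text)
: notfound ( -- )  -13 throw ;
: translate-num ( n -- n | )  \ leave n when interpreting, else compile a literal
  state @ if  postpone literal  then ;

\ rec-char is a hypothetical recognizer for three-character lexemes like 'a'
: rec-char ( addr u -- char translator-xt | notfound-xt )
  dup 3 = if                             \ exactly three characters?
    over c@ [char] ' = >r                \ the first one is a quote
    over 2 chars + c@ [char] ' = r> and  \ and so is the last one
    if  drop char+ c@  ['] translate-num  exit  then
  then
  2drop ['] notfound ;
```

Note that the recognizer itself never looks at `STATE`; only the translator it returns does.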

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  If the exception word set is not present, the system shall use a best effort approach to display an adequate error message.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using `IS FORTH-RECOGNIZE`.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using `ACTION-OF FORTH-RECOGNIZE`.  The old API has this function under the name FORTH-RECOGNIZER (as a value) and this name is reused.  Systems that want to continue to support the old API can support `TO FORTH-RECOGNIZER`, too.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries to recognize a string by calling its recognizers in turn, starting with xtn and proceeding towards xt1, until one succeeds.

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

**TRANSLATE-INT** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in interpretation state

**TRANSLATE-COMP** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in compilation state

**TRANSLATE-POST** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in postpone state

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component from which you can construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component from which you can construct other translators.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  This implementation takes only interpret and compile state into account, and uses the STATE variable to distinguish between them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT  2drop ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( -- ) -13 throw ;
    : translate-nt ( nt -- )
      case state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : translate-num ( n -- )
      case state @
          -1 of   lit,  endof
      endcase ;
    : translate-dnum ( d -- )
      \ example of a composite translator using existing translators
      >r translate-num r> translate-num ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognize ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognize is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : get-forth-recognize ( -- xt )
      action-of forth-recognize ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    : translate-int ( translate-xt -- )  >body 2 cells + @ execute ;
    : translate-comp ( translate-xt -- )  >body cell+ @ execute ;
    : translate-post ( translate-xt -- )  >body @ execute ;

### Defining translators

Once you have `TRANSLATE:`, and the associated invocation tools, you shall define the translators using it:

    : lit, ( n -- ) postpone Literal ;
    ' noop ' lit, :noname lit, postpone lit, ; translate: translate-num
    :noname name>interpret execute ;
    :noname name>compile execute ;
    :noname lit, postpone name>compile postpone execute ; translate: translate-nt

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;

### Recognizer sequences

    : recognize ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    #10 Constant min-sequence#
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      min-sequence# stack: min-sequence# 1+ cells negate here + set-stack
      DOES>  recognize ;
    : ?defer@ ( xt1 -- xt2 )
      BEGIN dup is-defer? WHILE  defer@  REPEAT ;
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) ?defer@ >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) ?defer@ >body get-stack ;

Once you have recognizer sequences, you shall define

    ' rec-num ' rec-nt 2 recognizer-sequence: default-recognize
    ' default-recognize is forth-recognize

The recognizer stack looks surprisingly similar to the search order stack, and Gforth uses a recognizer stack to implement the search order.  In order to do so, you define wordlists in a way that a wid is an execution token which searches the wordlist and returns the appropriate translator.

    : find-name-in ( addr u wid -- nt / 0 )
      execute ['] notfound = IF  0  THEN ;
    ' root ' forth ' forth 3 recognizer-sequence: search-order
    : find-name ( addr u -- nt / 0 )
      ['] search-order find-name-in ;
    : get-order ( -- wid1 .. widn n )
      ['] search-order get-recognizer-sequence ;
    : set-order ( wid1 .. widn n -- )
      ['] search-order set-recognizer-sequence ;

## Testing

TBD


,------------------------------------------
| 2022-09-12 14:45:06  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-883
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation
* 2022-09-10 Add use of extended words in reference implementation
* 2022-09-10 Typo fixed
* 2022-09-12 Fix for search order reference implementation

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as with the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-xt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-xt ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.
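
One such addition could look like the following sketch. `?stack` is the name the reference implementation uses for this check, but the definition here is an assumption: it relies on `DEPTH` returning a negative value after an underflow, which holds on many systems but is not guaranteed by the standard.

```forth
\ Assumed definition: throw "stack underflow" (-4) once the data stack
\ has underflowed; otherwise 0 THROW is a no-op and execution continues.
: ?stack ( -- )  depth 0< -4 and throw ;
```

Calling `?stack` at the top of each loop iteration reports an underflow caused by the previously interpreted word before the next lexeme is parsed.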

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translate-xt | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

*TRANSLATE-THING* ( j\*x i\*x -- k\*x )

A translator xt performs or compiles the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, `j*x` and `k*x` are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  If the exception word set is not present, the system shall use a best effort approach to display an adequate error message.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using `IS FORTH-RECOGNIZE`.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using `ACTION-OF FORTH-RECOGNIZE`.  The old API has this function under the name FORTH-RECOGNIZER (as a value) and this name is reused.  Systems that want to continue to support the old API can support `TO FORTH-RECOGNIZER`, too.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries to recognize a string by calling its recognizers in turn, starting with xtn and proceeding towards xt1, until one succeeds.

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

**TRANSLATE-INT** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in interpretation state

**TRANSLATE-COMP** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in compilation state

**TRANSLATE-POST** ( j*x i*x translator-xt -- k*x ) RECOGNIZER EXT

Translate as in postpone state
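
A possible use of `TRANSLATE:` together with `TRANSLATE-INT` is sketched below for double-cell numbers. The definitions of `translate:` and `translate-int` are copied from the extensions reference implementation (so the sketch inherits its assumption that `STATE` holds 0 or -1); `noop` is defined here because it is not a standard word, and `2lit,` is a hypothetical helper.

```forth
: noop ( -- ) ;                     \ not standard, so defined here
: translate: ( xt-int xt-comp xt-post "name" -- )
  create , , ,                      \ stored as: xt-post, xt-comp, xt-int
  does> state @ 2 + cells + @ execute ;
: translate-int ( j*x i*x translator-xt -- k*x )  >body 2 cells + @ execute ;

: 2lit, ( d -- )  postpone 2literal ;  \ hypothetical helper, Double word set
' noop                              \ interpreting: leave d on the stack
' 2lit,                             \ compiling: compile d as a literal
:noname 2lit, postpone 2lit, ;      \ postponing: compile code that compiles d
translate: translate-dnum
```

With this table-based definition, `' translate-dnum translate-int` selects the interpretation action directly, without consulting `STATE`.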

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component from which you can construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component from which you can construct other translators.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  This implementation takes only interpret and compile state into account, and uses the STATE variable to distinguish between them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT  2drop ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( -- ) -13 throw ;
    : translate-nt ( nt -- )
      case state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : translate-num ( n -- )
      case state @
          -1 of   lit,  endof
      endcase ;
    : translate-dnum ( d -- )
      \ example of a composite translator using existing translators
      >r translate-num r> translate-num ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognize ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognize is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : get-forth-recognize ( -- xt )
      action-of forth-recognize ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    : translate-int ( translate-xt -- )  >body 2 cells + @ execute ;
    : translate-comp ( translate-xt -- )  >body cell+ @ execute ;
    : translate-post ( translate-xt -- )  >body @ execute ;

### Defining translators

Once you have `TRANSLATE:`, and the associated invocation tools, you shall define the translators using it:

    : lit, ( n -- ) postpone Literal ;
    ' noop ' lit, :noname lit, postpone lit, ; translate: translate-num
    :noname name>interpret execute ;
    :noname name>compile execute ;
    :noname lit, postpone name>compile postpone execute ; translate: translate-nt

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;

### Recognizer sequences

    : recognize ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    #10 Constant min-sequence#
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      min-sequence# stack: min-sequence# 1+ cells negate here + set-stack
      DOES>  recognize ;
    : ?defer@ ( xt1 -- xt2 )
      BEGIN dup is-defer? WHILE  defer@  REPEAT ;
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) ?defer@ >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) ?defer@ >body get-stack ;

Once you have recognizer sequences, you shall define

    ' rec-num ' rec-nt 2 recognizer-sequence: default-recognize
    ' default-recognize is forth-recognize

The recognizer stack looks surprisingly similar to the search order stack, and Gforth uses a recognizer stack to implement the search order.  In order to do so, you define wordlists in a way that a wid is an execution token which searches the wordlist and returns the appropriate translator.

    : find-name-in ( addr u wid -- nt / 0 )
      execute ['] notfound = IF  0  THEN ;
    root-wordlist forth-wordlist dup 3 recognizer-sequence: search-order
    : find-name ( addr u -- nt / 0 )
      ['] search-order find-name-in ;
    : get-order ( -- wid1 .. widn n )
      ['] search-order get-recognizer-sequence ;
    : set-order ( wid1 .. widn n -- )
      ['] search-order set-recognizer-sequence ;

## Testing

TBD


,------------------------------------------
| 2022-09-13 17:45:08  AntonErtl  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-884
`------------------------------------------
It seems to me that, given the reference implementation

````
' translate-nt translate-int
' translate-num translate-int
' translate-dnum translate-int
````

does not work (nor does it work with `translate-comp` or `translate-post`).  Assuming you solve this, do you really want me to define, e.g.,

````
:noname ['] translate-nt translate-int ;
````

to get an xt equivalent to one of the xts that has been passed to `translate:`?

How do you implement POSTPONE (IIRC Matthias Trute has a reference implementation for that)?

What problem is solved by making all the translators state-smart?  The problem I see is that you can only access the individual actions by saving `state`, setting `state`, executing the translator, and restoring the state.  That's not a good design.
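
To make the concern concrete, the workaround would look roughly like the following sketch. `execute-in-state` is a hypothetical name, and storing into `STATE` directly is something a standard program is not allowed to do, so this only illustrates the pattern under the assumption that the system tolerates it.

```forth
\ Hypothetical helper: run xt with STATE temporarily set to new-state
: execute-in-state ( i*x xt new-state -- j*x )
  state @ >r  state !        \ save the old state, install the requested one
  catch                      \ run xt, intercepting any exception
  r> state !  throw ;        \ restore the state, then re-throw if needed
```

Every access to a single action of a state-smart translator would have to go through such a wrapper, e.g. `' translate-nt -1 execute-in-state` for the compilation action.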

The specification of `translate:` mentions a "current mode".  Where do I find out what a "mode" is?  This is non-standard terminology.


,------------------------------------------
| 2022-09-14 14:21:28  AntonErtl  replies:
| proposal - Tick and undefined execution semantics - 2
| see: https://forth-standard.org/proposals/tick-and-undefined-execution-semantics-2#reply-885
`------------------------------------------
In preparation for my version of this proposal I have tested the behaviour of existing systems.  I wrote a [test program](http://www.forth200x.org/drafts/tick3.fs)
that exercises the behaviour of various systems in contentious cases,
and ran it on the following systems:

* Gforth 0.7.9_20220901
* iForth 5.0.27
* lxf 1.6-982-823
* SwiftForth 3.11.0
* VFX 64 5.11 RC2

The results are:

Gforth         | iforth         | lxf            | sf          | VFX64          | Test
-------------- | -------------- | -------------- | ----------- | -----------    | -------------
compilation    | compilation    | `abort"`       | compilation | `-14 throw`    | `' if execute`
execution      | execution      | `abort"`       | behaviour1  | execution      | `' r@ execute`
execution      | execution      | `abort"`       | execution   | execution      | `' r@ compile,`
compilation    | execution      | `abort"`       | behaviour2  | `-14 throw`    | `' exit execute`
compilation    | `-14 throw`    | `abort"`       | execution   | `-22 throw`    | `' exit compile,`
execution      | execution      | execution      | execution   | execution      | `' compile, execute`
execution      | execution      | execution      | execution   | execution      | `' compile, compile,`
interpretation | interpretation | interpretation | state-smart | interpretation | `' s" execute`
interpretation | behaviour3     | interpretation | state-smart | behaviour4     | `' s" compile,`
interpretation | unclear        | interpretation | state-smart | state-smart    | `' to execute`
interpretation | unclear        | interpretation | state-smart | `-402 throw`   | `' to compile,`

The entries in the table have the following meanings:

entry          | meaning
-------------- | ----------------------------------------------------------
compilation    | compilation semantics (performed by `execute`)
execution      | execution semantics (performed by `execute`, appended by `compile,`)
interpretation | interpretation semantics (performed by `execute`, appended by `compile,`)
state-smart    | interpretation semantics in interpret state, compilation semantics in compile state

* behaviour1: like execution semantics, but apparently r@ accesses the
  wrong return stack item.

* behaviour2: my theory for the observed behaviour is that the
  execution semantics of exit is performed, but (like in behaviour1)
  applied to the wrong return stack item, resulting in a noop.
  
* behaviour3: looking at the output of SEE, it seems to be the
  interpretation semantics, but the actual behaviour does not quite
  fit.
  
* behaviour4: `compile,` performs (rather than appends) the
  interpretation semantics and leaves the xt on the stack.

* unclear: I have no explanation for the behaviour.


,------------------------------------------
| 2022-09-14 18:01:00  AntonErtl  replies:
| proposal - Specify that 0 THROW pops the 0
| see: https://forth-standard.org/proposals/specify-that-0-throw-pops-the-0#reply-886
`------------------------------------------
## Author:

M. Anton Ertl

## Change Log:

This version proposes a less minimal change, resulting after discussions at the [2022i meeting](https://forth-standard.org/proposals/agenda-forth-200x-interim-meeting-2020-02-18t14-00z#reply-787)

## Problem:

The specification of THROW does not say what happens on `0 THROW`.

## Solution

This is still a pretty minimal change.  A more substantial rework would move the input stream handling elsewhere in the standard (this has come up as a cause for confusion). Moreover, I think the organization of the various cases is suboptimal.  A better approach might be

* n=0
* n!=0
  * exception stack non-empty
  * exception stack empty
    * n=-1
    * n=-2
    * otherwise

If you want (or don't want) such changes, reply to this proposal.

## Typical use: (Optional)

```
... search-wordlist 0= -13 and throw execute
```
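
The same idiom appears wherever a flag is turned into a throw code with `AND`. A minimal sketch follows (`checked/` is a hypothetical name): when the divisor is non-zero, the code executes `0 THROW`, which has to remove the 0 and continue so that the division sees its operands.

```forth
\ checked/ is a hypothetical name; -10 is the standard throw code
\ for division by zero
: checked/ ( n1 n2 -- n3 )
  dup 0= -10 and throw   \ n2<>0: 0 THROW,  n2=0: -10 THROW
  / ;
```

If `0 THROW` left the 0 on the stack, `/` would divide by that 0 instead of by n2.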

## Proposal:

After the stack effect of 9.6.1.2275 THROW, insert

> If all bits of n are zero, remove n from the stack and continue execution after the THROW.


,------------------------------------------
| 2022-09-14 19:01:09  AntonErtl  replies:
| proposal - Pronounciations
| see: https://forth-standard.org/proposals/pronounciations#reply-887
`------------------------------------------
## Author:

Anton Ertl

## Change Log:

2022-09-14 Settle on 'Have "than"', but without `<#` `#>`.

## Problem:

Some pronunciations are misleading or inconsistent.  This proposal covers all the pronunciations I commented on recently.

## Solution:

Change the pronunciations.  Don't change `<#` `#>`, because they are not comparisons.

## Proposal:

Change the pronunciations as follows:

Word | current pronunciation | proposed pronunciation | rationale
-------- | ------------------------------ | --------------------------------- | -----------
`+x/string` |  plus-x-string | plus-x-slash-string | audio-to-writing correspondence
`-trailing-garbage` | minus-trailing-garbage | dash-trailing-garbage | consistency with `-trailing`
`f>s`    | F to S | f-to-s | consistency with `d>s`
`s>f`    | S to F | s-to-f | consistency with `s>d`
`x\string-` | x-string-minus | x-backslash-string-minus | audio-to-writing correspondence

### "than"

For many words containing ">" or "<" we have inconsistent pronunciations, some with "than" and some without.  Depending on which way we want to go, one of the following subsections should be accepted, and the other rejected:

#### Have "than"

Change the "less" into "less-than" in the pronunciations of `0< d0< du< f0< f<`

Change the "greater" into "greater-than" in the pronunciation of `0>`


,------------------------------------------
| 2022-09-14 19:03:46  AntonErtl  replies:
| proposal - Pronounciations
| see: https://forth-standard.org/proposals/pronounciations#reply-888
`------------------------------------------
## Author:

Anton Ertl

## Change Log:

2022-09-14 Settle on 'Have "than"', but without `<#` `#>`.

## Problem:

Some pronunciations are misleading or inconsistent.  This proposal covers all the pronunciations I commented on recently.

## Solution:

Change the pronunciations.  Don't change `<#` `#>`, because they are not comparisons.

## Proposal:

Change the pronunciations as follows:

Word | current pronunciation | proposed pronunciation | rationale
-------- | ------------------------------ | --------------------------------- | -----------
`+x/string` |  plus-x-string | plus-x-slash-string | audio-to-writing correspondence
`-trailing-garbage` | minus-trailing-garbage | dash-trailing-garbage | consistency with `-trailing`
`f>s`    | F to S | f-to-s | consistency with `d>s`
`s>f`    | S to F | s-to-f | consistency with `s>d`
`x\string-` | x-string-minus | x-backslash-string-minus | audio-to-writing correspondence

Change the "less" into "less-than" in the pronunciations of `0< d0< du< f0< f<`

Change the "greater" into "greater-than" in the pronunciation of `0>`


,------------------------------------------
| 2022-09-15 14:09:28  AntonErtl  replies:
| proposal - Call for Vote - Ambiguous condition in 16.3.3
| see: https://forth-standard.org/proposals/call-for-vote-ambiguous-condition-in-16-3-3#reply-889
`------------------------------------------
This proposal misses a discussion of existing practice.  In particular, there have been significant discussions in comp.lang.forth about the issue, and my impression is that this issue was not clear-cut (but I may be misremembering).

It's not clear to me that

> The ambiguous condition permits systems to use the CURRENT wordlist to find the most recent name. Given the variety and complexity of recent wordlist structures, this apparent simplicity is rarely found compared to just updating a pointer for each name.

supports the proposal.


,------------------------------------------
| 2022-09-15 14:56:44  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-890
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation
* 2022-09-10 Add use of extended words in reference implementation
* 2022-09-10 Typo fixed
* 2022-09-12 Fix for search order reference implementation
* 2022-09-15 Revert to Trute's table approach to call specific modes deliberately

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as that of the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-xt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-xt ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translate-xt | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

*TRANSLATE-THING* ( j\*x i\*x -- k\*x )

A translator xt interprets, compiles or postpones the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, `j*x` and `k*x` are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  If the exception word set is not present, the system shall use a best effort approach to display an adequate error message.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.
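
As a minimal sketch of how `TRANSLATE:` might be used: the block below copies `translate:` and `lit,` from the reference implementation, so it assumes that STATE holds 0 when interpreting and -1 when compiling, and the postpone slot is filled in but never reached. The names `keep-n`, `translate-demo-num`, `[7]` and `seven` are made up for this illustration.

```forth
\ Sketch only. TRANSLATE: and LIT, are copied from the reference
\ implementation (assumption: STATE is 0 or -1).
: lit, ( n -- )  postpone literal ;
: translate: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,
  does> state @ 2 + cells + @ execute ;

\ Hypothetical translator for a single-cell number n.
: keep-n ( n -- n ) ;                 \ interpret: leave n on the stack
' keep-n                              \ xt-int
' lit,                                \ xt-comp: compile n as a literal
:noname ( n -- ) lit, postpone lit, ; \ xt-post: compile code that compiles n
translate: translate-demo-num ( n -- n | )

\ Interpretation state: the number simply stays on the stack.
42 translate-demo-num .               \ prints 42

\ Compilation state: an immediate helper runs the translator while
\ SEVEN is being compiled, so 7 ends up as a literal inside SEVEN.
: [7] ( -- )  7 translate-demo-num ; immediate
: seven ( -- n )  [7] ;
seven .                               \ prints 7
```

The translator itself never inspects the number; it only selects one of the three xts, which is what keeps the recognizer that produced the number free of state.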

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using `IS FORTH-RECOGNIZE`.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using `ACTION-OF FORTH-RECOGNIZE`.  The old API has this function under the name FORTH-RECOGNIZER (as a value), and this name is reused.  Systems that want to continue to support the old API can support `TO FORTH-RECOGNIZER`, too.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries to recognize strings starting with xtn and proceeding towards xt1 until successful.

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**INTERPRET-TRANSLATOR** ( translate-xt -- xt-interpret ) RECOGNIZER EXT

Translate as in interpretation state.

**COMPILE-TRANSLATOR** ( translate-xt -- xt-compile ) RECOGNIZER EXT

Translate as in compilation state.

**POSTPONE-TRANSLATOR** ( translate-xt -- xt-postpone ) RECOGNIZER EXT

Translate as in postpone state.

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component that you can use to construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component that you can use to construct other translators.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  This implementation takes only interpret and compile state into account, and uses the STATE variable to distinguish them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( state -- ) -13 throw ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    :noname name>interpret execute ;
    :noname name>compile execute ;
    :noname name>compile swap literal compile, ;
    translate: translate-nt ( nt -- )
    ' noop
    ' lit,
    :noname lit, postpone lit, ;
    translate: translate-num ( n -- )

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognize ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognize is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : forth-recognizer ( -- xt )
      action-of forth-recognize ;
    : interpret-translator ( translate-xt -- xt-interpret ) >body 2 cells + @ ;
    : compile-translator ( translate-xt -- xt-compile ) >body 1 cells + @ ;
    : postpone-translator ( translate-xt -- xt-postpone ) >body 0 cells + @ ;

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;

### Recognizer sequences

    : recognize ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    #10 Constant min-sequence#
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      min-sequence# stack: min-sequence# 1+ cells negate here + set-stack
      DOES>  recognize ;
    : ?defer@ ( xt1 -- xt2 )
      BEGIN dup is-defer? WHILE  defer@  REPEAT ;
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) ?defer@ >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) ?defer@ >body get-stack ;

Once you have recognizer sequences, you shall define

    ' rec-num ' rec-nt 2 recognizer-sequence: default-recognize
    ' default-recognize is forth-recognize

The recognizer stack looks surprisingly similar to the search order stack, and Gforth uses a recognizer stack to implement the search order.  In order to do so, you define wordlists in a way that a wid is an execution token which searches the wordlist and returns the appropriate translator.

    : find-name-in ( addr u wid -- nt / 0 )
      execute ['] notfound = IF  0  THEN ;
    root-wordlist forth-wordlist dup 3 recognizer-sequence: search-order
    : find-name ( addr u -- nt / 0 )
      ['] search-order find-name-in ;
    : get-order ( -- wid1 .. widn n )
      ['] search-order get-recognizer-sequence ;
    : set-order ( wid1 .. widn n -- )
      ['] search-order set-recognizer-sequence ;

## Testing

TBD


,------------------------------------------
| 2022-09-15 15:02:45  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-891
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation
* 2022-09-10 Add use of extended words in reference implementation
* 2022-09-10 Typo fixed
* 2022-09-12 Fix for search order reference implementation
* 2022-09-15 Revert to Trute's table approach to call specific modes deliberately

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as that of the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-xt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-xt ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translate-xt | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

*TRANSLATE-THING* ( j\*x i\*x -- k\*x )

A translator xt interprets, compiles or postpones the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, `j*x` and `k*x` are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  If the exception word set is not present, the system shall use a best effort approach to display an adequate error message.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using `IS FORTH-RECOGNIZE`.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using `ACTION-OF FORTH-RECOGNIZE`.  The old API has this function under the name FORTH-RECOGNIZER (as a value), and this name is reused.  Systems that want to continue to support the old API can support `TO FORTH-RECOGNIZER`, too.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries to recognize strings starting with xtn and proceeding towards xt1 until successful.
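
To make the search order concrete, here is a hand-rolled stand-in for a two-element sequence. It is only a sketch: `rec-a`, `rec-b`, `tag-a`, `tag-b` and `seq` are hypothetical names, and the two recognizers return a bare tag xt with no additional data, which is why the stand-in can get away with `nip nip` instead of the bookkeeping a real sequence needs.

```forth
: notfound ( -- )  -13 throw ;

\ Two hypothetical recognizers that both accept the lexeme "x" and
\ return a distinct tag xt, so the order of the attempts becomes visible.
: tag-a ( -- ) ;
: tag-b ( -- ) ;
: rec-a ( addr u -- xt )  s" x" compare 0= if ['] tag-a else ['] notfound then ;
: rec-b ( addr u -- xt )  s" x" compare 0= if ['] tag-b else ['] notfound then ;

\ Hand-rolled stand-in for:  ' rec-a ' rec-b 2 recognizer-sequence: seq
\ xtn (rec-b, pushed last) is tried first, then xt1 (rec-a).
: seq ( addr u -- xt )
  2dup rec-b dup ['] notfound <> if  nip nip exit  then  drop
  rec-a ;
```

With these definitions, `s" x" seq` returns the xt of `tag-b`, because `rec-b` was pushed last and is therefore asked first; a lexeme that neither recognizer accepts yields the xt of `notfound`.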

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**INTERPRET-TRANSLATOR** ( translate-xt -- xt-interpret ) RECOGNIZER EXT

Get the interpreter xt from the translator.

**COMPILE-TRANSLATOR** ( translate-xt -- xt-compile ) RECOGNIZER EXT

Get the compiler xt from the translator.

**POSTPONE-TRANSLATOR** ( translate-xt -- xt-postpone ) RECOGNIZER EXT

Get the postpone xt from the translator.

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component that you can use to construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component that you can use to construct other translators.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  This implementation takes only interpret and compile state into account, and uses the STATE variable to distinguish them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( state -- ) -13 throw ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    :noname name>interpret execute ;
    :noname name>compile execute ;
    :noname name>compile swap literal compile, ;
    translate: translate-nt ( nt -- )
    ' noop
    ' lit,
    :noname lit, postpone lit, ;
    translate: translate-num ( n -- )

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognize ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognize is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : forth-recognizer ( -- xt )
      action-of forth-recognize ;
    : interpret-translator ( translate-xt -- xt-interpret ) >body 2 cells + @ ;
    : compile-translator ( translate-xt -- xt-compile ) >body 1 cells + @ ;
    : postpone-translator ( translate-xt -- xt-postpone ) >body 0 cells + @ ;

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;

### Recognizer sequences

    : recognize ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    #10 Constant min-sequence#
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      min-sequence# stack: min-sequence# 1+ cells negate here + set-stack
      DOES>  recognize ;
    : ?defer@ ( xt1 -- xt2 )
      BEGIN dup is-defer? WHILE  defer@  REPEAT ;
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) ?defer@ >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) ?defer@ >body get-stack ;

Once you have recognizer sequences, you shall define

    ' rec-num ' rec-nt 2 recognizer-sequence: default-recognize
    ' default-recognize is forth-recognize

The recognizer stack looks surprisingly similar to the search order stack, and Gforth uses a recognizer stack to implement the search order.  In order to do so, you define wordlists in a way that a wid is an execution token which searches the wordlist and returns the appropriate translator.

    : find-name-in ( addr u wid -- nt / 0 )
      execute ['] notfound = IF  0  THEN ;
    root-wordlist forth-wordlist dup 3 recognizer-sequence: search-order
    : find-name ( addr u -- nt / 0 )
      ['] search-order find-name-in ;
    : get-order ( -- wid1 .. widn n )
      ['] search-order get-recognizer-sequence ;
    : set-order ( wid1 .. widn n -- )
      ['] search-order set-recognizer-sequence ;

## Testing

TBD


,------------------------------------------
| 2022-09-15 15:10:01  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-892
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators
* 2022-09-08 add the requested extensions, integrate results of bikeshedding discussion
* 2022-09-08 adjust reference implementation to results of last bikeshedding discussion
* 2022-09-09 Take comments from ruv into account, remove specifying STATE involvement
* 2022-09-10 More complete reference implementation
* 2022-09-10 Add use of extended words in reference implementation
* 2022-09-10 Typo fixed
* 2022-09-12 Fix for search order reference implementation
* 2022-09-15 Revert to Trute's table approach to call specific modes deliberately

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as that of the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make sure the API is not mandating a special implementation

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-xt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : translate-xt ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-xt ( addr u -- ... translator )
      here place  here find dup IF  [']  translate-xt
      ELSE  drop ['] notfound  THEN ;

The standard interpreter loop should look like this:

    : interpret ( i*x -- j*x )
      BEGIN  parse-name dup  WHILE  forth-recognize execute  REPEAT
      2drop ;

with the usual additions to check e.g. for empty stacks and such.

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translate-xt | NOTFOUND )
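
A user-defined recognizer could look roughly like the following sketch, which accepts lexemes of the form `'c'` and returns the character together with a translator. Everything named here beyond the proposal's own words is hypothetical (`keep-c`, `translate-char`, `rec-char`); `notfound`, `lit,` and `translate:` are copied from the reference implementation, so the sketch inherits its assumption that STATE is 0 or -1.

```forth
\ Sketch only; helper definitions follow the reference implementation.
: notfound ( -- )  -13 throw ;
: lit, ( n -- )  postpone literal ;
: translate: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,
  does> state @ 2 + cells + @ execute ;

\ Hypothetical translator for a character value.
: keep-c ( c -- c ) ;
' keep-c  ' lit,  :noname ( c -- ) lit, postpone lit, ;
translate: translate-char ( c -- c | )

\ Hypothetical recognizer: succeeds only for a 3-character lexeme
\ whose first and last characters are single quotes.
: rec-char ( addr u -- c translate-char-xt | notfound-xt )
  dup 3 = if
    over c@ [char] ' = if
      over 2 chars + c@ [char] ' = if
        drop char+ c@ ['] translate-char exit
      then
    then
  then
  2drop ['] notfound ;

s" 'a'" rec-char execute emit        \ interpreting: prints a
```

Note that `rec-char` never looks at STATE; it only decides whether the lexeme matches and hands the character to the translator.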

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

*TRANSLATE-THING* ( j\*x i\*x -- k\*x )

A translator xt interprets, compiles or postpones the action of the thing according to the state the system is in.

`i*x` is the additional information provided by the recognizer, `j*x` and `k*x` are the stack inputs and outputs of interpreting/compiling or postponing the thing.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZE** ( addr len -- i*x translator-xt | NOTFOUND-xt ) RECOGNIZER

Takes a string and tries to recognize it, returning the translator xt and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW`.  If the exception word set is not present, the system shall use a best effort approach to display an adequate error message.

**TRANSLATE:** ( xt-int xt-comp xt-post "name" -- ) RECOGNIZER EXT

Create a translator word under the name "name".  This word is the only standard way to define a user-defined translator from scratch.

"name:" ( j*x i*x -- k*x ) performs xt-int in interpretation, xt-comp in compilation and xt-post in postpone state using a system-specific way to determine the current mode.

## XY.6.2 Recognizer Extension Words

**SET-FORTH-RECOGNIZE** ( xt -- ) RECOGNIZER EXT

Assign the recognizer xt to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to change the behavior instead of using `IS FORTH-RECOGNIZE`.

**FORTH-RECOGNIZER** ( -- xt ) RECOGNIZER EXT

Obtain the recognizer xt that is assigned to FORTH-RECOGNIZE.

Rationale:

FORTH-RECOGNIZE is likely a deferred word, but systems that implement it otherwise can use this word to obtain the current behavior instead of using `ACTION-OF FORTH-RECOGNIZE`.  The old API has this function under the name FORTH-RECOGNIZER (as a value), and this name is reused.  Systems that want to continue to support the old API can support `TO FORTH-RECOGNIZER`, too.

**RECOGNIZER-SEQUENCE:** ( xt1 .. xtn n "name" -- ) RECOGNIZER EXT

Create a named recognizer sequence under the name "name", which, when executed, tries to recognize strings starting with xtn and proceeding towards xt1 until successful.

**SET-RECOGNIZER-SEQUENCE** ( xt1 .. xtn n xt-seq -- ) RECOGNIZER EXT

Set the recognizer sequence of xt-seq to xt1 .. xtn.

**GET-RECOGNIZER-SEQUENCE** ( xt-seq -- xt1 .. xtn n ) RECOGNIZER EXT

Obtain the recognizer sequence xt-seq as xt1 .. xtn n.

**INTERPRET-TRANSLATOR** ( translate-xt -- xt-interpret ) RECOGNIZER EXT

Get the interpreter xt from the translator.

**COMPILE-TRANSLATOR** ( translate-xt -- xt-compile ) RECOGNIZER EXT

Get the compiler xt from the translator.

**POSTPONE-TRANSLATOR** ( translate-xt -- xt-postpone ) RECOGNIZER EXT

Get the postpone xt from the translator.

**TRANSLATE-NT** ( j*x nt -- k*x ) RECOGNIZER EXT

Translates a name token; a system component that you can use to construct other translators.

**TRANSLATE-NUM** ( j*x x -- k*x ) RECOGNIZER EXT

Translates a number; a system component that you can use to construct other translators.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system that handles only words and single-cell numbers without base prefix.  This implementation takes only interpret and compile state into account, and uses the STATE variable to distinguish them.

    Defer forth-recognize ( addr u -- i*x translator-xt / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognize execute
      REPEAT ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( state -- ) -13 throw ;
    : translate: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
    :noname name>interpret execute ;
    :noname name>compile execute ;
    :noname name>compile swap lit, compile, ;
    translate: translate-nt ( nt -- )
    ' noop
    ' lit,
    :noname lit, postpone lit, ;
    translate: translate-num ( n -- )

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] translate-nt  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] translate-num  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognize ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognize is forth-recognize

## Extensions reference implementation:

    : set-forth-recognize ( xt -- )
      is forth-recognize ;
    : forth-recognizer ( -- xt )
      action-of forth-recognize ;
    : interpret-translator ( translate-xt -- xt-interpret ) >body 2 cells + @ ;
    : compile-translator ( translate-xt -- xt-compile ) >body 1 cells + @ ;
    : postpone-translator ( translate-xt -- xt-postpone ) >body 0 cells + @ ;

### Stack library

    : STACK: ( size "name" -- )
      CREATE 0 , CELLS ALLOT ;

    : SET-STACK ( item-n .. item-1 n stack-id -- )
      2DUP ! CELL+ SWAP CELLS BOUNDS
      ?DO I ! CELL +LOOP ;

    : GET-STACK ( stack-id -- item-n .. item-1 n )
      DUP @ >R R@ CELLS + R@ BEGIN
        ?DUP
      WHILE
        1- OVER @ ROT CELL - ROT
      REPEAT
      DROP R> ;

### Recognizer sequences

    : recognize ( addr len rec-seq-id -- i*x translator-xt | NOTFOUND )
      DUP >R @
      BEGIN
        DUP
      WHILE
        DUP CELLS R@ + @
        2OVER 2>R SWAP 1- >R
        EXECUTE DUP ['] NOTFOUND <> IF
          2R> 2DROP 2R> 2DROP EXIT
        THEN
        DROP R> 2R> ROT
      REPEAT
      DROP 2DROP R> DROP ['] NOTFOUND
    ;
    #10 Constant min-sequence#
    : recognizer-sequence: ( rec1 .. recn n "name" -- )
      min-sequence# stack: min-sequence# 1+ cells negate here + set-stack
      DOES>  recognize ;
    : ?defer@ ( xt1 -- xt2 )
      BEGIN dup is-defer? WHILE  defer@  REPEAT ;
    : set-recognizer-sequence ( rec1 .. recn n rec-seq-xt -- ) ?defer@ >body set-stack ;
    : get-recognizer-sequence ( rec-seq-xt -- rec1 .. recn n ) ?defer@ >body get-stack ;

Once you have recognizer sequences, you shall define

    ' rec-num ' rec-nt 2 recognizer-sequence: default-recognize
    ' default-recognize is forth-recognize

The recognizer stack looks surprisingly similar to the search order stack, and Gforth uses a recognizer stack to implement the search order.  In order to do so, you define wordlists in a way that a wid is an execution token which searches the wordlist and returns the appropriate translator.

    : find-name-in ( addr u wid -- nt / 0 )
      execute ['] notfound = IF  0  THEN ;
    root-wordlist forth-wordlist dup 3 recognizer-sequence: search-order
    : find-name ( addr u -- nt / 0 )
      ['] search-order find-name-in ;
    : get-order ( -- wid1 .. widn n )
      ['] search-order get-recognizer-sequence ;
    : set-order ( wid1 .. widn n -- )
      ['] search-order set-recognizer-sequence ;

## Testing

TBD


,------------------------------------------
| 2022-09-15 17:28:36  AntonErtl  replies:
| proposal - EMIT and non-ASCII values
| see: https://forth-standard.org/proposals/emit-and-non-ascii-values#reply-893
`------------------------------------------
## Author:

Anton Ertl

## Change Log:

2021-04-03 Original proposal
2022-09-15 Better wording (also includes systems with address units >8 bits)

## Problem:

The first ideas for the xchar wordset had EMIT behave like (current) XEMIT.  Then Stephen Pelc pointed out that EMIT is used in a number of programs for dealing with raw bytes, so we introduced XEMIT for dealing with extended characters.  But the wording and stack effect of EMIT suggest that EMIT should deal with (possibly extended) characters rather than raw bytes.  This is at odds with a number of implementations, and there is hardly any reason to keep both EMIT and XEMIT.

## Solution:

Define EMIT to deal with uninterpreted characters.  Concerning systems with characters=address units larger than bytes, I would like to hear back from them if they need any more specific definition than what is proposed.

I leave a corresponding proposal for KEY to interested parties.

## Typical use: (Optional)

    $c3 emit $a4 emit \ outputs ä on a UTF-8 system

## Proposal:

Change the definition of EMIT into:

> EMIT ( char -- )
>
> Send char to the user output device without interpreting it.

Add a reference to "18.6.1.2488.10 XEMIT" to the "See:" section.

Rationale:

> EMIT supports low-level communication of arbitrary contents, not limited to specific encodings; it corresponds to TYPEing one char/byte.  To print multi-byte extended characters, the straightforward way is to use TYPE or XEMIT, but you can also print the individual bytes with multiple EMITs.

## Reference implementation:

```
create emit-buf 1 allot

: emit ( char -- )
  emit-buf c! emit-buf 1 type ;
```
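
The rationale's statement that EMIT corresponds to TYPEing one char can also be read in the other direction. The following is a minimal sketch, assuming the proposed semantics and a UTF-8 output device; `type-by-emit` and `a-umlaut` are hypothetical names that do not appear in the proposal.

```
\ Sketch: TYPE expressed through EMIT, one uninterpreted char at a time.
\ type-by-emit and a-umlaut are hypothetical names for illustration only.
: type-by-emit ( c-addr u -- )
  0 ?DO COUNT EMIT LOOP DROP ;   \ COUNT fetches one char and advances c-addr

\ The two bytes of the UTF-8 encoding of U+00E4.
CREATE a-umlaut  $c3 C,  $a4 C,

a-umlaut 2 type-by-emit   \ sends the same two bytes as: a-umlaut 2 TYPE
```

On a system where EMIT interprets its argument as an extended character, the last line would instead produce two separate characters, which is the difference the proposal is about.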

## Existing practice

Gforth, SwiftForth, and VFX implement EMIT as dealing with raw bytes (tested with the "typical use" above), but Peter Fälth's system implements EMIT as an alias of XEMIT, and iForth prints two funny characters.  It is unclear if there are any existing programs affected by the proposed change.

## Testing:

This cannot be tested from a standard program, because there is no way to inspect the output of EMIT.


,------------------------------------------
| 2022-09-15 18:56:26  AntonErtl  replies:
| proposal - An alternative to the RECOGNIZER proposal
| see: https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-894
`------------------------------------------
Retired because it is no longer championed.

Superseded by [minimalistic core API for recognizers](https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-892).


,------------------------------------------
| 2022-09-15 19:06:43  GeraldWodni  replies:
| proposal - Call for Vote - Ambiguous condition in 16.3.3
| see: https://forth-standard.org/proposals/call-for-vote-ambiguous-condition-in-16-3-3#reply-895
`------------------------------------------
This proposal has not reached consensus (inconclusive vote) and needs to be reworked before it can be used as a real proposal.


,------------------------------------------
| 2022-09-15 19:12:39  GeraldWodni  replies:
| proposal - XML Forth Standard - migration from LaTeX to DocBook
| see: https://forth-standard.org/proposals/xml-forth-standard-migration-from-latex-to-docbook#reply-896
`------------------------------------------
This proposal will be retired as no immediate action is required.

It should however serve as a template for a future editor who wants to migrate to XML, so they do not need to start from scratch.


,------------------------------------------
| 2022-09-15 19:21:07  BerndPaysan  replies:
| proposal - Recognizer
| see: https://forth-standard.org/proposals/recognizer#reply-897
`------------------------------------------
# Forth Recognizer -- Request For Discussion

* Author: Matthias Trute
* Version: 4
* Date: 2 August 2018
* Status: Retired (Committee Supported Proposal)

Superseded by [Minimalistic core API for recognizers](https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-892)

## Change history

1. 2014-10-03 Version 1 - initial version.
2. 2015-05-17 Version 2 - extend rationale, added ' and [']
3. 2015-12-01 Version 3 - separate use cases, minor changes for nested recognizer stacks. New `POSTPONE` action.
4. 2018-07-24 Version 4 - Clarifications, Fixing typos, added test cases

## Change history, details

1. 2016-09-18 Added more test cases
1. 2016-09-25 Clarify that `>IN` is unchanged for a `REC-FAIL` (`RECTYPE-NULL`) result.
1. 2016-10-21 simpler reference implementation
1. 2016-11-05 first attempt to rename keywords and concept names
1. 2017-05-15 discussion of `LOCATE`
1. 2017-08-08 move example recognizers to discussion/rationale section.
1. 2017-09-12 renamed keywords in XY.6.1 as suggested by the Forth 200x committee
1. 2017-12-06 changed wording from "recognizer stack" to "recognizer sequence".
1. 2017-12-10 created Recognizer EXT section with recognizer sequence management words.
1. 2018-04-09 expanded EXT section with RECTYPE* words
1. 2018-05-11 add comments about `recognizable?`
1. 2018-07-23 finalized
1. 2018-07-24 small bugfixes
1. 2018-08-02 split document into proposal and comments

# Problem

The Forth compiler can be extended easily. The Forth
interpreter however has a fixed set of capabilities as
outlined in section 3.4 of the standard text: Words from
the dictionary and some number formats.

It's not possible to use the Forth text interpreter
in an application or system extension context. Most interpreters in
existing systems use a number of hooks to extend the interpreter.
That makes it possible to use a loadable library to
implement new data types to be handled like the built-in
ones. An example is floating point numbers. They
have their own parsing and data handling words including
a stack of their own.

Furthermore applications need to use system provided and system specific
words or have to re-invent the wheel to get numbers with a sign or
hex numbers with the $ prefix. The building blocks (`FIND`, `COMPILE,`,
`>NUMBER` etc) are available but there is a gap between them and what
the Forth interpreter already does.

To actually handle data in the Forth context, the
processing actions need to be `STATE` aware. It
would be nice if the Forth text interpreter,
that maintains `STATE`, is able to do the data
processing without exposing `STATE` to the data
handling methods. These different methods need to
be registered somehow.

# Solution

The monolithic design of the Forth interpreter is factored into
three major blocks: First the interpreter. It maintains `STATE`
and organizes the work. Second the actual data parsing. It is
called from the interpreter and analyses strings (sub-strings
of `SOURCE`) if they match the criteria for a certain data
type. These parsing words are grouped to achieve an
order of invocation. The result of the parsing words is handed
over to the interpreter with data specific handling methods.
There are three different methods for each data type depending
on `STATE` and to `POSTPONE` the data.

The combination of a parsing word and the set of data handling words
to deal with the data is called a recognizer. There is no strict 1:1
relation between the parsing words and the data handling sets. A data
handling set for e.g. single cell numbers can be used by different
parsing words.

Whenever the Forth text interpreter is mentioned, the standard
words `EVALUATE` (CORE), `'` (tick, CORE), `INCLUDE-FILE`
(FILE), `INCLUDED` (FILE), `LOAD` (BLOCK) and `THRU` (BLOCK)
are expected to act likewise. This proposal does not change
these words, but provides the tools to do so. As long as the
standard feature set is used, a complete replacement with
recognizers is possible.

This proposal is about the building blocks.

# Proposal

## XY. The optional Recognizer word set

### XY.1 Introduction

The recognizer concept consists of two elements: parsing words,
and the data type information they return. The data type information
identifies the parsed data and provides methods to perform the
various semantics of the data: interpret, compile and postpone.
A parsing word can return different data type information. A
particular data type information can be used by different parsing words.

A system provided data type information is called `RECTYPE-NULL`.
It is used if no other one is applicable. This token is
associated with the system error actions if used in step
e) of the text interpreter (see Appendix). It is used to
achieve the action d) of the section 3.4 text interpreter.

A recognizing word within the recognizer concept has the stack effect

```
REC-SOMETYPE ( addr len -- i*x RECTYPE-SOMETYPE | RECTYPE-NULL )
```

This recognizing word must not change the string. When it is called
from the interpreter, it may access `SOURCE` and, if applicable,
even change `>IN`. If `>IN` is not used, any string may serve
as input, otherwise "addr/len" is assumed to be a substring of the
buffer `SOURCE`.

"i*x" is the result of the recognizing action of the string "addr/len".
`RECTYPE-SOMETYPE` is the data type id that the interpreter uses
to execute the interpret, compile or postpone actions for the data `i*x`.

All three actions are called with the "i*x" data as left from the
recognizing word and are generally expected to consume it. They can
have additional stack effects, depending on what
`RECTYPE-SOMETYPE-METHOD` actually does.

```
RECTYPE-SOMETYPE-METHOD ( ... i*x -- j*y )
```

The data "i*x" doesn't have to be on the data stack, it
can be at different places, if applicable. E.g. floating
point numbers have a stack of their own. In this case,
the data stack contains the `RECTYPE-SOMETYPE` information only.


### XY.2 Additional terms and notations

**Data type id**
A cell sized number. It identifies the data type and a method set to perform
the data processing in the text interpreter. The actual numeric value is
system specific.

**Recognizer**
A string parsing word that returns a data type id together
with the parsed data if successful. The string parsing
word is assumed to run within the Forth interpreter and
can access `SOURCE` and `>IN`.

**Recognizer Sequence**
An ordered set of recognizers. It is identified with
a cell sized numeric id.

### XY.3 Additional usage requirements

#### XY.3.1 Data type id

A data type id is a single cell value that
identifies a certain data type. Append
the following table to table 3.1

<table border="1" class="docutils">
<colgroup>
<col width="20%" />
<col width="43%" />
<col width="38%" />
</colgroup>
<tbody valign="top">
<tr><td>Symbol</td>
<td>Data type</td>
<td>Size on Stack</td>
</tr>
<tr><td>dt</td>
<td>data type id</td>
<td>1 cell</td>
</tr>
</tbody>
</table>

### XY.4 Additional documentation requirements ###

#### XY.4.1 System documentation ####

##### XY.4.1.1 Implementation-defined options #####

No additional options.

##### XY.4.1.2 Ambiguous conditions #####

* Change of the content of the parsed string during parsing.

#### XY.4.2 Program documentation ####

No additional dependencies.

### XY.5 Compliance and labeling ###

The phrase "Providing the Recognizer word set" shall be appended
to the label of any standard system that provides all of the
Recognizer word set.

### XY.6 Glossary ###

#### XY.6.1 Recognizer Words ####

**FORTH-RECOGNIZER** ( -- rec-seq-id ) RECOGNIZER \
A system VALUE with a recognizer sequence id.

It is a `VALUE` that can be changed with `TO` to assign a
new recognizer set. This change has immediate effect.

This recognizer set shall be used in all
system level words like `EVALUATE`, `LOAD` etc.

**RECOGNIZE** ( addr len rec-seq-id -- i*x RECTYPE-DATATYPE | RECTYPE-NULL )
RECOGNIZER \

Apply the string at "addr/len" to the elements of the recognizer
set identified by `rec-seq-id`. Terminate the iteration if either
a parsing word returns a data type id that is different from
`RECTYPE-NULL` or the set is exhausted. In the latter case return
`RECTYPE-NULL`.

"i*x" is the result of the parsing word. It represents the data from
the string. It may reside in locations other than the data stack. In this
case the stack diagram should be read accordingly.

**RECTYPE>COMP** ( RECTYPE-DATATYPE -- XT-COMPILE )   RECOGNIZER \

Return the execution token for the compilation action from the
recognizer data type id.

**RECTYPE>INT**  ( RECTYPE-DATATYPE -- XT-INTERPRET ) RECOGNIZER \
Return the execution token for the interpretation action from
the recognizer data type id.

**RECTYPE>POST** ( RECTYPE-DATATYPE -- XT-POSTPONE )  RECOGNIZER \
Return the execution token for the postpone action from the
recognizer data type id.

**RECTYPE-NULL** ( -- RECTYPE-NULL ) RECOGNIZER \
The null data type id. It is to be used if no other
data type id is applicable but one is needed. Its
associated methods perform system specific error
actions. The actual numeric value is system dependent.

**RECTYPE:** ( XT-INTERPRET XT-COMPILE XT-POSTPONE "<spaces>name" -- )
RECOGNIZER \
Skip leading space delimiters. Parse name delimited by a space. Create
a data type id under the name `name` and associate the three execution
tokens.

The words for XT-INTERPRET, XT-COMPILE and XT-POSTPONE are called with
the parsed data `i*x` that e.g. `RECOGNIZE` has returned.

The word behind XT-INTERPRET shall have the stack effect
`( ... i*x -- j*y )`. The words behind XT-COMPILE and XT-POSTPONE shall
consume `i*x`.

The execution time of `name` leaves a cell sized token on the data stack
that can be applied to the `RECTYPE>*` words.

#### XY.6.2 Recognizer Extension Words ####
A Forth system that uses recognizers in the core
has words for numbers and dictionary look-ups.
They shall be named as shown in the table:

<table border="1" class="docutils">
<colgroup>
<col width="16%" />
<col width="84%" />
</colgroup>
<tbody valign="top">
<tr><td>Name</td>
<td>Stack effect</td>
</tr>
<tr><td>`REC-NUM`</td>
<td>`( addr len -- n RECTYPE-NUM | d RECTYPE-DNUM | RECTYPE-NULL )`</td>
</tr>
<tr><td>`REC-FLOAT`</td>
<td>`( addr len -- RECTYPE-FLOAT | RECTYPE-NULL ) (F: -- f | )`</td>
</tr>
<tr><td>`REC-FIND`</td>
<td>`( addr len -- XT +/-1 RECTYPE-XT | RECTYPE-NULL )`</td>
</tr>
<tr><td>`REC-NT`</td>
<td>`( addr len -- NT RECTYPE-NT | RECTYPE-NULL )`</td>
</tr>
</tbody>
</table>

The recognizer type names, if available, shall be as shown in the table below:

<table border="1" class="docutils">
<colgroup>
<col width="21%" />
<col width="45%" />
<col width="34%" />
</colgroup>
<tbody valign="top">
<tr><td>Name</td>
<td>Stack items</td>
<td>Comment</td>
</tr>
<tr><td>`RECTYPE-NUM`</td>
<td>`( -- n RECTYPE-NUM)`</td>
<td>single cell number</td>
</tr>
<tr><td>`RECTYPE-DNUM`</td>
<td>`( -- d RECTYPE-DNUM)`</td>
<td>double cell number</td>
</tr>
<tr><td>`RECTYPE-FLOAT`</td>
<td>`( -- RECTYPE-FLOAT)`
`(F: -- f )`</td>
<td>floating point number</td>
</tr>
<tr><td>`RECTYPE-XT`</td>
<td>`( -- XT +/-1 RECTYPE-XT)`</td>
<td>word from the dictionary
matching `FIND`</td>
</tr>
<tr><td>`RECTYPE-NT`</td>
<td>`( -- NT RECTYPE-NT)`</td>
<td>word from the dictionary
with name token  NT</td>
</tr>
</tbody>
</table>

The following words deal with changing and creating recognizer sequences.

**GET-RECOGNIZER** ( rec-seq-id -- rec-n .. rec-1 n ) RECOGNIZER EXT \
Copy the recognizer sequence `rec-1 .. rec-n` to the data stack. The
element `rec-1` is the first in the sequence.

The source is unchanged.

**SET-RECOGNIZER** ( rec-n .. rec-1 n rec-seq-id -- ) RECOGNIZER EXT \
Replace the recognizer sequence identified by `rec-seq-id` with a
new set of `n` recognizers `rec-x`.

If the capacity of the destination sequence is too small to hold all
new elements, an ambiguous condition exists.

**NEW-RECOGNIZER-SEQUENCE** ( size -- rec-seq-id ) RECOGNIZER EXT \
Create a new, empty recognizer sequence with at least
`size` elements.

### XY.7 Reference Implementation ###
Basic recognizer sequence module. It is implemented as a separate
stack.

```
: STACK ( size -- stack-id )
    1+ ( size ) CELLS HERE SWAP ALLOT
    0 OVER ! \ empty stack
;

: SET-STACK ( item-n .. item-1 n stack-id -- )
  2DUP ! CELL+ SWAP CELLS BOUNDS
  ?DO I ! CELL +LOOP ;

: GET-STACK ( stack-id -- item-n .. item-1 n )
   DUP @ >R R@ CELLS + R@ BEGIN
     ?DUP
   WHILE
     1- OVER @ ROT CELL - ROT
   REPEAT
   DROP R> ;
```

The recognizer sequence uses the stack module. Hence the stack-id becomes the
rec-seq-id.

```
: NEW-RECOGNIZER-SEQUENCE STACK ;
: SET-RECOGNIZER SET-STACK ;
: GET-RECOGNIZER GET-STACK ;

\ create the default recognizer sequence
4 NEW-RECOGNIZER-SEQUENCE VALUE FORTH-RECOGNIZER

\ create a simple 3 element structure
: RECTYPE: ( XT-INTERPRET XT-COMPILE XT-POSTPONE "<spaces>name" -- )
   CREATE SWAP ROT , , ,
;

\ decode the data structure created by RECTYPE:
: RECTYPE>POST ( RECTYPE-TOKEN -- XT-POSTPONE ) CELL+ CELL+ @ ;
: RECTYPE>COMP ( RECTYPE-TOKEN -- XT-COMPILE  )       CELL+ @ ;
: RECTYPE>INT  ( RECTYPE-TOKEN -- XT-INTERPRET)             @ ;

\ the null token
:NONAME -1 ABORT" FAILED" ; DUP DUP RECTYPE: RECTYPE-NULL

\ depends on the stack implementation
: RECOGNIZE   ( addr len rec-seq-id -- i*x RECTYPE-SOMETYPE | RECTYPE-NULL )
    DUP >R @
    BEGIN
      DUP
    WHILE
      DUP CELLS R@ + @
      2OVER 2>R SWAP 1- >R
      EXECUTE DUP RECTYPE-NULL <> IF
        2R> 2DROP 2R> 2DROP EXIT
      THEN
      DROP R> 2R> ROT
    REPEAT
    DROP 2DROP R> DROP RECTYPE-NULL
;
```
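
As a usage illustration of `RECTYPE:` and of the recognizer stack effect, the following sketch defines one data type id and one recognizer for unsigned single-cell numbers in the current `BASE`. The names `demo-noop`, `demo-lit,`, `rectype-demo-unum` and `rec-demo-unum` are hypothetical and not part of the proposal. The first four definitions are repeated from the reference implementation only so that the block loads on its own. Overflow beyond one cell is silently ignored here.

```
\ Repeated from the reference implementation above.
: RECTYPE: ( XT-INTERPRET XT-COMPILE XT-POSTPONE "<spaces>name" -- )
  CREATE SWAP ROT , , , ;
: RECTYPE>COMP ( RECTYPE-TOKEN -- XT-COMPILE )   CELL+ @ ;
: RECTYPE>INT  ( RECTYPE-TOKEN -- XT-INTERPRET ) @ ;
:NONAME -1 ABORT" FAILED" ; DUP DUP RECTYPE: RECTYPE-NULL

\ Hypothetical data type: an unsigned single-cell number.
: demo-noop ( -- ) ;                      \ interpreting: leave u on the stack
: demo-lit, ( u -- ) POSTPONE LITERAL ;   \ compiling: compile u as a literal
' demo-noop ' demo-lit, ' demo-lit, RECTYPE: rectype-demo-unum

\ Hypothetical recognizer: accepts digits of the current BASE, nothing else.
: rec-demo-unum ( addr len -- u rectype-demo-unum | RECTYPE-NULL )
  DUP 0= IF 2DROP RECTYPE-NULL EXIT THEN  \ reject the empty string
  0 0 2SWAP >NUMBER                       ( ud addr' len' )
  NIP IF 2DROP RECTYPE-NULL EXIT THEN     \ unconverted characters remain
  DROP rectype-demo-unum ;                \ keep only the low cell of ud

\ Interpreting: fetch and run the interpret action of the returned token.
S" 42" rec-demo-unum RECTYPE>INT EXECUTE .   \ prints 42 when BASE is decimal
```

The recognizer itself never looks at `STATE`. The caller chooses between `RECTYPE>INT` and `RECTYPE>COMP`, which is the separation the proposal aims for.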

## A.XY Informal Appendix ##

### A.XY.1 Text Interpreter ###

The Forth text interpreter can be changed into a generic tool
that is capable of dealing with any data type. It maintains `STATE`
and calls the data processing methods according to it. The
example is a full replacement if all necessary recognizers are
available.

The algorithm of the Forth text interpreter as described in
section 3.4 is modified. All subsections of 3.4 apply
unchanged. Change steps b) and c) of section 3.4 to make them
optional, since they can be performed with recognizers. Replace step
d) with the following steps d) to f), listed below as 1 to 3:

1. For each element of the recognizer sequence provided by `FORTH-RECOGNIZER`,
  starting with the top element, call its parsing method with the sub-string
  "name" from step a).

  Every parsing method returns an information token and the parsed data from
  the analyzed sub-string if successful.  Otherwise it returns the system
  provided failure token `RECTYPE-NULL` and no further data.

  Continue with the next element in the recognizer set until either all are
  used or the information token returned from the parsing word is not the
  system provided failure token `RECTYPE-NULL`.

2. Use the information token and do one of the following
    1. if interpreting execute the interpret method associated with the
       information token.
    2. if compiling execute the compile method associated with the information
       token.
3. Continue with a)

```
: INTERPRET
  BEGIN
      PARSE-NAME DUP
  WHILE
      FORTH-RECOGNIZER RECOGNIZE
      STATE @ IF RECTYPE>COMP ELSE RECTYPE>INT THEN
      EXECUTE
      ?STACK  \ simple housekeeping
  REPEAT 2DROP
;
```

### A.XY.2 POSTPONE ###

`POSTPONE` compiles the data returned by `RECOGNIZE` (`i*x`)
into the dictionary as literal(s) and appends the compilation action
of the `RECTYPE-TOKEN` data type id. Later at run-time the `i*x`
data is read back and the compilation action is performed as if it
had been called directly at compile time.

```
: POSTPONE ( "name" -- )
  PARSE-NAME FORTH-RECOGNIZER RECOGNIZE DUP >R
  RECTYPE>POST EXECUTE R> RECTYPE>COMP COMPILE, ; IMMEDIATE
```
This implementation assumes a system that uses recognizers only.
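
The two steps described in this section can be tried out on a system without recognizers by performing them by hand for a number. This is only a sketch: `demo-lit,`, `demo-postpone-num`, `[postpone-7]`, `seven-later` and `seven` are hypothetical names, and `demo-lit,` stands in for the compile action of a number data type.

```
\ Stand-in for the compile action of a number data type (hypothetical).
: demo-lit, ( n -- ) POSTPONE LITERAL ;

\ What POSTPONE does with a recognized number, done by hand:
: demo-postpone-num ( n -- )
  POSTPONE LITERAL            \ postpone action: compile n as a literal
  ['] demo-lit, COMPILE, ;    \ then append the compile action

\ Immediate helper so that both steps run in compilation state.
: [postpone-7] ( -- ) 7 demo-postpone-num ; IMMEDIATE

\ Behaves like   : seven-later 7 POSTPONE LITERAL ; IMMEDIATE
: seven-later ( -- ) [postpone-7] ; IMMEDIATE

\ seven-later runs while SEVEN is being compiled and compiles 7 into it.
: seven ( -- n ) seven-later ;
seven .   \ prints 7
```

The literal is stored when `seven-later` is compiled and read back when `seven-later` runs, at which point the compile action places it into `seven`.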

### A.XY.3 Test Cases ###

The test cases assume a stack to implement the recognizer set.

```
T{ 4 NEW-RECOGNIZER-SEQUENCE constant RS -> }T

T{ :NONAME 1 ;  :NONAME 2 ;  :NONAME 3  ; RECTYPE: rectype-1 -> }T
T{ :NONAME 10 ; :NONAME 20 ; :NONAME 30 ; RECTYPE: rectype-2 -> }T

T{ : rec-1 NIP 1 = IF rectype-1 ELSE RECTYPE-NULL THEN ; -> }T
T{ : rec-2 NIP 2 = IF rectype-2 ELSE RECTYPE-NULL THEN ; -> }T

T{ rectype-1 RECTYPE>INT  EXECUTE -> 1 }T
T{ rectype-1 RECTYPE>COMP EXECUTE -> 2 }T
T{ rectype-1 RECTYPE>POST EXECUTE -> 3 }T

\ testing RECOGNIZE
T{         0 RS SET-RECOGNIZER -> }T
T{ S" 1"     RS RECOGNIZE   -> RECTYPE-NULL }T
T{ ' rec-1 1 RS SET-STACK -> }T
T{ S" 1"     RS RECOGNIZE   -> rectype-1 }T
T{ S" 10"    RS RECOGNIZE   -> RECTYPE-NULL }T
T{ ' rec-2 ' rec-1 2 RS SET-STACK -> }T
T{ S" 10"    RS RECOGNIZE   -> rectype-2 }T
```
The dictionary lookup has the following test cases

```
T{ S" DUP" REC-FIND  -> ' DUP -1 RECTYPE-XT }T
T{ S" UNKNOWN WORD" REC-FIND -> RECTYPE-NULL }T
```
The number recognizer has the following checks

```
VARIABLE OLD-BASE BASE @ OLD-BASE !

T{ : S-1234 S" 1234" ; -> }T
T{ : D-1234 S" 1234." ; -> }T
T{ : S-UNKNOWN S" unknown word" ; -> }T
T{ : S-DUP  S" DUP" ; -> }T

T{ S-1234 FORTH-RECOGNIZER RECOGNIZE -> 1234  RECTYPE-NUM   }T
T{ D-1234 FORTH-RECOGNIZER RECOGNIZE -> 1234. RECTYPE-DNUM  }T
T{ S-DUP  FORTH-RECOGNIZER RECOGNIZE -> ' DUP -1 RECTYPE-XT }T
T{ S-UNKNOWN FORTH-RECOGNIZER RECOGNIZE  -> RECTYPE-NULL }T
T{ S" %-10010110" REC-NUM -> -150 RECTYPE-NUM }T
T{ S" %10010110"  REC-NUM ->  150 RECTYPE-NUM }T
T{ S" 'Z'"    REC-NUM -> char Z RECTYPE-NUM }T
T{ S" ABCXYZ" REC-NUM -> RECTYPE-NULL }T

\ check whether BASE is unchanged
T{ BASE @ OLD-BASE @ = -> -1 }T
```

Floating point numbers are handled likewise

```
T{ : S-1234e5 S" 1234e5" ; -> }T
T{ S-1234e5 REC-FLOAT -> 1234e5 RECTYPE-FLOAT }T
T{ S-1234e5 FORTH-RECOGNIZER RECOGNIZE -> 1234e5 RECTYPE-FLOAT }T
```

# Experience #

First ideas to dynamically extend the Forth text interpreter
were published in 2005 at comp.lang.forth by Josh Fuller and J Thomas:
[Additional Recognizers](http://compgroups.net/comp.lang.forth/additional-recognizers/734676)?

A specific solution to deal with number prefixes was
roughly sketched by Anton Ertl at comp.lang.forth in 2007 with
[https://groups.google.com/forum/#!msg/comp.lang.forth/r7Vp3w1xNus/Wre1BaKeCvcJ](https://groups.google.com/forum/#!msg/comp.lang.forth/r7Vp3w1xNus/Wre1BaKeCvcJ)

There are a number of specific solutions that can at least partly be seen
as recognizers in various Forths:

* prefix-detection in ciforth
* W32Forth uses its "chain" concept to achieve similar effects.
* various commercial Forths seem to have ways to extend the
  interpreter.
* FICL, a system close to Forth, has had
  [parse-steps](http://ficl.sourceforge.net/parsesteps.html) since approx
  2001.

A first generic recognizer concept was implemented in amforth
version 4.3 (May 2011). The design presented in this RFD is
implemented with version 5.3 (May 2014). gforth has
recognizers since 2012, the ones described here since June
2014.

Existing recognizers cover a wide range of data formats
like floating point numbers and strings. Others mimic the
back-tick syntax used in many Unix shells to execute OS
sub-processes. A recognizer is used to implement OO
notations.

Most of the small words that constitute a recognizer don't
need a name actually since only their execution tokens are
used. For the major words a naming convention is suggested:
`REC-<name>` for the parsing word, and `RECTYPE-<name>`
for the data type word created with `RECTYPE:` for the data
type "name".

# Acknowledgments #
The following people made major or minor contributions, in
no particular order.

* Bernd Paysan
* Jenny Brien
* Andrew Haley
* Alex McDonald
* Anton Ertl
* Forth 200x Committee


,------------------------------------------
| 2022-09-15 19:25:20  UlrichHoffmann  replies:
| proposal - Recognizer RfD rephrase 2020
| see: https://forth-standard.org/proposals/recognizer-rfd-rephrase-2020#reply-898
`------------------------------------------
**superseded by [minimalistic core API for recognizers proposal](https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers?hideDiff#reply-892)**

## Recognizer RfD rephrase 2020

Author: Ulrich Hoffmann  
Contact: uho@xlerb.de  
Version: 0.8  
Date: 2020-02-24
Status: Published  


#### Preamble

This text is a rephrasing of just sections XY.2, XY.6, XY.7 and parts of A.XY of the original recognizer RfD [1] by Matthias Trute that uses terminology and word names closer to those already present in Forth-94 and Forth-2012.

It is not intended to invalidate the subsequent RfDs B, C or D [2][3][4]. They reflect the ongoing discussion about Forth recognizers and should be considered valuable documentation of that discussion. This text however is intended to revert the recognizer proposal back to simplicity of concepts and terms, making it both easier to understand and use as well as simpler to implement.

This text does *not* add any new functionality to the original proposal. It merely introduces different terms for the structures already existing in the original proposal. The only difference in functionality is the substitution of the defining word RECOGNIZER: of the original proposal by the word RECOGNIZER (note the missing : ) that - similar to the Forth-94 word WORDLIST - creates a recognizer information token and leaves it on the data stack.

Yes - this text has the potential of starting a bikeshedding discussion but as the recognizer concepts seem to be stable over the last couple of years it is about time to agree on appropriate names and notions.

The following table summarizes the different terms and names:


| Term in original proposal | Term used here                     | comment                               |
| ------------------------- | ---------------------------------- | ------------------------------------- |
| recognizer stack          | recognizer-order                   | similar to search-order               |
| information token (rit)   | recognizer information token (rit) | explicit and consistent               |
| DO-RECOGNIZER             | RECOGNIZE                          | avoid hyphen in name                  |
| RECOGNIZER:               | RECOGNIZER                         | similar to WORDLIST, no defining word |
| R:FAIL                    | UNRECOGNIZED                       | no : in name, better English          |
| REC:xxx                   | recognize-xxx                      | no : in name, better English          |
| R:xxx                     | xxx-recognized                     | no : in name, better English          |


#### Items to discuss

1. Programs that use the word RECOGNIZE (e.g. user-defined text interpreters) most likely need to use the interpret/compile/postpone xts of the returned recognizer information token. For these programs to be portable among standard systems appropriate access words would need to be standardized. [3] and [4] propose such words. Without these access words standardizing the word RECOGNIZE is doubtful. Only standardizing the modified (internal) text interpreter behavior would be sufficient then.

2. The word RECOGNIZER (and the corresponding defining word RECOGNIZER: of [1]) creates the opaque structure *recognizer information token*. As an alternative *recognizer information token*s could be defined - similar to addresses of counted strings (c-addr) - as special addresses and the structure of memory at that address could be exposed. *Recognizer information token*s could then be created by already existing standard words such as CREATE ALLOT ALLOCATE and would have a known layout, e.g. three xts in sequence: { INTERPRET-XT | COMPILE-XT | POSTPONE-XT }. The access words of 1. would not need to be standardized as each standard program could access the xts using already existing standard words for memory access.

3. The word RECOGNIZER (and the corresponding defining word RECOGNIZER: of [1]) despite its name does not create a recognizer (i.e. a parsing-word plus possibly several recognizer information tokens) but a single recognizer information token (triple of interpret/compile/postpone xts characterized by a single-cell value). Another name might reflect this functionality better.

4. Changes in the standard text interpreter (i.e. that it invokes the word RECOGNIZE internally) have implications for many other words apart from MARKER (e.g. ' ['] EVALUATE INCLUDE-FILE INCLUDED ...). Changes in their behaviour should be mentioned in the proposal. [2] proposes explicit changes for ' ['] MARKER while [3] and [4] have a paragraph describing the implication generally and do not propose e.g. MARKER changes explicitly. 

5. *Recognizer information tokens* (triple of interpret/compile/postpone xts characterized by a single-cell value) could be named more appropriately. [4] proposes a different name *data type id* that does not seem to be appropriate. Its general notion seems to mislead into the direction of Forth having a data type system.  
From a classical computer science view recognizers act in the lexical analysis (*scanner*) phase of a compiler, operating on sequences of characters, detecting appropriate *lexemes* (character subsequences of the input stream) and converting them to *tokens*. Several lexemes might map to the same token (e.g. different sequences of digits map to the token NUM) along with so-called *attributes* (e.g. the value of the number). For this reason tokens are sometimes also called *token classes* or *token types* or the *kind* of the token. These might be good alternative names instead of *recognizer information token* or *data type id*. Forth-94 and Forth-2012 use the term ID (as in wordlist-id or file-id) to define characterizing single-cell values so going along the xxx-id would be consistent with existing standard terms. (maybe *recognizer-token-id*)? 
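
Item 2 can be made concrete with a small sketch. It assumes the layout of three consecutive xts named in that item; `demo-noop`, `demo-lit,` and `demo-recognized` are hypothetical names, and nothing here is taken from any of the RfDs.

```
\ Hypothetical actions for a single-cell number.
: demo-noop ( -- ) ;
: demo-lit, ( n -- ) POSTPONE LITERAL ;

\ A recognizer information token built with CREATE and , only.
\ Assumed layout: { INTERPRET-XT | COMPILE-XT | POSTPONE-XT }
CREATE demo-recognized ( -- rit )
  ' demo-noop ,  ' demo-lit, ,  ' demo-lit, ,

\ Access with plain memory words; no dedicated access words are involved.
demo-recognized @             ( interpret-xt )  DROP
demo-recognized CELL+ @       ( compile-xt )    DROP
demo-recognized 2 CELLS + @   ( postpone-xt )   DROP

\ Running the interpret action on a parsed number leaves it unchanged.
42 demo-recognized @ EXECUTE .   \ prints 42 when BASE is decimal
```

The price of this approach is that the layout becomes part of the interface, so a system could no longer choose a different internal representation for the token.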
 
#### References

[1] *Forth Recognizer -- Request For Discussion, Version 1*, Matthias Trute, 2014-10-03, access at <http://amforth.sourceforge.net/pr/Recognizer-rfc.pdf>

[2] *Forth Recognizer -- Request For Discussion, Version 2*, Matthias Trute, 2015-09-20, access at <http://amforth.sourceforge.net/pr/Recognizer-rfc-B.pdf>

[3] *Forth Recognizer -- Request For Discussion, Version 3*, Matthias Trute, 2016-09-04, access at <http://amforth.sourceforge.net/pr/Recognizer-rfc-C.pdf>

[4] *Forth Recognizer -- Request For Discussion, Version 4*, Matthias Trute, 2018-08-02, access at <http://amforth.sourceforge.net/pr/Recognizer-rfc-D.pdf>

---
## Proposal

....

### XY.2 Additional terms and notations

**Recognizer Information Token**: An implementation-dependent single-cell value that identifies the data type and a method table to perform the data processing of the interpreter. A naming convention suggests that the names end with *-recognized*.
Recognizer Information Tokens are abbreviated *rit* in stack comments.

**Recognizer**: A combination of a text parsing word that returns recognizer information tokens together with parsed data if successful. The text parsing word is assumed to run in cooperation with SOURCE and >IN. A naming convention suggests that the names start with *recognize-*.

...

### XY.6 Glossary

### XY.6.1 Recognizer words

**RECOGNIZE** 
( addr len -- i*x rit | UNRECOGNIZED ) RECOGNIZER

Apply the recognizers in the recognizer-order to the string at "addr/len" one after the other. Terminate the iteration if either a recognizer returns a recognizer information token *rit* that is different from UNRECOGNIZED or the recognizer-order is exhausted. If the recognizer-order is exhausted, return UNRECOGNIZED; otherwise return *rit*.

"i*x" is the result of the parsing word. It may be on other locations than the data stack. In this case the stack diagram should be read accordingly.
 
It is an ambiguous condition if the recognizer-order is empty.

----

**GET-RECOGNIZERS**
 ( -- rec-n .. rec-1 n ) RECOGNIZER

Return the execution tokens rec-1 .. rec-n of the parsing words in the recognizer-order. rec-1 identifies the recognizer that is called first and rec-n the execution token of the word that is called last.

The recognizer-order is unaffected.

----

**MARKER** ( "<spaces>name" -- ) RECOGNIZER

Extend MARKER to include the current recognizer-order in the state preservation.

----

**UNRECOGNIZED** ( -- UNRECOGNIZED ) RECOGNIZER

A constant cell sized recognizer information token with two uses: first it is used to deliver the information that a specific recognizer could not deal with the string passed to it. Second it is a predefined recognizer information token whose elements are used when no recognizer from the recognizer-order could handle the passed string. These methods provide the system error actions.

The actual numeric value is system dependent and has no predictable value.

----

**RECOGNIZER** ( XT-INTERPRET XT-COMPILE XT-POSTPONE -- rit ) RECOGNIZER

Create a recognizer information token *rit* with the three execution tokens XT-INTERPRET XT-COMPILE XT-POSTPONE. The implementation is system dependent.

The words for XT-INTERPRET, XT-COMPILE and XT-POSTPONE are called with the parsed data that the associated parsing word of the recognizer returned. The information token itself is consumed by the interpreter.

----

**SET-RECOGNIZERS** ( rec-n .. rec-1 n -- ) RECOGNIZER

Set the recognizer-order to the recognizers identified by the execution tokens of their parsing words rec-n .. rec-1. rec-1 will be the parsing word of the recognizer that is called first, rec-n will be the last one. 

It is an ambiguous condition, if n is not a positive number.

### XY.7 Reference Implementation

    \ create a simple 3 element structure
    \ rit           : XT-INTERPRET
    \ rit CELL+     : XT-COMPILE
    \ rit 2 CELLS + : XT-POSTPONE
    : RECOGNIZER ( XT-INTERPRET XT-COMPILE XT-POSTPONE -- rit )
        HERE >R SWAP ROT , , , R> ;
        
    \ system failure recognizer
    : notfound ( i*x -- )  -13 THROW ;
    
    ' notfound  ' notfound  ' notfound RECOGNIZER CONSTANT UNRECOGNIZED
    
    \ contains the recognizer-order
    \ first cell is the current number of recognizers.
    10 CELLS BUFFER: recognizer-order
    0 recognizer-order !
   
    : SET-RECOGNIZERS ( rec-n .. rec-1 n -- )
        DUP recognizer-order !
        BEGIN
          DUP
        WHILE
          DUP CELLS recognizer-order +
          ROT SWAP ! 1-
        REPEAT DROP 
    ;
    
    : GET-RECOGNIZERS ( -- rec-n .. rec-1 n )
        recognizer-order @ recognizer-order
        BEGIN
          CELL+ OVER
        WHILE
          DUP @ ROT 1- ROT
        REPEAT 2DROP
        recognizer-order @
    ;
    
    : RECOGNIZE ( addr len -- i*x rit | UNRECOGNIZED )
        recognizer-order @
        BEGIN
          DUP
        WHILE
          DUP CELLS recognizer-order + @
          2OVER 2>R SWAP 1- >R
          EXECUTE DUP UNRECOGNIZED <> IF R> DROP 2R> 2DROP EXIT THEN DROP
          R> 2R> ROT
        REPEAT
        DROP 2DROP
        UNRECOGNIZED
    ;

#### POSTPONE

POSTPONE is outside the Forth interpreter:

    : POSTPONE ( "<spaces>name" -- )
       BL WORD COUNT
       RECOGNIZE
       2 CELLS + @ ( post ) \ get the XT-POSTPONE from recognizer
       EXECUTE
    ; IMMEDIATE

...

### A.XY Informal Annex

#### A.XY.1 Forth Text Interpreter

The Forth text interpreter turns into a generic tool that is capable of dealing with any data type. It maintains STATE and calls the data processing methods according to it.

##### INTERPRETER

    : PARSE-NAME ( -- addr u ) BL WORD COUNT ;

    : INTERPRET ( i*x -- j*x )
        BEGIN
          PARSE-NAME ?DUP 0= IF DROP EXIT THEN \ no more words?
          RECOGNIZE
          STATE @ IF  CELL+ @  ( comp ) ELSE @ ( interp ) THEN \ get the right XT
          EXECUTE \ do the action
          ?STACK \ simple housekeeping
        AGAIN 
    ;


#### A.XY.2 Example Recognizers

##### Word recognizer

    \ find-name is close to FIND. amforth specific.
    256 BUFFER: find-name-buf
    
    : place ( c-addr1 u c-addr2 )
       2DUP C! CHAR+ SWAP MOVE ;
   
    : find-name ( addr len -- xt +/-1 | 0 )
       find-name-buf place
       find-name-buf
       FIND DUP 0= IF NIP THEN ;

    : immediate? ( flags -- true|false ) 0> ;
        
    \ Define word recognizer
    
    \ INTERPRET
    :NONAME ( i*x XT flags -- j*y )
      DROP EXECUTE ;
   
    \ COMPILE
    :NONAME ( XT flags -- )
      immediate?
      IF EXECUTE ELSE COMPILE, THEN ; \ immediate words run, others are compiled
   
    \ POSTPONE
    :NONAME ( XT flags -- )
      immediate?
      IF COMPILE, ELSE POSTPONE LITERAL POSTPONE COMPILE, THEN ;
 
    RECOGNIZER CONSTANT word-recognized

    \ parsing word for word recognizer
    : recognize-word ( addr len -- XT flags rit | UNRECOGNIZED )
       find-name ( addr len -- XT flags | 0 )
       ?DUP IF word-recognized ELSE UNRECOGNIZED THEN ;

    \ prepend the word recognizer to the recognizer-order
    GET-RECOGNIZERS ' recognize-word SWAP 1+ SET-RECOGNIZERS
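
A second example may help show how the three methods differ for plain data. The following is only a sketch, assuming the reference implementation layout from XY.7; the names `lit,`, `num-recognized` and `recognize-num` are hypothetical and not part of the proposal. It handles unsigned single-cell numbers in the current BASE only (no sign, no double-cell numbers, no prefixes). The guarded block repeats two definitions from XY.7 so that the sketch also loads on its own.

```forth
\ Hypothetical sketch: recognizer for unsigned single-cell numbers.
\ Repeat the XY.7 definitions only if they are not already loaded.
[UNDEFINED] RECOGNIZER [IF]
: RECOGNIZER ( XT-INTERPRET XT-COMPILE XT-POSTPONE -- rit )
    HERE >R SWAP ROT , , , R> ;
: notfound ( i*x -- )  -13 THROW ;
' notfound  ' notfound  ' notfound RECOGNIZER CONSTANT UNRECOGNIZED
[THEN]

\ Compile n as a literal into the current definition.
: lit, ( n -- )  POSTPONE LITERAL ;

:NONAME ( n -- n ) ;            \ XT-INTERPRET: leave n on the stack
' lit,                          \ XT-COMPILE: compile n as a literal
:NONAME ( n -- )                \ XT-POSTPONE: compile code that compiles n
    POSTPONE LITERAL  ['] lit, COMPILE, ;
RECOGNIZER CONSTANT num-recognized

\ Parsing word: accept the string only if >NUMBER converts all of it.
: recognize-num ( addr len -- n num-recognized | UNRECOGNIZED )
    DUP 0= IF 2DROP UNRECOGNIZED EXIT THEN
    0 0 2SWAP >NUMBER           ( ud addr' len' )
    NIP IF 2DROP UNRECOGNIZED EXIT THEN  \ unconverted characters remain
    DROP num-recognized ;       \ keep the low cell of ud
```

With the word recognizer above also loaded, `' recognize-num ' recognize-word 2 SET-RECOGNIZERS` would make the word recognizer rec-1 (tried first) and the number recognizer rec-2 (tried last), which matches the usual "words before numbers" order.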

----

**end of document**


,------------------------------------------
| 2022-09-15 19:34:09  UlrichHoffmann  replies:
| proposal - CS-DROP  (revised 2018-08-20) 
| see: https://forth-standard.org/proposals/cs-drop-revised-2018-08-20-#reply-899
`------------------------------------------
Please refer to the 2019-08-22 version below.


,------------------------------------------
| 2022-09-15 19:34:14  BerndPaysan  replies:
| proposal - Remove the “rules of FIND”
| see: https://forth-standard.org/proposals/remove-the-rules-of-find-#reply-900
`------------------------------------------
Replace the text in [DEFINED]

> Return a true flag if *name* is the name of a word that can be found (according to the rules in the system's FIND); otherwise return a false flag.

with

> Try to find *name.*  Return a true flag if *name* can be found; otherwise return a false flag.

Add the following redefinition of the term “find” to 16.2:

> **find:**
> To search the search order or a specified wordlist for a definition name matching a given string.

Cross-reference 2.1 find and 16.2 find.


,------------------------------------------
| 2022-09-15 19:49:25  UlrichHoffmann  replies:
| proposal - 2022 Standards meeting agenda
| see: https://forth-standard.org/proposals/2022-standards-meeting-agenda#reply-901
`------------------------------------------
# Forth Standards Meeting Draft (1) Agenda
14-15 Sept 2022 15:00-19:00 UTC

Online - for latest details see [chat.forth-standard.org](https://chat.forth-standard.org)

See also: [euro.theforth.net](https://euro.theforth.net), [forth-standard.org](https://forth-standard.org)

**Wednesday, 14th September (UTC)**

   - 14:30 Get together - Set up your gear and small talk
   - 14:50 Call to order - get ready (please be online by now)
   - 15:00 Session 3
   - 17:00 Bio Break
   - 17:15 Session 4
   - 19:00 End of main session
   - Workshops

**Thursday, 15th September (UTC)**

   - Workshops
   - 14:30 Get together - Set up your gear and small talk
   - 14:50 Call to order - get ready (please be online by now)
   - 15:00 Session 5
   - 17:00 Bio Break
   - 17:15 Session 6
   - 19:00 End of Standards Meeting

Friday: euroForth conference

--- 

## Agenda 

2022-09-08

## Participants
   1. Welcome
   2. Determine the persons present
   3. Meeting transcript

## Review of Procedures
   1. How we organize this meeting

   2. Progress of current work  
      - Draft Document update (last draft is from 2019) 
      - Are we ready for a new Standard snapshot (Forth2023)?
      - How can we speed up our work?
      - How can we better serve the Forth community?  
      - How can we encourage Forthers to submit proposals?  

      Pending Topics include:  
      with some progress:
        - recognizers
        - multi-threaded multitasking
      with little progress:
        - memory access (16/32/64-Bit, RAM/ROM)
        - reduce ambiguous conditions

   3. What are additional topics for future standardisation?
 
## Reports
   1. Chair
   2. Editor
   3. Technical
   4. Treasurer

## Election/Confirmation of officers

If you would like to stand for election, please put your name forward and briefly introduce yourself.

We have to elect (by secret ballot) a Chair, Editor, Technical Officer and a Treasurer.

   1. Chair (currently Ulrich Hoffmann)
   2. Editor (currently Peter Knaggs)
   3. Technical (currently Gerald Wodni)
   4. Treasurer (currently Bernd Paysan)

## Review of Proposals/Contributions


### Proposals from [forth-standard.org/proposals](https://forth-standard.org/proposals)

#### Proposals in the state *formal*

1. Specify that 0 THROW pops the 0 ([https://forth-standard.org/proposals/specify-that-0-throw-pops-the-0#reply-794](https://forth-standard.org/proposals/specify-that-0-throw-pops-the-0#reply-794))
2022-02-19 19:04:45 - AntonErtl

#### Proposals in the state *voting*

1. PLACE +PLACE ([https://forth-standard.org/proposals/place-place#reply-745](https://forth-standard.org/proposals/place-place#reply-745))
2021-09-08 21:15:27 - UlrichHoffmann

#### Proposals in the state *informal*

We have a lot of informal proposals with open status (moved to an appendix at the end for clarity).  
We should discuss how best to handle them.  

### Contributions on [forth-standard.org](https://forth-standard.org) since last meeting

There are a lot of contributions since the interim March meeting. Find them in the appendix.


## Workshop Topics

Workshops are topics for discussion outside the formal meeting. We will collect topics on the fly during the meeting's discussions.

## Consideration of proposals + CfV votes

- Which proposals should go for vote?
- Any topics for proposals in the pipeline?


## Workshop reports

Let's collect the results of our workshops.

## Matters arising

What's up?

## Any other business

Something else?

## Date of next meeting

When shall we three meet again?
In thunder, lightning, or in rain?


---

# Appendix to *Review of Proposals/Contributions*

## Proposals in the state *informal* (most recent first)

1. Pronounciations ([pronounciations #261](https://forth-standard.org/proposals/pronounciations#contribution-261))
2022-08-19 18:00:05 - AntonErtl

2. Exclude zero from the data types that are identifiers ([exclude-zero-from-the-data-types-that-are-identifiers #252](https://forth-standard.org/proposals/exclude-zero-from-the-data-types-that-are-identifiers#contribution-252))
2022-08-13 23:24:52 - ruv

3. Clarification for execution token ([clarification-for-execution-token #251](https://forth-standard.org/proposals/clarification-for-execution-token#contribution-251))
2022-08-13 20:16:29 - ruv

4. Formatting: spaces in data type symbols ([formatting-spaces-in-data-type-symbols #250](https://forth-standard.org/proposals/formatting-spaces-in-data-type-symbols#contribution-250))
2022-08-12 15:04:29 - ruv

5. Revert rewording the term "execution token" ([revert-rewording-the-term-execution-token- #249](https://forth-standard.org/proposals/revert-rewording-the-term-execution-token-#contribution-249))
2022-08-12 14:18:35 - ruv

6. Better wording for "Glossary notation" ([better-wording-for-glossary-notation- #215](https://forth-standard.org/proposals/better-wording-for-glossary-notation-#contribution-215))
2021-09-24 11:33:41 - ruv

7. Better wording for "data field" term ([better-wording-for-data-field-term #214](https://forth-standard.org/proposals/better-wording-for-data-field-term#contribution-214))
2021-09-14 08:55:49 - ruv

8. Tick and undefined execution semantics - 2 ([tick-and-undefined-execution-semantics-2 #212](https://forth-standard.org/proposals/tick-and-undefined-execution-semantics-2#contribution-212))
2021-09-08 10:15:49 - StephenPelc

9. EMIT and non-ASCII values ([emit-and-non-ascii-values #184](https://forth-standard.org/proposals/emit-and-non-ascii-values#contribution-184))
2021-04-03 15:34:40 - AntonErtl

10. Tick and undefined execution semantics ([tick-and-undefined-execution-semantics #163](https://forth-standard.org/proposals/tick-and-undefined-execution-semantics#contribution-163))
2020-10-29 00:28:43 - ruv

11. Common terminology for recognizers discurse and specifications ([common-terminology-for-recognizers-discurse-and-specifications #161](https://forth-standard.org/proposals/common-terminology-for-recognizers-discurse-and-specifications#contribution-161))
2020-09-07 13:56:43 - ruv

12. minimalistic core API for recognizers ([https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-867](https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-867))
    
13. An alternative to the RECOGNIZER proposal ([https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-493](https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-493))
2020-09-05 15:09:39 AndrewHaley

14. Call for Vote - Ambiguous condition in 16.3.3 ([https://forth-standard.org/proposals/call-for-vote-ambiguous-condition-in-16-3-3#reply-460](https://forth-standard.org/proposals/call-for-vote-ambiguous-condition-in-16-3-3#reply-460))
2020-09-02 11:16:03 - StephenPelc
   
15. XML Forth Standard - migration from LaTeX to DocBook ([xml-forth-standard-migration-from-latex-to-docbook #154](https://forth-standard.org/proposals/xml-forth-standard-migration-from-latex-to-docbook#contribution-154))
2020-09-01 21:16:26 - GeraldWodni
     
16. Nestable Recognizer Sequences ([nestable-recognizer-sequences #149](https://forth-standard.org/proposals/nestable-recognizer-sequences#contribution-149))
2020-08-22 16:09:52 - AntonErtl    

17. OPTIONAL IEEE 754 BINARY FLOATING-POINT WORD SET ([https://forth-standard.org/proposals/optional-ieee-754-binary-floating-point-word-set#reply-420](https://forth-standard.org/proposals/optional-ieee-754-binary-floating-point-word-set#reply-420))
2020-08-24 23:38:37 - KrishnaMyneni
      
18. Recognizer ([recognizer #142](https://forth-standard.org/proposals/recognizer#contribution-142))
2020-07-20 20:36:30 - BerndPaysan

19. Same name token for different words ([same-name-token-for-different-words #136](https://forth-standard.org/proposals/same-name-token-for-different-words#contribution-136))

20. Recognizer RfD rephrase 2020 ([recognizer-rfd-rephrase-2020 #131](https://forth-standard.org/proposals/recognizer-rfd-rephrase-2020#contribution-131))

21. NAME>INTERPRET wording ([name-interpret-wording #129](https://forth-standard.org/proposals/name-interpret-wording#contribution-129))
2020-02-20 09:55:14 - ruv

22. Clarify FIND, more classic approach ([https://forth-standard.org/proposals/clarify-find-more-classic-approach#reply-682](https://forth-standard.org/proposals/clarify-find-more-classic-approach#reply-682))
2019-10-08 11:01:25 - ruv

23. Remove the “rules of FIND” ([https://forth-standard.org/proposals/remove-the-rules-of-find-#reply-465](https://forth-standard.org/proposals/remove-the-rules-of-find-#reply-465))
2019-09-12 09:09:51 - BerndPaysan
 
24. Case insensitivity ([case-insensitivity #114](https://forth-standard.org/proposals/case-insensitivity#contribution-114))
2019-09-06 18:27:48 - AntonErtl

25. CS-DROP (revised 2019-08-22) ([https://forth-standard.org/proposals/cs-drop-revised-2019-08-22-#reply-471](https://forth-standard.org/proposals/cs-drop-revised-2019-08-22-#reply-471))
2019-09-06 08:24:28 - UlrichHoffmann
    
26. Right-justified text output ([right-justified-text-output #101](https://forth-standard.org/proposals/right-justified-text-output#contribution-101))
2019-08-01 22:07:03 - mcondron        
    
27. Executing compilation semantics ([executing-compilation-semantics #94](https://forth-standard.org/proposals/executing-compilation-semantics#contribution-94))
2019-07-12 04:16:14 - ruv

28. Revise Rationale of Buffer: ([https://forth-standard.org/proposals/revise-rationale-of-buffer-#reply-247](https://forth-standard.org/proposals/revise-rationale-of-buffer-#reply-247))
2019-07-06 15:45:25 AntonErtl

29. F>R and FR> to support dynamically-scoped floating point variables ([f-r-and-fr-to-support-dynamically-scoped-floating-point-variables #75](https://forth-standard.org/proposals/f-r-and-fr-to-support-dynamically-scoped-floating-point-variables#contribution-75))
2019-03-03 06:20:52 - kc5tja

30. Case sensitivity ([case-sensitivity #73](https://forth-standard.org/proposals/case-sensitivity#contribution-73))
2018-11-03 13:15:53 - ruv

31. Revised Proposal Process ([revised-proposal-process #71](https://forth-standard.org/proposals/revised-proposal-process#contribution-71))
2018-09-21 06:49:42 - PeterKnaggs

32. Multi-Tasking Proposal ([https://forth-standard.org/proposals/multi-tasking-proposal#reply-186](https://forth-standard.org/proposals/multi-tasking-proposal#reply-186))
2018-09-06 17:19:38 - AndrewHaley
    
33. CS-DROP (revised 2018-08-20) ([https://forth-standard.org/proposals/cs-drop-revised-2018-08-20-#reply-302](https://forth-standard.org/proposals/cs-drop-revised-2018-08-20-#reply-302))
2018-08-20 20:22:25 - UlrichHoffmann

34. S( "Request for Discussion" (revised 2018-08-16) ([s-request-for-discussion-revised-2018-08-16- #65](https://forth-standard.org/proposals/s-request-for-discussion-revised-2018-08-16-#contribution-65))
2018-08-17 16:27:53 - UlrichHoffmann
    
35. Let us adopt the Gerry Jackson test suite as part of Forth 200x ([let-us-adopt-the-gerry-jackson-test-suite-as-part-of-forth-200x #63](https://forth-standard.org/proposals/let-us-adopt-the-gerry-jackson-test-suite-as-part-of-forth-200x#contribution-63))
2018-07-10 14:38:46 - StephenPelc

36. Tighten the specification of SYNONYM (version 1) ([tighten-the-specification-of-synonym-version-1- #60](https://forth-standard.org/proposals/tighten-the-specification-of-synonym-version-1-#contribution-60))
2018-06-08 10:09:18 - GerryJackson

37. EXCEPTION LOCALs ([exception-locals #36](https://forth-standard.org/proposals/exception-locals#contribution-36))
2017-10-28 07:04:49 - AndrewRead

38. BL rationale is wrong ([bl-rationale-is-wrong #34](https://forth-standard.org/proposals/bl-rationale-is-wrong#contribution-34))
2017-10-25 11:35:46 - AntonErtl

39. The value of STATE should be restored ([the-value-of-state-should-be-restored #32](https://forth-standard.org/proposals/the-value-of-state-should-be-restored#contribution-32))
2017-09-03 11:07:49 - AlexDyachenko

40. Core-ext S\"; should reference File-ext S\"; ([core-ext-s-should-reference-file-ext-s- #29](https://forth-standard.org/proposals/core-ext-s-should-reference-file-ext-s-#contribution-29))
2017-04-16 08:03:17 - AntonErtl

41. Implementations requiring BOTH 32 bit single floats and 64 bit double floats. ([implementations-requiring-both-32-bit-single-floats-and-64-bit-double-floats- #26](https://forth-standard.org/proposals/implementations-requiring-both-32-bit-single-floats-and-64-bit-double-floats-#contribution-26))
2016-12-21 14:39:40 - zhtoor

42. Directory experiemental proposal ([https://forth-standard.org/proposals/directory-experiemental-proposal#reply-59](https://forth-standard.org/proposals/directory-experiemental-proposal#reply-59))
2016-12-12 15:42:57 - GeraldWodni

43. DEFER this not :-) ([defer-this-not- #22](https://forth-standard.org/proposals/defer-this-not-#contribution-22))
2016-09-02 16:14:36 - enoch

44. WLSCOPE -- wordlists switching made easier ([wlscope-wordlists-switching-made-easier #20](https://forth-standard.org/proposals/wlscope-wordlists-switching-made-easier#contribution-20))
2016-06-18 04:19:03 - enoch

### Contributions on [forth-standard.org](https://forth-standard.org) since last meeting (most recent first)

1. Etymology of SYNONYM ([tools, SYNONYM #267](https://forth-standard.org/standard/tools/SYNONYM#contribution-267))
2022-09-07 21:15:16 - AntonErtl

2. Support several versions of the standard in parallel ([ #266](https://forth-standard.org/meta-discussion#contribution-266))
2022-09-07 11:41:37 - ruv

3. Bogus Test Case for SAVE-INPUT ([core, SAVE-INPUT #265](https://forth-standard.org/standard/core/SAVE-INPUT#contribution-265))
2022-09-06 14:36:57 - flaagel

4. Incorrect Test Pattern ([file, SOURCE-ID #264](https://forth-standard.org/standard/file/SOURCE-ID#contribution-264))
2022-09-06 14:26:04 - flaagel

5. Test Proposal ([test-proposal #263](https://forth-standard.org/proposals/test-proposal#contribution-263))
2022-08-28 19:24:27 - GeraldWodni

6. &gt;NUMBER Test Patterns ([core, toNUMBER #262](https://forth-standard.org/standard/core/toNUMBER#contribution-262))
2022-08-28 11:10:27 - flaagel

7. Pronounciations ([pronounciations #261](https://forth-standard.org/proposals/pronounciations#contribution-261))
2022-08-19 18:00:05 - AntonErtl

8. Exception word set is not optional any more ([exception #260](https://forth-standard.org/standard/exception#contribution-260))
2022-08-18 13:50:15 - ruv

9. Should QUIT propagate exceptions? ([core, QUIT #259](https://forth-standard.org/standard/core/QUIT#contribution-259))
2022-08-18 12:09:37 - ruv

10. Pronounciation ([xchar, PlusXDivSTRING #258](https://forth-standard.org/standard/xchar/PlusXDivSTRING#contribution-258))
2022-08-15 14:07:41 - AntonErtl

11. Pronounciation ([xchar, MinusTRAILING-GARBAGE #257](https://forth-standard.org/standard/xchar/MinusTRAILING-GARBAGE#contribution-257))
2022-08-15 14:04:40 - AntonErtl

12. Pronounciation ([double, DUless #256](https://forth-standard.org/standard/double/DUless#contribution-256))
2022-08-15 13:51:36 - AntonErtl

13. Pronounciation ([float, FtoS #255](https://forth-standard.org/standard/float/FtoS#contribution-255))
2022-08-15 13:48:50 - AntonErtl

14. Pronounciation ([float, StoF #254](https://forth-standard.org/standard/float/StoF#contribution-254))
2022-08-15 13:29:09 - AntonErtl

15. Pronounciation ([xchar, XSTRINGMinus #253](https://forth-standard.org/standard/xchar/XSTRINGMinus#contribution-253))
2022-08-14 17:47:05 - AntonErtl

16. Exclude zero from the data types that are identifiers ([exclude-zero-from-the-data-types-that-are-identifiers #252](https://forth-standard.org/proposals/exclude-zero-from-the-data-types-that-are-identifiers#contribution-252))
2022-08-13 23:24:52 - ruv

17. Clarification for execution token ([clarification-for-execution-token #251](https://forth-standard.org/proposals/clarification-for-execution-token#contribution-251))
2022-08-13 20:16:29 - ruv

18. Formatting: spaces in data type symbols ([formatting-spaces-in-data-type-symbols #250](https://forth-standard.org/proposals/formatting-spaces-in-data-type-symbols#contribution-250))
2022-08-12 15:04:29 - ruv

19. Revert rewording the term "execution token" ([revert-rewording-the-term-execution-token- #249](https://forth-standard.org/proposals/revert-rewording-the-term-execution-token-#contribution-249))
2022-08-12 14:18:35 - ruv

20. Implementing COMPILE, via EXECUTE ([core, COMPILEComma #248](https://forth-standard.org/standard/core/COMPILEComma#contribution-248))
2022-08-12 10:21:25 - ruv

21. Better API for multitasking ([multi-tasking-proposal #247](https://forth-standard.org/proposals/multi-tasking-proposal#contribution-247))
2022-07-18 00:03:00 - ruv

22. Ambiguous conition for MARKER ([core, MARKER #246](https://forth-standard.org/standard/core/MARKER#contribution-246))
2022-07-16 10:55:40 - ruv

23. :NONAME Primitives ([core, ColonNONAME #245](https://forth-standard.org/standard/core/ColonNONAME#contribution-245))
2022-07-05 16:22:37 - flaagel

24. Interactions with MARKER and KILL-TASK ([multi-tasking-proposal #244](https://forth-standard.org/proposals/multi-tasking-proposal#contribution-244))
2022-06-25 15:54:21 - kc5tja

25. Stack Sizes? ([multi-tasking-proposal #243](https://forth-standard.org/proposals/multi-tasking-proposal#contribution-243))
2022-06-25 15:38:51 - kc5tja

26. Round-robin vs Preemptive ([multi-tasking-proposal #242](https://forth-standard.org/proposals/multi-tasking-proposal#contribution-242))
2022-06-25 15:26:37 - kc5tja

27. Suggested reference implementation ROT ([core, ROT #241](https://forth-standard.org/standard/core/ROT#contribution-241))
2022-06-23 21:59:20 - poggingfish

28. Suggested reference implementation R@ ([core, RFetch #240](https://forth-standard.org/standard/core/RFetch#contribution-240))
2022-06-20 17:40:33 - poggingfish

29. Suggested reference implementation 2* ([core, TwoTimes #239](https://forth-standard.org/standard/core/TwoTimes#contribution-239))
2022-06-20 17:34:22 - poggingfish

30. Same execution token ([usage #238](https://forth-standard.org/standard/usage#contribution-238))
2022-06-13 22:40:38 - ruv

31. 3.4.5 conflicts with [: … ;] ([usage #237](https://forth-standard.org/standard/usage#contribution-237))
2022-05-11 12:45:05 - AtH

32. Trigonmetric Functions in Forth ([float, FSIN #236](https://forth-standard.org/standard/float/FSIN#contribution-236))
2022-04-11 17:49:49 - OldSpoon

33. F.3 Seems in Error ([testsuite #235](https://forth-standard.org/standard/testsuite#contribution-235))
2022-04-08 17:45:25 - JimPeterson

34. Possible Reference Implementation ([core, ALIGN #234](https://forth-standard.org/standard/core/ALIGN#contribution-234))
2022-04-05 17:44:08 - JimPeterson

35. Possible Reference Implementation ([core, MIN #233](https://forth-standard.org/standard/core/MIN#contribution-233))
2022-04-05 14:43:43 - JimPeterson

36. Possible Reference Implementation ([core, VARIABLE #232](https://forth-standard.org/standard/core/VARIABLE#contribution-232))
2022-04-05 14:05:53 - JimPeterson

37. Double> ([core, MTimes #231](https://forth-standard.org/standard/core/MTimes#contribution-231))
2022-04-04 21:04:46 - AdrianMcMenamin

38. Question about final test ([core, UMTimes #230](https://forth-standard.org/standard/core/UMTimes#contribution-230))
2022-04-02 21:58:34 - AdrianMcMenamin

39. inconsistent naming ([search, FORTH-WORDLIST #229](https://forth-standard.org/standard/search/FORTH-WORDLIST#contribution-229))
2022-03-18 14:04:40 - LSchmidt

40. Accessing Remaining Data Stack? ([locals #228](https://forth-standard.org/standard/locals#contribution-228))
2022-03-08 20:58:26 - JimPeterson

41. Contradiction With do-loops ([locals #227](https://forth-standard.org/standard/locals#contribution-227))
2022-03-08 20:39:59 - JimPeterson

42. c-addr used in stack diagrams ([core, Cq #226](https://forth-standard.org/standard/core/Cq#contribution-226))
2022-03-06 21:06:16 - LSchmidt

43. Using a . suffix to specify a double ([double, DZeroEqual #225](https://forth-standard.org/standard/double/DZeroEqual#contribution-225))
2022-03-05 19:45:14 - flaagel

44. many tests appear to only assess interpretation semantics of test subjects ([testsuite #224](https://forth-standard.org/standard/testsuite#contribution-224))
2022-02-27 21:23:05 - LSchmidt

45. chasing for dangling words referred to ([testsuite #223](https://forth-standard.org/standard/testsuite#contribution-223))
2022-02-27 20:58:46 - LSchmidt

46. many tests appear to only assess interpretation semantics of test subjects ([testsuite #222](https://forth-standard.org/standard/testsuite#contribution-222))
2022-02-27 18:43:57 - LSchmidt

47. I suggest to complete the test ([core, POSTPONE #221](https://forth-standard.org/standard/core/POSTPONE#contribution-221))
2022-02-27 14:26:42 - LSchmidt


,------------------------------------------
| 2022-09-15 22:50:32  GeraldWodni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-902
`------------------------------------------
Yes maybe we will!


,------------------------------------------
| 2022-09-15 22:51:42  GeraldWodni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-903
`------------------------------------------
Yes maybe we will!


,------------------------------------------
| 2022-09-15 23:50:47  KrishnaMyneni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-904
`------------------------------------------


,------------------------------------------
| 2022-09-16 03:59:33  LeonWagner  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-905
`------------------------------------------
This is some test text. 

* I maintain multiple systems, but it only lets me select one. 
* Maybe lose those brackets after "in full in [ ]" as there is no way to fill in the version number.


,------------------------------------------
| 2022-09-16 06:55:49  AntonErtl  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-906
`------------------------------------------


,------------------------------------------
| 2022-09-16 07:01:27  AntonErtl  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-907
`------------------------------------------
Test of whether I can change a vote.

It would be great if all the programmer's votes were shown together and all the system votes were shown together, both sorted by vote, so that anybody interested (including the committee) has an easier time getting an overview of the results.  Showing a reply would then be unnecessary unless there was text in the reply.


,------------------------------------------
| 2022-09-16 09:30:54  GeraldWodni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-908
`------------------------------------------
This vote is only using the system's vote


,------------------------------------------
| 2022-09-16 09:32:51  GeraldWodni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-909
`------------------------------------------
This vote is only for programmers.
Also note on how to change votes: later votes by the same user overwrite earlier votes.


,------------------------------------------
| 2022-09-16 09:37:08  GeraldWodni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-910
`------------------------------------------
Only system vote test 2


,------------------------------------------
| 2022-09-16 09:37:52  GeraldWodni  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-911
`------------------------------------------
Only programmer Vote 2


,------------------------------------------
| 2022-09-16 12:59:35  AntonErtl  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-912
`------------------------------------------


,------------------------------------------
| 2022-09-16 15:28:08  BerndPaysan  replies:
| proposal - Test Proposal
| see: https://forth-standard.org/proposals/test-proposal#reply-913
`------------------------------------------
There needs to be a free form field for the system vote to actually say which release already implements it in full/in parts.


,------------------------------------------
| 2022-09-17 11:27:57  ruv  replies:
| proposal - Formatting: spaces in data type symbols
| see: https://forth-standard.org/proposals/formatting-spaces-in-data-type-symbols#reply-914
`------------------------------------------
Additional reasoning.

1. According to [data type symbols](https://forth-standard.org/standard/usage#table:datatypes), `u|n` is a single data type symbol.  It's confusing when this data type symbol contains spaces, when most data type symbols don't contain spaces. The same is true for `i*x`, etc.

2. A bar character `|` is also used to represent alternates for the whole tuple of stack-parameter data types; for example see [FIND](https://forth-standard.org/standard/core/FIND) `( c-addr -- c-addr 0 | xt 1 | xt -1 )`.  If we represent `u|n` as `u | n` (i.e., with spaces around the bar), it looks like an alternative for the whole tuple of data types, but that's wrong.


,------------------------------------------
| 2022-09-17 12:19:57  ruv  replies:
| proposal - Better wording for "Glossary notation"
| see: https://forth-standard.org/proposals/better-wording-for-glossary-notation-#reply-915
`------------------------------------------
## Author

Ruv

## Change Log

- 2022-09-16 Changes in wording by Leon Wagner
- 2021-09-24 Initial version

## Problem

The section [2.2.4 Glossary notation](https://forth-standard.org/standard/notation#notation:glossary) says:

> Each glossary entry specifies a Forth <u>word</u> and consists of two parts: <u>an</u> *index line* and <u>the</u> *semantic description* of the <u>definition</u>. 

The section [2.2.4.2 Glossary semantic description](https://forth-standard.org/standard/notation#subsubsection.2.2.4.2) says:

> The first paragraph of the semantic description contains a <u>stack notation</u> for each stack affected <u>by execution</u> of the word.

(underlined by me)

The quoted lines are correct for the cases of ordinary words. 

But for non-ordinary words they are incorrect:

  1. For non-ordinary words the "semantic description" part actually contains a different section for each defined (or explicitly undefined) semantics, with an optional label for the semantics and optional stack diagrams in each section (see [3.4.3 Semantics](https://forth-standard.org/standard/usage#usage:semantics)).

  2. The underlined part "by execution" is not correct for non-ordinary words (when the section describes a behavior other than execution semantics), since "execution of a word" means performing its execution semantics. But a section can describe compilation semantics, and the corresponding stack effects need not be equivalent to the effects of "execution of the word".

Other problems in wording:

  3. The underlined part "stack notation" is slightly confusing in its context. In the section [2 Terms, notation, and references](https://forth-standard.org/standard/notation), a notation means a convention. A semantic description in a glossary entry doesn't introduce a new notation, but uses the [stack notation](https://forth-standard.org/standard/notation#subsection.2.2.2) to describe the input and output stack parameters. Such description of the parameters is usually called "stack diagram".

  4. Different terms are used to refer to the same notion in the quoted lines. Use either "word" or "definition". 

## Solution

### Possible solutions per each item
#### Item 1
Possible variants
  - Say that a glossary entry contains the <i><u>behavior</u> description</i> part that contains one or more *semantic description* sections.
  - Say that a glossary entry contains one or more *semantic description* parts.

The former variant better reflects the idea that semantics describe a behavior in some conditions.
But, it seems, the latter variant is simpler without significant losses.

Note a label for semantics. Take into account the phrase "When a definition has only one specified behavior, <u>the label is omitted</u>" in [3.4.3.1 Execution semantics](https://forth-standard.org/standard/usage#subsubsection.3.4.3.1).

#### Item 2 

Use another wording "by performing the semantics" instead of "by execution of the word".

#### Item 3

Use the phrase "stack diagram" instead of "stack notation".

#### Item 4

Use the normative term "Forth definition".

### Deletions and insertions

> Each glossary entry specifies a Forth <del>word</del> <ins>definition</ins> and consists of <del>two parts: an</del> <ins>the</ins> <i>index line</i> and <del>the</del> <ins>one or more</ins> *semantic description<ins>s</ins>* <del>of</del> <ins>for</ins> the definition. 

> The first paragraph of <del>the</del> <ins>a</ins> semantic description contains <ins>an optional label for the semantics and</ins> a <del>stack notation</del> <ins>stack diagram</ins> for each stack affected by <del>execution of the word</del> <ins>performing these semantics</ins>.


## Proposal

### In the section [2.2.4 Glossary notation](https://forth-standard.org/standard/notation#notation:glossary)

Replace the phrase:

> Each glossary entry specifies a Forth word and consists of two parts: an *index line* and the *semantic description* of the definition. 

with the phrase:

> Each glossary entry specifies a Forth definition and consists of the *index line* and one or more *semantic descriptions* for the definition.

### In the section [2.2.4.2 Glossary semantic description](https://forth-standard.org/standard/notation#subsubsection.2.2.4.2)

Replace the phrase:

> The first paragraph of the semantic description contains a stack notation for each stack affected by execution of the word.

with the phrase:

> The first paragraph of a semantic description contains an optional label for the semantics and a stack diagram for each stack affected by performing these semantics.


,------------------------------------------
| 2022-09-17 12:37:45  ruv  replies:
| proposal - Better wording for "data field" term
| see: https://forth-standard.org/proposals/better-wording-for-data-field-term#reply-916
`------------------------------------------
## Author

Ruv

## Change Log

- 2022-09-16 At the TC meeting it was suggested to retain the reference to `CREATE`. This is acceptable to the author at the moment.
- 2021-09-14 Initial version

## Assumption

A basic term definition should not inalienably refer to a Forth word or a further section of the standard.
Such referring means that there is a lack of terms and the terminology should be better developed, or that just this definition is too poor.

## Problem

We have the following problems with the definition for the "data field" term:

1. It inalienably refers to the [`CREATE`](https://forth-standard.org/standard/core/CREATE) word (Forth-2012 contains only one such definition in the section [2.1. Definitions of terms](https://forth-standard.org/standard/notation#section.2.1)).

2. Formally, it conflicts with the term "data space". It says that a data field is a data space (i.e., it is a hyponym of). But the data space is a singleton, it unites all memory regions that may be accessed by a program. Hence a data field cannot be a hyponym of (or an instance of) the data space.

3. It connects a data field to a word defined via `CREATE`. But a sophisticated `SYNONYM` can keep this association for the newname (and the new xt) too. So, there is no need to restrict this association to `CREATE` in the term definition.

4. Formally, it conflicts with the term "word", since this term is used in a non normative meaning in this definition.

## Solution

Update the definition for the "data field" term with the following changes:

 - Remove the reference to `CREATE`.
 - Say that a data field is a data space _region_ (as it actually is).
 - Use the term "Forth definition" instead of the term "word" (optionally).

The insertions and deletions:

> **data field**: <del>The</del> <ins>A</ins> data space <ins>region</ins> associated with <del>a word defined via CREATE</del> <ins>a Forth definition</ins>.

or another variant:

> **data field**: <del>The</del> <ins>A</ins> data space <ins>region</ins> associated with a <ins>Forth</ins> word <del>defined via CREATE</del>.

or yet another one:

> **data field**: <del>The</del> <ins>A</ins> data space <ins>region</ins> associated with a <ins>Forth</ins> word defined via CREATE.


## Rationale

It doesn't matter whether each Forth definition is associated with a data field, or not (in some systems, each Forth definition is associated with a data field, but some of them have zero size). In any case, at the moment, the standard provides an API to associate a data field with, and to obtain it for, only a word that is created via `CREATE`. But this can be changed in the future, and then without touching the basic terms.

The expression "data space region" is obvious, so there is no need to define it. Also, it's already used in other places (e.g., see "region of data space" in the "variable" term).

Concerning "word" and "Forth definition". The latter one is more correct in this case. Although, this change is optional.  The definition for the "word" term can be independently updated too, since it's used in the sense "named Forth definition" in many places (see also another [comment](https://forth-standard.org/standard/tools#reply-296)).

Perhaps a better way is to formally associate a data field with an execution token, as it actually is (see [`>BODY`](https://forth-standard.org/standard/core/toBODY)).
But, since the expression "_name's_ data field" is used in some glossary entries, this approach requires an additional term: 
  - **data field of a Forth definition**: the data field associated with the execution token of the Forth definition.

This proposal can be updated accordingly, if needed.
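
To illustrate the association between an execution token and a data field mentioned above, here is a minimal sketch using only standard words; the name `point` and the stored values are hypothetical:

```forth
\ Sketch only: POINT and the values 10 and 20 are illustrative.
create point  2 cells allot        \ POINT's data field: two cells of data space
10 point !  20 point cell+ !       \ executing POINT pushes the data field address

' point >body point = .            \ prints -1: the xt leads to the same address
' point >body @ .                  \ prints 10
```

Here `>BODY` obtains the data field from the execution token alone, without referring to the name.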

## Proposal

Replace the definition for the "data field" term (in the [section 2.1](https://forth-standard.org/standard/notation#section.2.1)) by the following:

> A data space region associated with a Forth word defined by CREATE (6.1.1000)


,------------------------------------------
| 2022-09-17 13:42:27  ruv  replies:
| proposal - Same name token for different words
| see: https://forth-standard.org/proposals/same-name-token-for-different-words#reply-917
`------------------------------------------
This contribution was originally intended as comment. A formal proposal should be prepared to go ahead with this.


,------------------------------------------
| 2022-09-17 19:30:32  AntonErtl  replies:
| proposal - EMIT and non-ASCII values
| see: https://forth-standard.org/proposals/emit-and-non-ascii-values#reply-918
`------------------------------------------
## Author:

Anton Ertl

## Change Log:

* 2021-04-03 Original proposal
* 2022-09-15 Better wording (also includes systems with address units >8 bits)
* 2022-09-17 More explanation in the Rationale

## Problem:

The first ideas for the xchar wordset had EMIT behave like (current) XEMIT.  Then Stephen Pelc pointed out that EMIT is used in a number of programs for dealing with raw bytes, so we introduced XEMIT for dealing with extended characters.  But the wording and stack effect of EMIT suggests that EMIT should deal with (possibly extended) characters rather than raw bytes.  This is at odds with a number of implementations, and there is hardly any reason to keep both EMIT and XEMIT.

## Solution:

Define EMIT to deal with uninterpreted characters.  Concerning systems with characters=address units larger than bytes, I would like to hear back from them if they need any more specific definition than what is proposed.

I leave a likewise proposal for KEY to interested parties.

## Typical use: (Optional)

    $c3 emit $a4 emit \ outputs ä on a UTF-8 system

## Proposal:

Change the definition of EMIT into:

> EMIT ( char -- )
>
> Send char to the user output device without interpreting it.

Add a reference to "18.6.1.2488.10 XEMIT" to the "See:" section.

Add the following Rationale (as A.6.1.1320):

> EMIT supports low-level communication of arbitrary contents, not limited to specific encodings; it corresponds to TYPEing one char (i.e. `addr 1 type`).  In Unicode terminology, `EMIT` does not send a code point (there is `XEMIT` for that), but a code unit.  To print multi-char extended characters, the straightforward way is to use TYPE or XEMIT, but you can also print the individual chars with multiple EMITs.

Add the following reference implementation as E.6.1.1320:

## Reference implementation:

```
create emit-buf 1 allot

: emit ( char -- )
  emit-buf c! emit-buf 1 type ;
```

## Existing practice

Gforth, SwiftForth, and VFX implement EMIT as dealing with raw bytes (tested with the "typical use" above), but Peter Fälth's system implements EMIT as an alias of XEMIT, and iForth prints two funny characters.  It is unclear if there are any existing programs affected by the proposed change.

## Testing:

This cannot be tested from a standard program, because there is no way to inspect the output of EMIT.


,------------------------------------------
| 2022-09-17 20:16:28  AntonErtl  replies:
| proposal - Nestable Recognizer Sequences
| see: https://forth-standard.org/proposals/nestable-recognizer-sequences#reply-919
`------------------------------------------
> There would still have to be a difference between recognizers that search the dictionary (called by REC-NAME or similar) and other recognizers

Every wordlist is a recognizer (at least according to the proposal), but not every recognizer is a wordlist.  So yes, there is a difference.

Whether you replace just the current not-found portion or all the recognizers does not make a difference in the complexity of the interface, but the latter is more versatile.  If a programmer does not need to have a recognizer before the search-order recognizer, nobody is forcing them to put one there.  But if a programmer needs it, that capability is provided at no extra cost.

I am retracting this proposal; the [[160] minimalistic core API for recognizers](https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers?hideDiff#reply-892) adopts a part of it and at the current time I don't intend to standardize the rest.


,------------------------------------------
| 2022-09-17 20:43:30  AntonErtl  replies:
| proposal - Right-justified text output
| see: https://forth-standard.org/proposals/right-justified-text-output#reply-920
`------------------------------------------
The committee asked me to retire this proposal, because the proponent has apparently abandoned it.


,------------------------------------------
| 2022-09-17 22:19:36  PeterFalth  replies:
| proposal - EMIT and non-ASCII values
| see: https://forth-standard.org/proposals/emit-and-non-ascii-values#reply-921
`------------------------------------------
"without interpreting it" will not be true on a Windows system. Windows works internally with UTF-16, so EMIT needs to buffer and translate to UTF-16 before sending the sequence to the screen.
With the new "Windows Terminal" that will become the standard terminal for at least W11 this will change. The Windows Terminal has a VT-mode that makes it work like a UNIX terminal and with that UTF8 strings can be sent directly to the screen. Of course the translation is still there but hidden in the terminal.

If you really need to restrict EMIT, just write that its input must be within 0-255. Then you also need to specify what happens if someone sends, for example, $20ac to EMIT.
Will it abort, just emit the low byte or maybe write a Euro sign on the screen?

Peter Fälth


,------------------------------------------
| 2022-09-18 09:16:42  AntonErtl  replies:
| proposal - EMIT and non-ASCII values
| see: https://forth-standard.org/proposals/emit-and-non-ascii-values#reply-922
`------------------------------------------
The idea is that it works as shown in the reference implementation and as described in Section "Typical Use".  Several people who implement Forth on Windows were present in the committee meeting, and the idea of `EMIT` as dealing with raw bytes comes from one of them, so I expect that there is some way to implement the proposed EMIT on Windows.  It does not matter if Windows, when displaying on the screen, first waits until it has a Unicode code point, converts it into UTF-16, and then uses its UTF-16 subsystem for displaying that.  What matters is that binary data (including data that is not a valid code point according to the used encoding) sent through EMIT and redirected to somewhere is left unscathed by the Forth system and the OS.  For the code-point display we have `XEMIT`.

It seems to me that this intent was perceived correctly by you (so the specification expresses the intent), but you think that it cannot be implemented on Windows.

Concerning restricting EMIT to 0-255: Systems with characters (and address units) larger than bytes may want to EMIT these larger characters (or not; the implementors of such systems have to figure out what is most useful in their situation), and the present proposal does not want to eliminate this option.

As for dealing with non-char inputs: In Forth-2012 EMIT is specified as taking an x (a cell), but the behaviour is standard-specified only for specific values and implementation-defined for the others.  The common practice among the systems that emit raw bytes (Gforth, SwiftForth, VFX) is to ignore the upper bits.  So

    $1c3 emit $ffa4 emit

also prints "ä".  I lean toward specifying that, maybe like "Upper bits in x that do not fit in a char are ignored".
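
As an illustration of "upper bits are ignored", here is a minimal sketch that assumes 8-bit chars; `raw-emit`, `raw-emit-buf` and `low-bits` are hypothetical names, and the explicit mask only makes visible what `C!` already does when chars are smaller than cells:

```forth
\ Sketch only: assumes 8-bit chars; all three names are hypothetical.
create raw-emit-buf 1 chars allot

: low-bits ( x -- char )  $ff and ;   \ drop the bits that do not fit in a char

: raw-emit ( x -- )
  low-bits raw-emit-buf c!            \ store the masked value as one char
  raw-emit-buf 1 type ;               \ send it on, as in the reference implementation

$1c3 raw-emit $ffa4 raw-emit          \ sends the bytes $c3 $a4
```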


,------------------------------------------
| 2022-09-18 09:46:22  AntonErtl  replies:
| proposal - PLACE +PLACE
| see: https://forth-standard.org/proposals/place-place#reply-923
`------------------------------------------
Here's an implementation of `+PLACE` that has the following nice properties:

* it limits its writing to the 256-byte region starting at c-addr2
* it copies the stuff that was originally at c-addr1 u1 even in the case of overlap

Maybe specifying `+PLACE` to have these properties is a good idea.

````
: +place {: c-addr1 u1 c-addr2 -- :} \ gforth-obsolete plus-place
    c-addr2 count {: c-addr u2 :}
    u2 u1 + $ff min {: u :}
    c-addr1 c-addr u u2 /string move
    u c-addr2 c! ;
````


,------------------------------------------
| 2022-09-18 10:08:58  AntonErtl  replies:
| proposal - PLACE +PLACE
| see: https://forth-standard.org/proposals/place-place#reply-924
`------------------------------------------
On a more general note: While PLACE and maybe also +PLACE may be common practice, I think they are bad practice, for the following reasons:

* They are designed to create counted strings.  Counted strings may be seductive because you need to pass only one cell on the stack and store only one cell, but their length limitation means that they are not generally useful, so we need another set of words for dealing with longer strings, and we have it in the form of words that deal with c-addr u strings.  But once we have a set of words for general strings, do we really want another set of words for another string representation?  In the best case, these words will remain unused and just sow confusion.  In the worst case, they are used, and users then suffer from their limitations.  I suspect that PLACE was in more common use in 1994 than it is now, but the Forth-94 committee chose not to standardize it, probably for the reasons above.  We should not standardize it, either.

* These words have no way to check the length of the result buffer (admittedly, neither does MOVE), so they are a buffer overflow waiting to happen.  That goes doubly for +PLACE, where it's even harder to avoid a buffer overflow.  If you want to add such words, give them stack effects like ( c-addr u c-buf-addr u-buf -- ) and specify that they do not write outside [c-buf-addr,c-buf-addr+u-buf).  But then the "common practice" argument no longer holds.
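
A minimal sketch of what such a bounded word could look like, with the stack effect suggested above; `place-bounded` is a hypothetical name, silent truncation is just one possible policy, and lengths are assumed to be small enough for the signed `MIN`:

```forth
\ Sketch only: PLACE-BOUNDED is a hypothetical name; truncation is an assumed policy.
: place-bounded ( c-addr u c-buf-addr u-buf -- )
  \ Store c-addr u as a counted string, writing only inside
  \ [c-buf-addr, c-buf-addr+u-buf).
  dup 0= if 2drop 2drop exit then     \ no room even for the count char
  1- $ff min                          ( c-addr u c-buf-addr umax )
  rot min >r                          ( c-addr c-buf-addr ) ( R: u' )
  tuck char+ r@ move                  \ copy the text first, in case of overlap
  r> swap c! ;                        \ then store the (possibly reduced) count

create small-buf 4 chars allot
s" hello world" small-buf 4 place-bounded
small-buf count type                  \ prints hel
```

The count char takes one of the `u-buf` chars, so a 4-char buffer holds at most 3 chars of text.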

Concerning common practice: Gforth contains 3 uses of PLACE and 0 uses of +PLACE, compared to 45 uses of MOVE.


,------------------------------------------
| 2022-09-21 17:12:25  AntonErtl  replies:
| proposal - Pronounciations
| see: https://forth-standard.org/proposals/pronounciations#reply-925
`------------------------------------------
Accepted at the 2022 meeting 10Y:0N:1A


,------------------------------------------
| 2022-09-21 17:13:45  AntonErtl  replies:
| proposal - Specify that 0 THROW pops the 0
| see: https://forth-standard.org/proposals/specify-that-0-throw-pops-the-0#reply-926
`------------------------------------------
Accepted at the 2022 meeting 10Y:0N:1A


,------------------------------------------
| 2022-10-06 18:01:09  flaagel  replies:
| testcase - Bogus Test Case for SAVE-INPUT
| see: https://forth-standard.org/standard/core/SAVE-INPUT#reply-927
`------------------------------------------
Hello ruv,

I do apologize for the noise. Back when I published this, I was focusing on implementing EVALUATE without having a clearly defined SAVE-INPUT and RESTORE-INPUT process. The smoke has cleared up from my side since then and I have learned my lesson. I should _always_ compare behaviour to what GNU Forth does, even though I will never support all the features of that beast.

Sorry about that.

        Francois


,------------------------------------------
| 2022-10-06 18:03:51  flaagel  replies:
| testcase - Incorrect Test Pattern
| see: https://forth-standard.org/standard/file/SOURCE-ID#reply-928
`------------------------------------------
Apologies.


,------------------------------------------
| 2022-11-23 12:47:12  AntonErtl  replies:
| requestClarification - Seemingly contradictory ambiguous condition?
| see: https://forth-standard.org/standard/doc#reply-929
`------------------------------------------
Confirmed.  My guess is that the ambiguous conditions in DEFER! and DEFER@ were intended to be in this place.
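
For reference, a minimal sketch of the kind of access those ambiguous conditions cover; the names `greet`, `hello` and `v` are hypothetical:

```forth
\ Sketch only: GREET, HELLO and V are illustrative names.
defer greet                      \ GREET is defined by DEFER
: hello ( -- ) ." hello" ;
' hello is greet                 \ same effect as: ' hello ' greet defer!

' greet defer@                   \ well defined: leaves the xt of HELLO
' hello = .                      \ prints -1

variable v
\ ' v defer@                     \ ambiguous condition: V was not defined by DEFER
```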


,------------------------------------------
| 2022-11-23 12:53:38  AntonErtl  replies:
| comment - Describe Compile time and Run time behavior
| see: https://forth-standard.org/standard/core/CHAR#reply-930
`------------------------------------------
CHAR has default compilation semantics, and that may be confusing.  Forth-94 supports [CHAR] for compiling a literal character into a definition.  In Forth-2012 there is also the option to use the syntax `'A'`, which pushes the ASCII code of A at interpretation/run-time.
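
A minimal sketch of the three options; the word names `star`, `star2` and `next-char` are hypothetical:

```forth
\ Sketch only: STAR, STAR2 and NEXT-CHAR are illustrative names.
char A .                         \ interpreting: CHAR parses "A" now; prints 65

: star  ( -- ) [char] * emit ;   \ [CHAR] parses "*" at compile time and compiles 42
: star2 ( -- ) '*' emit ;        \ Forth-2012 character literal, same effect

\ CHAR inside a definition is simply compiled (default compilation semantics),
\ so it parses from the input stream when the definition runs:
: next-char ( "name" -- char ) char ;
next-char Z .                    \ prints 90
```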