,---------------.
| Contributions |
`---------------´


,------------------------------------------
| 2020-09-16 15:33:45  JennyBrien  wrote:
| requestClarification - Extending MARKER
| see: https://forth-standard.org/standard/core/MARKER#contribution-162
`------------------------------------------
Many years ago, when I was modifying F-83 to be ANS-compliant, I had to temporarily patch several parts of the core. I used:
``` 
   CHANGED \  n addr -- ;  add addr and old value to a linked list and then store n at addr
```
I then modified `FORGET` to run down the list and restore any values changed after that point. 
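
A minimal sketch of such a mechanism, under stated assumptions: the names `change-list` and `restore-changes` are hypothetical, list nodes are laid down in dictionary space at increasing addresses, and "after that point" is taken to mean "at or above a given dictionary address". It is not the original F-83 code.

```forth
variable change-list   0 change-list !   \ head of the list, newest node first

: changed ( n addr -- )
  here >r
  change-list @ ,        \ node cell 0: link to the previous node
  dup ,                  \ node cell 1: the patched address
  dup @ ,                \ node cell 2: the old value at that address
  r> change-list !
  ! ;                    \ finally store n at addr

: restore-changes ( limit -- )
  \ Undo every change whose node lies at or above the address limit.
  begin
    change-list @ dup              ( limit node node )
  while                            ( limit node )
    2dup u> if  2drop exit  then   \ node is older than limit: stop here
    dup cell+ 2@ !                 \ store the old value back at its address
    @ change-list !                \ unlink the node
  repeat
  2drop ;

\ Usage: patch a cell twice, then roll back to the landmark.
variable x   1 x !
here constant landmark
2 x changed   3 x changed
landmark restore-changes
x @ .   \ prints 1
```

A `MARKER`-like word could record `here` when the marker is created and call `restore-changes` with that address when the marker is executed.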

Is there any common practice on how to add similar further "landmark information" that MARKER should restore?


,---------.
| Replies |
`---------´


,------------------------------------------
| 2020-09-08 08:36:42  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-514
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as with the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make the system's `forth-recognizer` a deferred word to allow plugging in new recognizer sequences

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-nt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] rectype-null  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : word-translator ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-word ( addr u -- rectype )
      here place  here find dup IF  [']  word-translator
      ELSE  drop ['] notfound  THEN ;

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translator | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

    SOME-TRANSLATOR ( i*x -- j*x )

A translator depends on `STATE` to translate the given arguments:

* 0 for interpretation
* -1 for compilation
* -2 for POSTPONE

`i*x` is the additional information provided by the recognizer.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZER** ( addr len -- i*x translator | NOTFOUND-xt ) RECOGNIZER

This is a deferred word.  It takes a string and tries to recognize it, returning the recognized recognizer type and additional information if successful, or `RECTYPE-NULL` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW` if the exception wordset is available.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system, that handles only words and single numbers without base prefix:

    Defer forth-recognizer ( addr u -- i*x translator / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognizer execute
      REPEAT ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( state -- ) -13 throw ;
    : nt-translator ( nt -- )
      case  state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          -2 of  name>compile swap lit, compile,  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : num-translator ( n -- )
      case  state @
          -1 of   lit,  endof
          -2 of   lit, postpone lit,  endof
      endcase ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] nt-translator  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] num-translator  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognizer ( addr u -- nt rectype-nt / n rectype-num / rectype-null )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognizer is forth-recognizer

The different actions during interpret/compile/postpone can be factored out easily, and used by a common dispatcher:

    : translator: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;

## Testing


,------------------------------------------
| 2020-09-08 08:39:23  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-515
`------------------------------------------
## Author:

Bernd Paysan

## Change Log:

* 2020-09-06 initial version
* 2020-09-08 taking ruv's approach and vocabulary at translators
* 2020-09-08 replace the remaining rectypes with translators

## Problem:

The current recognizer proposal has received a number of criticisms.  One is that its API is too big.  So this proposal tries to create a very minimalistic API for a core recognizer, and allows fancier features to be implemented as extensions.  The problem this proposal tries to solve is the same as with the original recognizer proposal; this proposal is therefore not a full proposal, but sketches some changes to the original proposal.

## Solution:

Define the essentials of the recognizer in a RECOGNIZER word set, and allow building upon that.  Common extensions go to the RECOGNIZER EXT wordset.

Important changes to the original proposal:

* Make the recognizer types executable to dispatch the methods (interpret, compile, postpone) themselves
* Make the recognizer sequence executable with the same effect as a recognizer
* Make the system's `forth-recognizer` a deferred word to allow plugging in new recognizer sequences

This replaces one poor man's method dispatch with another poor man's method dispatch, which is maybe less daunting and more flexible.

The core principle is still that the recognizer is not aware of state, and the returned translator is.  If you have for some reason legacy code that looks like

    : rec-nt ( addr u -- translator )
      here place  here find dup IF
          0< state @ and  IF  compile,  ELSE  execute  THEN  ['] drop
      ELSE  drop ['] notfound  THEN ;

then you should factor the part starting with state @ out and return it as translator:

    : word-translator ( xt flag -- )
      0< state @ and  IF  compile,  ELSE  execute  THEN ;
    : rec-word ( addr u -- ... translator )
      here place  here find dup IF  [']  word-translator
      ELSE  drop ['] notfound  THEN ;

## Typical use

TBD

## Proposal:

XY. The optional Recognizer Wordset

A recognizer takes the string of a lexeme and returns a translator xt and additional data on the stack (no additional data for `NOTFOUND`):

    REC-SOMETYPE ( addr len -- i*x translator | NOTFOUND )

# XY.3 Additional usage requirements

## XY.3.1 Translator

**translator:** subtype of xt, and executes with the following stack effect:

    SOME-TRANSLATOR ( i*x -- j*x )

A translator depends on `STATE` to translate the given arguments:

* 0 for interpretation
* -1 for compilation
* -2 for POSTPONE

`i*x` is the additional information provided by the recognizer.

# XY.6 Glossary

## XY.6.1 Recognizer Words

**FORTH-RECOGNIZER** ( addr len -- i*x translator | NOTFOUND-xt ) RECOGNIZER

This is a deferred word.  It takes a string and tries to recognize it, returning the recognized recognizer type and additional information if successful, or `NOTFOUND` if not.

**NOTFOUND** ( -- ) RECOGNIZER

Performs `-13 THROW` if the exception wordset is available.

## Reference implementation:

This is a minimalistic core implementation for a recognizer-enabled system, that handles only words and single numbers without base prefix:

    Defer forth-recognizer ( addr u -- i*x translator / notfound )
    : interpret ( i*x -- j*x )
      BEGIN
          ?stack parse-name dup  WHILE
          forth-recognizer execute
      REPEAT ;

    : lit,  ( n -- )  postpone literal ;
    : notfound ( state -- ) -13 throw ;
    : nt-translator ( nt -- )
      case  state @
          0  of  name>interpret execute  endof
          -1 of  name>compile execute  endof
          -2 of  name>compile swap lit, compile,  endof
          nip \ do nothing if state is unknown; possible error handling goes here
      endcase ;
    : num-translator ( n -- )
      case  state @
          -1 of   lit,  endof
          -2 of   lit, postpone lit,  endof
      endcase ;

    : rec-nt ( addr u -- nt nt-translator / notfound )
      forth-wordlist find-name-in dup IF  ['] nt-translator  ELSE  drop ['] notfound  THEN ;
    : rec-num ( addr u -- n num-translator / notfound )
      0. 2swap >number 0= IF  2drop ['] num-translator  ELSE  2drop drop ['] notfound  THEN ;

    : minimal-recognizer ( addr u -- nt nt-translator / n num-translator / notfound )
      2>r 2r@ rec-nt dup ['] notfound = IF  drop 2r@ rec-num  THEN  2rdrop ;

    ' minimal-recognizer is forth-recognizer

The different actions during interpret/compile/postpone can be factored out easily, and used by a common dispatcher:

    : translator: ( xt-interpret xt-compile xt-postpone "name" -- )
      create , , ,
      does> state @ 2 + cells + @ execute ;
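
As an illustration of how this dispatcher might be used, here is a minimal sketch that rebuilds the number translator from three separate actions. The names `translate-num-demo`, `c-num` and `five` are hypothetical, and the sketch assumes that `STATE` holds exactly -1 while compiling (the standard only guarantees a non-zero value) and -2 while postponing, as this proposal describes.

```forth
: lit, ( n -- )  postpone literal ;

: translator: ( xt-interpret xt-compile xt-postpone "name" -- )
  create , , ,           \ body: xt-postpone, xt-compile, xt-interpret
  does> state @ 2 + cells + @ execute ;

:noname ( n -- n ) ;                      \ interpreting: leave n on the stack
:noname ( n -- ) lit, ;                   \ compiling: compile n as a literal
:noname ( n -- ) lit, postpone lit, ;     \ postponing: compile code that compiles n
translator: translate-num-demo

\ Interpretation state: the number simply stays on the stack.
5 translate-num-demo .

\ Compilation state: an immediate wrapper runs the translator while STATE is -1.
: c-num ( n -- )  translate-num-demo ; immediate
: five ( -- 5 )  [ 5 ] c-num ;
five .
```

Because `, , ,` stores the topmost xt first, the table is indexed by `STATE + 2`: 0 selects the postpone action, 1 the compile action, and 2 the interpret action.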

## Testing


,------------------------------------------
| 2020-09-08 11:33:30  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-516
`------------------------------------------
Downside of using `STATE` right in the dispatcher: `POSTPONE` becomes more difficult. Instead of

    : postpone ( "name" -- ) parse-name forth-recognizer -2 swap execute ; immediate

it is more convoluted

    : postpone ( "name" -- )
      parse-name forth-recognizer
      state @ >r -2 state !  catch  r> state !  throw ; immediate

How to detect `[[` at the end of a postpone sequence is also not so trivial.


,------------------------------------------
| 2020-09-08 14:48:20  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-517
`------------------------------------------
> Downside of using STATE right in the dispatcher: POSTPONE becomes more difficult.

It's OK. Actually, we distribute complexity among various parts. When we make one thing less complex, we make another thing more complex. But because different things occur with different frequency (in systems, libraries, programs), the overall complexity can end up lower or higher.

This approach also makes some things more complex, but I believe the overall complexity decreases.

Concerning `POSTPONE`. I think, some useful parts should be factored out.

Also, we don't need to catch exception — usually, it's a stop error, and the state is ambiguous in any case. QUIT resets all the internal states. Concerning programs — we need a standard way to reset the internal states of the Forth text interpreter, regardless of Recognizers proposal.

In my "lexeme resolvers" implementation I use the concept of a postponing level that can be 0, 1, or 2, and introduce words to increment and to decrement this level.
So, `POSTPONE` is [defined](https://github.com/ruv/forth-design-exp/blob/master/lexeme-translator/core.example.fth) as the following:
```
: postpone  ( " name" --      )   parse-name inc-state translate-lexeme dec-state ( flag ) ?nf ; immediate
```

Where `translate-lexeme` is [defined](https://github.com/ruv/forth-design-exp/blob/master/lexeme-translator/resolver.api.L1.fth) as the following:
```
: perceive-lexeme ( c-addr u -- k*x xt-tt | c-addr u 0 )
  perceptor dup if execute then
;
: translate-lexeme ( i*x c-addr u -- j*x true | c-addr u 0 )
  perceive-lexeme dup if execute true then
;
```
(Note that, in contrast to this proposal, resolvers return `( c-addr u 0 )` on failure.)

> How to detect `[[` at the end of a postpone sequence is also not so trivial.

An appropriate approach is that the word `]]` is a parsing word.
```
: ]] ( -- )
  inc-state begin
    next-lexeme 2dup s" [[" equals 0= while
    translate-lexeme ?nf
  repeat 2drop dec-state
; immediate
```
So we don't have any problem to detect `[[` at the end.


An advantage of the postponing level concept is that the following code works as expected:
```
: foo [  ]] 123 . [[  ]  ;   foo \ prints 123
```

-----

In the message news:[rdcur5$ga4$1@dont-email.me](https://groups.google.com/forum/message/raw?msg=comp.lang.forth/yuNZEvq8EqA/pLOQPJiuAgAJ) (the full message: news:[rdcn35$sd2$1@dont-email.me](https://groups.google.com/forum/message/raw?msg=comp.lang.forth/yuNZEvq8EqA/LJcFnWCnAgAJ)) I showed another approach, when postponing action is not required at all (i.e., -2 state in this proposal).


,------------------------------------------
| 2020-09-08 16:45:16  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-518
`------------------------------------------
> translator: subtype of xt, and executes with the following stack effect:<br/>
> `SOME-TRANSLATOR ( i*x -- j*x )`

It's correct in the general case, but it makes little sense, since any definition meets this stack effect.

So I think we should distinguish the parameters of a translator itself from the effect of translating the code that is passed to the translator. Possible variants:

```
\ We can define 'token' data type
TRANSLATE-SOMETOKEN ( i*x token -- j*x )

\ Some hybrid variant
TRANSLATE-SOMETOKEN  ( i*x token{k*x} -- j*x )

\ Only low level data types
TRANSLATE-SOMETOKEN  ( i*x k*x -- j*x ) 

```
(NB: I use the conventional naming {verb}-{noun} for such words.)

It should also be noted that these _x_ may be distributed across all the stacks: the data stack, the floating-point stack, the control-flow stack (except for the token k*x, which cannot be on the control-flow stack).


,------------------------------------------
| 2020-09-08 20:14:11  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-519
`------------------------------------------
Indeed, `TRANSLATE-SOMETHING` sounds better than `SOMETHING-TRANSLATOR`.

`FORTH-RECOGNIZER` is ok, because it's followed by `EXECUTE`, so this is a noun.


,------------------------------------------
| 2020-09-09 08:13:21  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-520
`------------------------------------------
### "FORTH-RECOGNIZER" name

I thought about the name `FORTH-RECOGNIZER`.
It makes a strong impression that this word is similar to `FORTH-WORDLIST ( -- wid )`.  The problem is that it isn't.

`FORTH-WORDLIST` is a constant (it always returns the same value) that identifies one and the same word list among all the word lists.   This word list can be included in the search order, and it can be absent from the search order.

By analogy, `FORTH-RECOGNIZER` should be a constant that identifies one and the same recognizer among all the recognizers. This recognizer can be included in the recognizer that is used by the Forth text interpreter, and it can be absent from it.  (In accordance with the concept that a sequence of recognizers is also a recognizer.)

All of this would have to hold for the naming to be consistent.  But actually it does not. This means that the name breaks consistency and is inappropriate for the proposed word.

`FORTH-RECOGNIZER ( -- xt )` can be a word that returns xt of the system's recognizer that is used by the Forth text interpreter **by default**  (i.e. initially).


> FORTH-RECOGNIZER is ok, because it's followed by EXECUTE, so this is a noun.

Also, it gives a strong impression that it returns a recognizer. But that is wrong. Also, its result is analyzed **much more often** than it is followed by EXECUTE.

### Basic methods

By no means, we need 
1. a method that tells the Forth text interpreter to use a given recognizer,
2. a method that returns the recognizer that is currently used by the Forth text interpreter,
3. a method that performs the recognizer that is currently used by the Forth text interpreter.


A single deferred word (a vector) X can solve it:
1. set: `IS X`
2. get: `ACTION-OF X`
3. perform: `X`


But I insist that this approach limits implementations too much.  A Forth system may want to perform internal actions when the recognizer used by the Forth text interpreter is switched. But it cannot do so if this recognizer is switched via the `IS X` method.  For that reason, distinct getter and setter words are usually provided in the Standard (except for the very old `BASE` and `>IN`, for backward compatibility).
Yes, perhaps Gforth can attach any additional internal actions for `IS X` phrase. But we shouldn't complicate all Forth system implementations.

A possible implementation via deferred word and distinct getter and setter words:
```
defer perceive ( c-addr u -- k*x tt )
: perceptor ( -- xt ) action-of perceive ;
: set-perceptor ( xt -- ) is perceive ;
```
Perhaps, the more specific names are better (?):
```
defer perceive-lexeme ( c-addr u -- k*x tt )
: lexeme-perceptor ( -- xt ) action-of perceive-lexeme ;
: set-lexeme-perceptor ( xt -- ) is perceive-lexeme ;
```
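
To illustrate the kind of internal action a distinct setter makes possible, here is a minimal sketch in which the setter remembers the previous recognizer so that the switch can be undone. The names `#undo`, `undo-stack`, `undo-depth` and `previous-perceptor` are hypothetical, as is the fixed depth of 8; `rec-a` and `rec-b` are toy recognizers for demonstration only.

```forth
defer perceive ( c-addr u -- k*x tt )

8 constant #undo                         \ assumed maximum nesting depth
create undo-stack  #undo cells allot
variable undo-depth   0 undo-depth !

: perceptor ( -- xt )  action-of perceive ;

: set-perceptor ( xt -- )
  undo-depth @ #undo u< 0= abort" perceptor undo stack full"
  perceptor  undo-stack undo-depth @ cells + !   \ remember the old recognizer
  1 undo-depth +!
  is perceive ;

: previous-perceptor ( -- )
  undo-depth @ 0= abort" no previous perceptor"
  -1 undo-depth +!
  undo-stack undo-depth @ cells + @  is perceive ;

\ Usage with two toy recognizers that ignore the string and return a tag.
: rec-a ( c-addr u -- 1 )  2drop 1 ;
: rec-b ( c-addr u -- 2 )  2drop 2 ;
' rec-a set-perceptor
' rec-b set-perceptor
s" x" perceive .      \ prints 2
previous-perceptor
s" x" perceive .      \ prints 1
```

If programs switched the recognizer directly with `IS perceive`, the bookkeeping in `set-perceptor` would be bypassed, which is the limitation described above.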


,------------------------------------------
| 2020-09-09 08:25:37  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-521
`------------------------------------------
Correction: please read "**By anyway, we need**" instead of "_By no means, we need_".


,------------------------------------------
| 2020-09-10 12:50:53  KrishnaMyneni  replies:
| proposal - OPTIONAL IEEE 754 BINARY FLOATING-POINT WORD SET
| see: https://forth-standard.org/proposals/optional-ieee-754-binary-floating-point-word-set#reply-522
`------------------------------------------
@ruv, that's a good point. Originally, I thought it might make writing the implementation more consistent between 32 and 64 bit Forths. However, from the user point of view it is easier to deal with one double integer rather than two singles. I will rewrite the proposal to specify that the bits of udfraction specify the binary fraction for the floating point datum. Of course the MAKE-IEEE-DFLOAT word will still check for illegal values. The order of the inputs will also be changed.


,------------------------------------------
| 2020-09-10 15:43:15  ruv  replies:
| proposal - OPTIONAL IEEE 754 BINARY FLOATING-POINT WORD SET
| see: https://forth-standard.org/proposals/optional-ieee-754-binary-floating-point-word-set#reply-523
`------------------------------------------
> `MAKE-IEEE-DFLOAT ( F: -- r ) ( signbit udfraction uexp -- error )`

1. _uexp_ should be _n-exp_  (i.e. a signed number).
2. Is there any benefit in having _signbit ud-mantissa_ instead of _d-mantissa_? (I.e., taking the sign from the mantissa.)
3. What is the radix for the exponent? 2 or 10? (it should be mentioned).
4. Yes, it's better if error is a throw code.
5. What is the value of _r_ in the case of error? What is better: 0 or NaN?

6. Does it make sense to use this function in a recognizer for floating-point numbers (if the radix of the exponent is 2)?

> `HEX 0 54442D18 921FB 1 MAKE-IEEE-DFLOAT fconstant pi`

How can we get 3.14 from these numbers?


,------------------------------------------
| 2020-09-10 21:36:12  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-524
`------------------------------------------
`DEFER` is a core word now, so using `DEFER` for such a thing is ok. We don't need a special getter and setter for everything.

The implication that `FORTH-RECOGNIZER` returns a recognizer (and does not, it executes one) is a valid point. A better name is needed. At the moment it is a `VALUE` and does return a recognizer. Now, it is a deferred word, and does recognize strings.  We should keep it with Anton's unification: a sequence of recognizers can be combined to one recognizer.  Just because it's now recognizing more different things, it's still a recognizer.  No need to find another synonym.  Takes string, returns data+translator token → is a recognizer.

Maybe `RECOGNIZE-FORTH` is the corresponding verb.  It takes a string and recognizes it if this is valid FORTH.


,------------------------------------------
| 2020-09-11 03:46:35  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-525
`------------------------------------------
> `DEFER` is a core word now, so using `DEFER` for such a thing is ok.

Actually, `DEFER`, as well as `TO`, is a **Core extension** word, so it's optional. But it's another argument.

Back to my first argument, what do you suggest if a system needs to perform internal actions on switching the recognizer that is currently used by the Forth text interpreter?

You may ask whether I have an example of such a requirement. Yes, I do. I want to provide a method to undo such switching in my system. It's similar to the effect of the "PREVIOUS" word for the search order.  Perhaps you can suggest some solution with the deferred word?


> Anton's unification: a sequence of recognizers can be combined to one recognizer.

Yes. I too [said](https://groups.google.com/forum/message/raw?msg=comp.lang.forth/yuNZEvq8EqA/ODl8a_ZUAQAJ) that any sequence of recognizers _seq-x_ (from API v4) can be represented as a single recognizer `: recognize-x seq-x recognize ;`. So sequences are superfluous in the basic API; a Forth system doesn't need to know whether it is a sequence or not.


> Maybe RECOGNIZE-FORTH is the corresponding verb. It takes a string and recognizes it if this is valid FORTH.

It's better.  But it recognizes not valid FORTH, but anything that the Forth text interpreter can currently recognize (and only that).


Conceptually, this word isn't just a recognizer. There is a single special system slot for the recognizer that is used by the Forth text interpreter. We can put any recognizer into this slot. We can also perform the recognizer that is placed in this slot.  So this word **performs** the recognizer from this slot. I am inclined to call this slot "perceptor". The word that performs the recognizer from this slot then becomes "perceive".

All recognizer names have the pattern _RECOGNIZE-*_. 
The idea is to not put this special word on a par with all other recognizers. 
For that, it's better to find a name that is distinct from the _RECOGNIZE-SOMETHING_ pattern.  What do you think?


,------------------------------------------
| 2020-09-11 04:10:31  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-526
`------------------------------------------
> Actually, DEFER, as well as TO, is a Core extension word, so it's optional. But it's another argument.

This argument is that a Forth system can be implemented as a minimal kernel and additional libraries. And `DEFER`, `IS`, `ACTION-OF` can be available via a library. But when we put a deferred word into this API, we force a system's author to put `DEFER`, `IS`, `ACTION-OF` into the kernel too. But actually they aren't required in the kernel. It would be too restrictive a limitation on implementations.


,------------------------------------------
| 2020-09-11 12:23:36  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-527
`------------------------------------------
### Locate

`locate` cannot work for lexemes that can be recognized (translated) according to this proposal.


,------------------------------------------
| 2020-09-11 17:45:12  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-528
`------------------------------------------
The last [comment](#reply-527) was intended for the [proposal of AndrewHaley](https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-493), and it was mistakenly placed here.


,------------------------------------------
| 2020-09-11 21:55:54  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-529
`------------------------------------------
The recognizer will be an option, as well.  At the moment, `FORTH-RECOGNIZER` is proposed to be a value.  That's also a CORE EXT word (as is `TO`).

A minimalistic system that wants to implement recognizers needs `FORTH-RECOGNIZER` to be a deferred word.  I.e. it needs code for `DODEFER`.  It can load the rest of the deferred word stuff later as extension.


,------------------------------------------
| 2020-09-12 07:46:25  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-530
`------------------------------------------
Certainly, recognizers are an option. I didn't mean that some required part requires an optional part.  I mean that one optional part requires another complex optional part without any good and fair ground.

Yes, a minimalistic system that wants to provide a deferred word needs only code for `DODEFER`. But it still makes bootstrapping of this system more complex. Hence, when we put a deferred word into the API, we make things more complex for some implementations. But we don't even have a rationale for that.

Also, with a deferred word we still don't have a solution if a system needs to perform internal actions on switching the recognizer that is currently used by the Forth text interpreter.


,------------------------------------------
| 2020-09-12 13:12:35  ruv  replies:
| proposal - Nestable Recognizer Sequences
| see: https://forth-standard.org/proposals/nestable-recognizer-sequences#reply-531
`------------------------------------------
### Binary constructor
```
: two-recognizers ( xt1 xt2 "name" -- )
  create , ,
does> ( c-addr u a-addr-body )
  dup >r @ execute dup rectype-null <> if
    r> drop exit then
  drop  \ discard rectype-null; c-addr u are still on the stack
  r> cell+ @ execute ;
```

This constructor expects that a recognizer doesn't consume `( c-addr u )` on rejection.

Otherwise (if a recognizer consumes `( c-addr u)` in any case) the definition will be a bit more complex:
```
: two-recognizers ( xt1 xt2 "name" -- )
    create , ,
  does> ( c-addr u  a-addr-body )
    dup >r -rot 2dup 2>r rot
    @ execute dup rectype-null <> if
      rdrop rdrop rdrop exit
    then drop
    2r> r> cell+ @ execute
;
```

Nevertheless, I'm inclined to agree that if a recognizer consumes `( c-addr u )` in any case, it seemingly reduces the total lexical size of the overall code.

> Whether to pass the first recognizer on top or bottom is also unclear

It is clearer if they are passed left to right, i.e., we place them on the stack in the same order in which they should be executed: the first placed is executed first, the second placed is executed second (if any), the last placed (that is, the topmost) is executed last.

This situation is similar to the order of local variables (in declaration): direct mapping is more clear.
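
A minimal sketch of a binary constructor that follows this left-to-right convention, assuming that a recognizer always consumes `( c-addr u )`. Here `rectype-null` is defined as a stand-in constant for illustration only, and `rec-one`, `rec-two`, `rec-never` are toy recognizers; none of these definitions come from the proposal.

```forth
0 constant rectype-null    \ assumption: placeholder value for this sketch

: two-recognizers ( xt-first xt-second "name" -- )
  create swap , ,          \ body: xt-first, then xt-second
  does> ( c-addr u body -- i*x rectype )
    dup cell+ @ >r         \ R: xt-second
    @                      ( c-addr u xt-first )
    -rot 2dup 2>r rot      \ keep a copy of the string; R: xt-second c-addr u
    execute                ( i*x rectype )
    dup rectype-null <> if
      2r> 2drop  r> drop  exit   \ first recognizer accepted the lexeme
    then
    drop  2r> r> execute ;       \ otherwise try the second one

\ Toy recognizers returning a tag instead of a real rectype.
: rec-one   ( c-addr u -- 1 )  2drop 1 ;
: rec-two   ( c-addr u -- 2 )  2drop 2 ;
: rec-never ( c-addr u -- rectype-null )  2drop rectype-null ;

' rec-one   ' rec-two two-recognizers rec-one-then-two
' rec-never ' rec-two two-recognizers rec-never-then-two

s" x" rec-one-then-two .     \ prints 1: the first-placed recognizer ran first
s" x" rec-never-then-two .   \ prints 2: the second ran after the first rejected
```

The `swap` before `, ,` is what maps stack order to execution order; without it the topmost xt would be stored first and therefore tried first.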


,------------------------------------------
| 2020-09-12 22:51:48  ruv  replies:
| proposal - Traverse-wordlist does not find unnamed/unfinished definitions
| see: https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions#reply-532
`------------------------------------------
I would suggest avoiding the pleonasm "named word" in "for every <u>named word</u> that can be found", since an unnamed definition cannot be found. I.e., if a definition can be found, then it certainly has a name.

A possible variant of this part:<br/>
> "Execute xt once for every word that can be found,"

A possible variant that unites both corrections into a single one:<br/>
> "Execute _xt_ once for every word that can be found in the word list _wid_, and for every word whose name matches the name of a found word but placed earlier in this word list,"

The phrase "<u>same</u> name" is inappropriate since it doesn't take into account possible case insensitivity. However, names matching is described in 3.4.2 [Finding definition names](/standard/usage#usage:find).


Also, the following typo can be corrected:

"words with the same name <u>are called</u> in the order newest-to-oldest (possibly with other words in between)"
<br/>→<br/>
"words with the <b>matched names are visited</b> in the order newest-to-oldest (possibly with other words in between)"


,------------------------------------------
| 2020-09-13 06:49:49  AntonErtl  replies:
| proposal - Traverse-wordlist does not find unnamed/unfinished definitions
| see: https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions#reply-533
`------------------------------------------
The proposal was voted on and accepted 10Y/0N/1A.  The vote was closed on 2020-09-03.  If you think that the voted-on version is unclear enough to be improved, you need to make a new proposal.

I think it is clear enough, though.  "Named word" may be a pleonasm, but it is clear.  The way that "same name" is used in the voted-on version makes it clear that all matching names are considered to be the same.

Concerning "are called": Yes, "are visited" is intended, so one could make another proposal for fixing that.  But nobody seems to have been confused by "are called" yet.


,------------------------------------------
| 2020-09-13 16:19:15  AntonErtl  replies:
| proposal - Traverse-wordlist does not find unnamed/unfinished definitions
| see: https://forth-standard.org/proposals/traverse-wordlist-does-not-find-unnamed-unfinished-definitions#reply-534
`------------------------------------------
If someone proposes another revision, one could write:

> When a word becomes findable, it also becomes traversable.  The word then stays traversable until it is deleted.

and then define the rest in terms of "traversable", in particular:

> Execute xt once for every traversable word in the wordlist wid,
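
For illustration only (this is not part of the proposed wording): a minimal sketch of how a program visits every traversable word with the standard `TRAVERSE-WORDLIST`. The names `count-word` and `wordlist-size` are hypothetical and chosen just for this example.

```forth
\ Sketch: count the words that TRAVERSE-WORDLIST visits in a word list.
\ The callback receives one name token per visited word and returns
\ TRUE to continue the traversal.
: count-word ( n nt -- n+1 true )  drop 1+ true ;

\ Start the count at 0, then hand the callback and the wid to TRAVERSE-WORDLIST.
: wordlist-size ( wid -- n )  0 ['] count-word rot traverse-wordlist ;

forth-wordlist wordlist-size .   \ prints the number of visited words
```

The count depends on the system, so no particular output is shown.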


,------------------------------------------
| 2020-09-14 09:12:42  AndrewHaley  replies:
| proposal - An alternative to the RECOGNIZER proposal
| see: https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-535
`------------------------------------------
Firstly, there is a need for user-defined literals and some other kinds of prefix notation. Anyone who needs anything more exotic (or powerful) and wants it to be standardized had better provide evidence that it's needed for Forth programs. A good design will have everything you need and nothing more.

Secondly, `'a::b` would just work. Any system supporting `a::b` as wordlist::word would have to redefine `FIND` to break the tokens apart: a recognizer for `'`-prefixed words would call `FIND`, which would find the word.


,------------------------------------------
| 2020-09-14 09:20:04  AndrewHaley  replies:
| proposal - An alternative to the RECOGNIZER proposal
| see: https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-536
`------------------------------------------
Re Jenny's point. It's necessary to define some mechanism by which "performing the interpretation semantics" of some rec-type might be performed. It seems to me more appropriate to specify exactly how that gets done here: it gets done by the called recognizer word. The "semantics" are whatever the recognizer does.


,------------------------------------------
| 2020-09-14 13:21:26  ruv  replies:
| proposal - An alternative to the RECOGNIZER proposal
| see: https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-537
`------------------------------------------
> Are you objecting to the use of the common word "recognize"?

The common word "[recognize](https://en.wiktionary.org/wiki/recognize#Verb)" is used in an unusual sense, though. Your "recognizer" does not just recognize a lexeme; it also performs the interpretation or compilation semantics for the lexeme. It's confusing that, in your interpretation, performing semantics is part of recognizing.


>  Firstly, there is a need for user-defined literals and some other kinds of prefix notation

What is a literal?   

At first glance, `'X` is a literal, `a::b` is a literal, and `'a::b` is a literal too: the run-time semantics of all of them is just to put a number (an xt) onto the stack.

> Any system supporting a::b as wordlist::word would have to redefine FIND

Do you mean that it should be done in a nonstandard way (i.e., not via the API you are proposing)?

An issue with your API is that we cannot define the `'X` format in the general form `'<any-literal-that-is-mapped-to-single-xt>`. Likewise, we cannot define the `wordlist::word` format in the general form `<any-literal-that-is-mapped-to-single-xt>::name`.


> Re Jenny's point. 

Jenny is substantially right (since "rectype" is not used in this proposal). The idea is that the found "[RECOGNIZE]" word should perform the interpretation semantics for the lexeme if interpreting, and the compilation semantics if compiling.


> "If found, perform the interpretation sematics of the found recognizer"

The Forth text interpreter only performs interpretation semantics if interpreting, and compilation semantics if compiling. So this phrase in the specification makes things too confusing. Better to say: "perform the execution semantics".


,------------------------------------------
| 2020-09-14 13:25:16  ruv  replies:
| proposal - An alternative to the RECOGNIZER proposal
| see: https://forth-standard.org/proposals/an-alternative-to-the-recognizer-proposal#reply-538
`------------------------------------------
Correction:
At first glance, `'X` is a literal, and `'a::b` is a literal too: the run-time semantics of both is just to put a number (an xt) onto the stack.


Re `a::b`: its run-time semantics may be different.


,------------------------------------------
| 2020-09-14 21:33:55  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-539
`------------------------------------------
CORE has only `VARIABLE` as an option for storing things that change. As a result, the interface for using `FORTH-RECOGNIZER` has to be clumsy, i.e.

    forth-recognizer @ execute execute

Clumsy interfaces cannot be changed once you have better things at hand.  You can probably wrap around the clumsy interface, e.g.

    Defer recognize-forth
    addr recognize-forth Constant forth-recognizer

if you can use `ADDR` to access the deferred word's xt storage location.  But then you have another interface, less clumsy, and only available when you have `DEFER`+`ADDR` (and `ADDR` is not even part of the standard).

A minimalistic API, which is what I am looking for here, is one where you don't have to document much.  The less uniform an API is, the more you have to document.  The uniformity here is that a recognizer is a word that has `( addr u -- i*x translator-xt )` as its stack effect.  And combinations of recognizers have the same effect.  And the system's recognizer is just another one, which you can swap in and out.  And you can define a `REC-SEQUENCE`, where you can manipulate the sequence, and put that into the system's recognizer.

This uniformity is broken when you don't use a deferred word for the system's recognizer — you can't just call that one as you can call the others.  You need `@ EXECUTE`.  This is clumsy.
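
A minimal sketch of this uniformity, under loudly stated assumptions: every name below (`rec-nothing`, `rec-len`, `rec-either`, `rec-demo`, `system-recognize`) is hypothetical, and the convention that a failing recognizer returns `0` instead of a translator xt is an assumption made for this example, not something the proposal specifies.

```forth
\ Assumption: a recognizer that fails returns 0 instead of a translator-xt.
: rec-nothing ( addr u -- 0 )     2drop 0 ;    \ hypothetical: never matches
: rec-len     ( addr u -- u xt )  nip ['] . ;  \ hypothetical: always matches; its "translator" prints u

\ Try xt1 on a copy of the string; fall back to xt2 if xt1 fails.
: rec-either ( addr u xt1 xt2 -- i*x translator-xt | 0 )
  >r -rot 2dup 2>r rot execute           \ R: xt2 addr u
  ?dup if  2r> 2drop  r> drop  exit  then
  2r> r> execute ;

\ A combination of recognizers has the same stack effect as a single one ...
: rec-demo ( addr u -- i*x translator-xt | 0 )
  ['] rec-nothing ['] rec-len rec-either ;

\ ... so it can be plugged into a deferred system recognizer and called directly.
defer system-recognize   ' rec-demo is system-recognize
s" abc" system-recognize execute   \ prints 3
```

With a `VARIABLE` instead of a deferred word, the last line would need `@ EXECUTE` before the recognizer could run.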


,------------------------------------------
| 2020-09-15 10:04:01  ruv  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-540
`------------------------------------------
> CORE has only `VARIABLE` as option for storing things to change. As a result, the interface to use FORTH-RECOGNIZER has to be clumsy, i.e.
> `forth-recognizer @ execute execute`

I don't suggest using a variable in the interface; it's even worse than a defer. When a variable is used to change something, the change cannot be detected effectively. But the requirement is that a system be able to perform internal actions when switching the recognizer that is currently used by the Forth text interpreter.

For that I would prefer to have the separate words in the API: a setter, a getter and a "performer" (a word that performs the recognizer that is currently used by the Forth text interpreter).
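
As a rough illustration of that requirement, here is a sketch in which the setter is the single place where a system can hook its internal actions. The storage cell `current-perceptor`, the counter `perceptor-changes`, and the placeholder `rec-placeholder` are assumptions made for this example; only the names `perceive`, `set-perceptor` and `perceptor` come from the discussion.

```forth
variable current-perceptor                      \ hypothetical internal storage
variable perceptor-changes  0 perceptor-changes !

: rec-placeholder ( c-addr u -- 0 )  2drop 0 ;  \ hypothetical placeholder recognizer

\ The setter can do arbitrary bookkeeping; here it just counts the switches.
: set-perceptor ( xt -- )  1 perceptor-changes +!  current-perceptor ! ;
: perceptor     ( -- xt )  current-perceptor @ ;
: perceive      ( c-addr u -- k*x tt )  perceptor execute ;

' rec-placeholder set-perceptor
perceptor-changes @ .   \ prints 1
```

A program that stored directly into a public variable would bypass this bookkeeping, which is the detection problem described above.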

What are your objections to having several separate words in the minimalistic API?


>  The uniformity here is that a recognizer is a word that has ( addr u -- i*x translator-xt ) as stack effect. 

I strongly support this approach (and I myself suggested this approach too, with slightly different stack effects). 

> This uniformity is broken when you don't use a deferred word for the system's recognizer

It seems that a set of words like the following (the names may vary):
```
perceive ( c-addr u -- k*x tt )
set-perceptor ( xt -- )
perceptor ( -- xt )
```
doesn't break the mentioned uniformity. Please clarify.


,------------------------------------------
| 2020-09-16 14:26:03  BerndPaysan  replies:
| proposal - minimalistic core API for recognizers
| see: https://forth-standard.org/proposals/minimalistic-core-api-for-recognizers#reply-541
`------------------------------------------
Using special setters and getters means you have another (special purpose) `DEFER` mechanism here. Of course you can implement that with

    variable current-perceptor
    : perceive ( addr u -- i*x token ) current-perceptor @ execute ;
    : set-perceptor ( xt -- ) current-perceptor ! ;
    : perceptor ( -- xt ) current-perceptor @ ;

which is probably a bit less implementation effort than `DEFER`, `IS`, and `ACTION-OF`. Or is it really?

State-Smart:

    : defer  Create ['] noop ,  does> @ execute ;
    : is  ' >body state @ if  ]] literal ! [[  else  !  then ; immediate
    : action-of  ' >body state @ if  ]] literal @ [[  else  @  then ; immediate

or with NDCS:

    : defer  Create ['] noop ,  does> @ execute ;
    : is  ' >body ! ; ndcs: ' >body ]] literal ! [[ ;
    : action-of  ' >body @ ;  ndcs: ' >body ]] literal @ [[ ;

`DEFER` is really a lightweight way to define words that can be changed.

These three lines of code are doing more than the three lines of code you need in addition when you have your special-purpose setter and getter, but they are still one-liners.

Forthers like to reinvent the wheel.  But don't overdo this.


,------------------------------------------
| 2020-09-16 17:17:50  JennyBrien  replies:
| proposal - Wording: declare undefined interpretation semantics for locals
| see: https://forth-standard.org/proposals/wording-declare-undefined-interpretation-semantics-for-locals#reply-542
`------------------------------------------
POSTPONEing a local doesn't/shouldn't work either.


,------------------------------------------
| 2020-09-16 20:25:35  JennyBrien  replies:
| proposal - Nestable Recognizer Sequences
| see: https://forth-standard.org/proposals/nestable-recognizer-sequences#reply-543
`------------------------------------------
> The similarity between wordlists and a search order has inspired the idea of nestable search orders: Several wordlists could be combined into a sequence that itself would work like a wordlist in other search orders. However, the search order words had already been standardized, so this idea never made it out of the concept stage.

> The similarity between the search order and recognizer sequences has led to the present recognizer proposal containing the words GET-RECOGNIZER and SET-RECOGNIZER, which are mostly modeled on GET-ORDER and SET-ORDER.

At first glance, it's simple to convert a wordlist into a recognizer, so recognizer sequences would also give nestable search orders. If `WORDLIST` returned the xt of an anonymous recognizer... but there would still be problems deciding how to `SET-CURRENT`. There would still have to be a difference between recognizers that search the dictionary (called by `REC-NAME` or similar) and other recognizers, otherwise there can be no concept of a 'current search order'.

So, do we need a `FORTH-RECOGNIZER` that combines the two? Is it sufficient to replace the 'word-not-found' portion of the interpreter? So far, I have only seen one use-case for a user-written recognizer to precede `REC-NAME` and I suspect such users would be better served by having their own interpreter loop rather than patching into the system one. Maybe all that is needed is the ability to add a recognizer to the current stack and leave it there until it is removed by `MARKER` or the stack is reset by `QUIT`, in which case:

```
   : +RECOGNIZER (  _name_ -- )  ' action-of recognized two-recognizers ;
```