FIND

( c-addr -- c-addr 0 | xt 1 | xt -1 )

Find the definition named in the counted string at c-addr. If the definition is not found, return c-addr and zero. If the definition is found, return its execution token xt. If the definition is immediate, also return one (1), otherwise also return minus-one (-1). For a given string, the values returned by FIND while compiling may differ from those returned while not compiling.

Rationale:

One of the more difficult issues which the committee took on was the problem of divorcing the specification of implementation mechanisms from the specification of the Forth language. Three basic implementation approaches can be quickly enumerated:

  1. Threaded code mechanisms. These are the traditional approaches to implementing Forth, but other techniques may be used.

  2. Subroutine threading with "macro-expansion" (code copying). Short routines, like the code for DUP, are copied into a definition rather than compiling a JSR reference.

  3. Native coding with optimization. This may include stack optimization (replacing such phrases as SWAP ROT + with one or two machine instructions, for example), parallelization (the trend in the newer RISC chips is to have several functional subunits which can execute in parallel), and so on.

The initial requirement (inherited from Forth 83) that compilation addresses be compiled into the dictionary disallowed type 2 and type 3 implementations.

Type 3 mechanisms and optimizations of type 2 implementations were hampered by the explicit specification of immediacy or non-immediacy of all standard words. POSTPONE allowed de-specification of immediacy or non-immediacy for all but a few Forth words whose behavior must be STATE-independent.

One type 3 implementation, Charles Moore's cmForth, has both compiling and interpreting versions of many Forth words. At the present, this appears to be a common approach for type 3 implementations. The committee felt that this implementation approach must be allowed. Consequently, it is possible that words without interpretation semantics can be found only during compilation, and other words may exist in two versions: a compiling version and an interpreting version. Hence the values returned by FIND may depend on STATE, and ' and ['] may be unable to find words without interpretation semantics.

Testing:

HERE 3 C, CHAR G C, CHAR T C, CHAR 1 C, CONSTANT GT1STRING
HERE 3 C, CHAR G C, CHAR T C, CHAR 2 C, CONSTANT GT2STRING
T{ GT1STRING FIND -> ' GT1 -1 }T
T{ GT2STRING FIND -> ' GT2 1  }T
( HOW TO SEARCH FOR NON-EXISTENT WORD? )
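
One possible answer to the open question above, offered as a sketch rather than as part of the official test suite: build a counted string for a name that is assumed to be undefined, and check that FIND returns the same address and zero. The names GT9 and GT9STRING are hypothetical; the test is only meaningful on a system where no word named GT9 is visible in the search order.

```forth
\ Assumption: no word named GT9 exists in the search order.
HERE 3 C, CHAR G C, CHAR T C, CHAR 9 C, CONSTANT GT9STRING
T{ GT9STRING FIND -> GT9STRING 0 }T
```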

Contributions

AntonErtl: Clarify FIND (Proposal) 2018-05-23 17:04:29

Problem

The existing specification of FIND is unclear wrt what xts are returned under what conditions. It also uses a different notion of immediacy from the one in the Definition of Terms. From the rationale of FIND, it is obvious that cmForth inspired the idea that two different xts can be returned. The rationale of COMPILE, shows that the intention is that FIND can be usable for the user-defined text interpreter. But FIND as specified does not guarantee that. This proposal would fix this problem; it is also phrased in a way that includes systems like cmForth and Mark Humphries' system.

Proposal

Replace the text in the specification of FIND with:

Find the definition named in the counted string at c-addr. If the definition is not found, return c-addr and zero. If the definition is found, return xt 1 or xt -1. The returned values may differ between interpretation and compilation state. In interpretation state, EXECUTEing the returned xt performs the interpretation semantics of the word. In compilation state, the returned values represent the compilation semantics: if xt 1 is returned, then EXECUTEing xt performs the compilation semantics; if xt -1 is returned, then COMPILE,ing xt performs the compilation semantics.
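
To make the intended use concrete, here is a minimal sketch of how a user-defined text interpreter could consume the results of FIND under the wording above. The word name DO-FOUND is hypothetical and not part of any standard; number conversion and the not-found case are left out.

```forth
\ DO-FOUND is a hypothetical helper, not a standard word.
\ It takes the xt and the nonzero flag (1 or -1) returned by FIND.
: do-found ( i*x xt flag -- j*x )
  state @ if
    0> if execute else compile, then  \ 1: EXECUTE xt; -1: COMPILE, xt
  else
    drop execute                      \ interpreting: the flag plays no role
  then ;
```

FIND has to be called in the same STATE in which DO-FOUND later runs, because the returned values may differ between interpretation and compilation state.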

JennyBrien 2018-05-23 22:09:24

Looks good to me, but the cmForth approach (of separate compilation and interpretation wordlists and searching one first) cannot be made Standard. It will not always find the most recent use of the name. Mark's doesn't either, but it fails by ignoring a more recent compile-only version while interpreting. Other Forths would attempt to execute the compile-only version, which would also be an error.

Thinking in terms of name tokens, it seems that NAME>COMPILE can take four possible forms:

   xt  compile,     \ the default
   xt  optimiser,   \ set by set-compile
   xt  execute      \ a normal immediate word
   ext execute     \ set by set-ndcs

FIND should return a flag of -1 for the first two cases (no optimisation is done unless with an intelligent COMPILE, that can derive the optimising xt from the original xt alone) and 1 in the other two cases, but the fourth only while compiling.

AntonErtl 2018-05-24 07:37:08

A cmForth-like system could be standard by first looking up user-defined words, then (in compile state) the compiler words, then (in both states) the system-defined words. This does not allow adding new compiler words, but the standard does not allow that, either.

Concerning your description of NAME>COMPILE, optimization (set by SET-OPTIMIZER in Gforth or SET-COMPILER in VFX) affects COMPILE,, not NAME>COMPILE, so your second case does not exist. Otherwise yes, if we accept the present proposal, the xt2 produced by NAME>COMPILE ( nt -- xt1 xt2 ) has to be EXECUTE or COMPILE, to match this FIND; otherwise NAME>COMPILE would produce compilation tokens that cannot be represented by the results of FIND.

StephenPelc 2018-05-24 12:16:24

It's too early to rewrite FIND. It may be better to find an alternative and to mark FIND as obsolescent. Modern Forths have a range of flags other than IMMEDIATE, for example the NDCS flag. We have an opportunity to expose other flags and to replace xt value with xt mask where mask is a bitmask containing IMMEDIATE and NDCS bits. This gives more opportunities to systems. We also need to revisit COMPILE, and to provide a compiling word for words (e.g. IF and friends) that parse or produce/consume data at compile time. The standard's wording for compile time is mystically unclear.

AntonErtl 2018-05-24 16:11:53

One benefit of this proposal is that it does not use "immediate" in a way that is at odds with the use in the rest of the document. AFAICT it also reflects the intent of the Forth-94 committee better than the current text. And 24 years after Forth-94, it is certainly not too early to make that clarification. It may be unnecessary, but there are people who make wide-reaching claims, because "immediate" is used here with a different meaning than in the rest of the document, so maybe such a clarification is necessary after all.

As for the alternative, I'll propose that Real Soon Now.

StephenPelc 2018-05-24 17:13:21

I'm really not trying to stop people from improving FIND. I just need some breathing space to get to the next stage of VFX NDCS. This involves rebuilding the cross compiler on it, and it ain't pretty. Once that's done, I'll have a pretty good idea where the problems lie.

One of the problems in changing FIND now is the law of unintended consequences; basically what else are you going to damage? The safest course may well be to find a new word that we can agree on, and then to mark FIND as obsolescent.

AntonErtl 2018-05-25 06:35:45

Do you want the traditional user-defined text interpreter (e.g., as outlined in the rationale of COMPILE,) to work on VFX? The new description specifies exactly as much as is necessary to guarantee that, nothing more. If you need more breathing space, you would break these user-defined text interpreters.

As for unintended consequences, if someone uses a blemish in the standard like the current description of FIND as an excuse to damn the whole standard, a consequence is to fix the blemish.

StephenPelc 2018-05-25 14:37:46

Since we cannot use COMPILE, for words that parse or affect the stack, surely NAME>COMPILE has to be able to return the xt of words such as the suggested NDCS, which can handle parsing or stack effects.

AntonErtl 2018-05-25 17:42:06

Let's make it concrete:

: s"-int '"' parse save-mem ;
: s"-comp '"' parse postpone sliteral ;
' s"-int ' s"-comp interpret/compile: s"
s\" s\"" find-name name>compile ( xt1 xt2 )

A correct result at the end of this piece of code is: xt1 is the xt of S"-comp and xt2 is the xt of execute. In compilation state, FIND returns the xt of S"-comp and 1.

Now your idea seems to be a different result for the piece of code above: xt1 should be the xt of S", and xt2 should be the xt of NDCS,. That may be correct for NAME>COMPILE, but what does your FIND do, if you want to support the traditional user-defined text interpreter (which uses only EXECUTE and COMPILE,)?

JennyBrien 2018-05-26 18:11:13

I can see the point of centralising the optimising of default words in COMPILE, because that allows the action to take place within a definition, apart from the compiling loop. If no optimisation is set, it compiles a call, which results in code that produces the same output, if more slowly.

But what should xt NDCS, do? In particular, what should it do if presented with an xt that has no explicit compiling action?

It would seem most sensible to compile it, which is after all 'performing its compilation semantics'. But in that case, the whole of the compiler, from the point of obtaining a valid token, is bundled in the one word.

NDCS, is equivalent to NAME>COMPILE EXECUTE

The more important question is: does this token have an explicit compiling action set, and if so, what is it? With name tokens, that is simple:

   :  EXPLICIT?  ( nt -- int-xt 0 | comp-xt -1 ) 
                             NAME>COMPILE  ['] EXECUTE =  ;

AntonErtl 2018-05-29 10:46:22

NAME>COMPILE is a standard way to get at the compilation semantics, and it works for all words. There is no need for NDCS, as a standard interface. Some systems may have it as internal factor, however.

But that's for a discussion on replacing FIND, and is not relevant to the present proposal on clarifying FIND.

AntonErtl (New Version) 2018-05-29 11:28:32

ChangeLog

2018-05-29: Specify FIND for words without interpretation semantics, and loosen it for TO IS ACTION-OF. Added Remarks section for a rationale of these additions.

2018-05-23: Initial version

Problem

The existing specification of FIND is unclear wrt what xts are returned under what conditions. It also uses a different notion of immediacy from the one in the Definition of Terms. From the rationale of FIND, it is obvious that cmForth inspired the idea that two different xts can be returned. The rationale of COMPILE, shows that the intention is that FIND can be usable for the user-defined text interpreter. But FIND as specified does not guarantee that. This proposal would fix this problem; it is also phrased in a way that includes systems like cmForth and Mark Humphries' system.

Proposal

Replace the text in the specification of FIND with:

Find the definition named in the counted string at c-addr. If the definition is not found, return c-addr and zero. If the definition is found, return xt 1 or xt -1. The returned values may differ between interpretation and compilation state. In interpretation state, EXECUTEing the returned xt performs the interpretation semantics of the word. In compilation state, the returned values represent the compilation semantics: if xt 1 is returned, then EXECUTEing xt performs the compilation semantics; if xt -1 is returned, then COMPILE,ing xt performs the compilation semantics.

In interpretation STATE, FIND may produce c-addr 0 if the definition has no interpretation semantics; if it produces xt 1 or xt -1, the returned xt represents a system-dependent action.

If the definition is for a word for which POSTPONE is ambiguous, it is ambiguous to perform the xt returned by FIND in a STATE different from the STATE during FIND.

In 4.1.2 Ambiguous conditions, add the ambiguous condition above, and remove "6.1.1550 FIND" from

attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;

Remarks

The removal of FIND from the clause in 4.1.2 ensures that we can text-interpret (in compile STATE) words without interpretation semantics, such as IF. The description of the behaviour of FIND for these words in interpretation STATE allows implementations that do not find such words, implementations that return the xt for an error, implementations that return the xt for the compilation semantics, and implementations that return the xt for some system-specific interpretation semantics.

The ambiguous condition allows STATE-smart implementations of TO, IS and ACTION-OF (as Forth-94 and Forth-2012 do).

Note that this does not allow STATE-smart implementations of words without interpretation semantics (e.g., IF), but then, that's already forbidden by POSTPONE and [COMPILE].

JennyBrien 2018-05-30 07:17:41

Editing suggestion:

In interpretation state, EXECUTEing the returned xt performs the interpretation semantics of the word. If the definition has no interpretation semantics FIND may produce c-addr 0; if it produces xt 1 or xt -1, the returned xt represents a system-dependent action.

In compilation state, the returned values represent the compilation semantics: if xt 1 is returned, then EXECUTEing xt performs the compilation semantics; if xt -1 is returned, then COMPILE,ing xt performs the compilation semantics.

If the definition is for a word for which POSTPONE is ambiguous, it is ambiguous to perform the xt returned by FIND in a STATE different from the STATE during FIND.

AntonErtl (New Version) 2018-06-17 13:28:27

ChangeLog

2018-06-17: Wording changes following the suggestion by JennyBrien

2018-05-29: Specify FIND for words without interpretation semantics, and loosen it for TO IS ACTION-OF. Added Remarks section for a rationale of these additions.

2018-05-23: Initial version

Problem

The existing specification of FIND is unclear wrt what xts are returned under what conditions. It also uses a different notion of immediacy from the one in the Definition of Terms. From the rationale of FIND, it is obvious that cmForth inspired the idea that two different xts can be returned. The rationale of COMPILE, shows that the intention is that FIND can be usable for the user-defined text interpreter. But FIND as specified does not guarantee that. This proposal would fix this problem; it is also phrased in a way that includes systems like cmForth and Mark Humphries' system.

Proposal

Replace the text in the specification of FIND with:

Find the definition named in the counted string at c-addr. If the definition is not found, return c-addr and zero. If the definition is found, return xt 1 or xt -1. The returned values may differ between interpretation and compilation state:

In interpretation state: If the definition has interpretation semantics, FIND returns xt 1 or xt -1, and EXECUTEing xt performs the interpretation semantics of the word. If the definition has no interpretation semantics, FIND may produce c-addr 0; if it produces xt 1 or xt -1, EXECUTEing xt performs a system-dependent action.

In compilation state, the returned values represent the compilation semantics: if xt 1 is returned, then EXECUTEing xt performs the compilation semantics; if xt -1 is returned, then COMPILE,ing xt performs the compilation semantics.

If the definition is for a word for which POSTPONE is ambiguous, it is ambiguous to perform the xt returned by FIND in a STATE different from the STATE during FIND.

In 4.1.2 Ambiguous conditions, add the ambiguous condition above, and remove "6.1.1550 FIND" from

attempting to obtain the execution token, (e.g., with 6.1.0070 ', 6.1.1550 FIND, etc. of a definition with undefined interpretation semantics;

Remarks

The removal of FIND from the clause in 4.1.2 ensures that we can text-interpret (in compile STATE) words without interpretation semantics, such as IF. The description of the behaviour of FIND for these words in interpretation STATE allows implementations that do not find such words, implementations that return the xt for an error, implementations that return the xt for the compilation semantics, and implementations that return the xt for some system-specific interpretation semantics.

The ambiguous condition allows STATE-smart implementations of TO, IS and ACTION-OF (as Forth-94 and Forth-2012 do).

Note that this does not allow STATE-smart implementations of words without interpretation semantics (e.g., IF), but then, that's already forbidden by POSTPONE and [COMPILE].

AntonErtl: find-name (Proposal) 2018-05-25 12:26:23

Problem

FIND has several problems:

  1. It takes a counted string.

  2. Its interface is designed for single-xt+immediate-flag systems (which cannot implement FILE S" correctly in all cases), and there is allowance for a STATE-dependent result to support other systems.

  3. As currently specified (there is a proposal for fixing that), it fails to meet the goal of guaranteeing that the user-defined text interpreter (outlined in the rationale of COMPILE,) works.

As a consequence of problem 2, the following implementation of ' is wrong:

: ' bl word find 0= -13 and throw ;

This ' produces the wrong result in the following case on Gforth:

: x ' ; immediate
] x s" [

And that's because Gforth implements S" correctly, and FIND such that the user-defined text interpreter works. The following definition would fix this problem on Gforth:

: ' bl word state @ >r postpone [ find r> if ] then 0= -13 and throw ;

but actually the specification of FIND currently is sufficiently unclear that we cannot be sure that that's guaranteed to work, either.

SEARCH-WORDLIST has a related, but different set of problems:

  1. It produces results with a varying number of stack items. In some situations that makes it more cumbersome to use.

  2. Its interface is designed for single-xt+immediate-flag systems, but this time without allowing a STATE-dependent result. As a consequence, Gforth produces the xt representing the interpretation semantics of the word, independent of STATE, with no way to get the compilation semantics of the word.

  3. In the general case, given that the xt part of the result does not represent a part of the compilation semantics, the 1/-1 part of the result is pointless.

Solution

Introduce FIND-NAME (for the search order) and FIND-NAME-IN (for specific wordlists). Both take strings in c-addr u form, and both return the name token of the found word (or 0 if not found). We can then go from the name token to the interpretation semantics with NAME>INTERPRET, and to the compilation semantics with NAME>COMPILE.

Typical use

: ' parse-name find-name dup 0= -13 and throw name>interpret ;

: postpone
  parse-name find-name dup 0= -13 and throw
  name>compile swap postpone literal compile, ; immediate

\ user-defined text interpreter
: interpret-word
  parse-name 2dup find-name ?dup if
     nip nip state @ if name>compile else name>interpret then execute
  else
     ... \ process numbers
  then ;

\ alternative:
defer name>statetoken ( nt -- ... xt )
: [ ['] name>interpret is name>statetoken false state ! ; immediate
[ \ initialize STATE and NAME>STATETOKEN
: ] ['] name>compile is name>statetoken true state ! ;

: interpret-word
  parse-name 2dup find-name ?dup if
     nip nip name>statetoken execute
  else
     ... \ process numbers
  then ;

Remarks

FIND-NAME and FIND-NAME-IN are natural factors of all words that look up words in the dictionary, such as FIND, ', POSTPONE, the text interpreter, and SEARCH-WORDLIST. So implementing them does not cost additional code in the Forth system, only some refactoring effort.

This approach is not compatible with the cmForth approach and Mark Humphries' approach, because they both use one word header each for interpretation and compilation semantics. This problem already exists for the other words that deal with name tokens, but the present proposal would make name tokens more important, and systems that do not support them less viable. However, both approaches have been known for at least two decades, and have seen little to no uptake in standard systems. And we have good approaches for implementing systems with name tokens, so excluding these approaches is not a significant loss.

Proposal

Add the following words:

FIND-NAME ( c-addr u -- nt | 0 )

Find the definition identified by the string c-addr u in the current search order. Return its name token nt, if found, otherwise 0.

FIND-NAME-IN ( c-addr u wid -- nt | 0 )

Find the definition identified by the string c-addr u in the wordlist wid. Return its name token nt, if found, otherwise 0.
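
As an illustration of how the two proposed words might be used, the following sketch looks up a name first in one specific wordlist and then in the whole search order, and reports the result. It assumes a system that provides FIND-NAME and FIND-NAME-IN as specified above, plus NAME>STRING from the Programming-Tools extensions; .FOUND is a hypothetical helper name.

```forth
\ .FOUND is a hypothetical helper: print the name behind nt, or a message.
: .found ( nt|0 -- )
  ?dup if name>string type else ." not found" then ;

s" dup" forth-wordlist find-name-in .found  \ searches FORTH-WORDLIST only
s" dup" find-name .found                    \ searches the whole search order
```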

Reference implementation

Implementing FIND-NAME-IN requires carnal knowledge of the system.

: find-name {: c-addr u -- nt | 0 :}
  get-order 0 swap 0 ?do ( widn...widi nt|0 )
    dup 0= if
      drop c-addr u rot find-name-in
    else
      nip
    then
  loop ;

Testing

Testcases

Experience

Gforth has implemented FIND-NAME and (under the name (search-wordlist)) FIND-NAME-IN since 1996. No problems were reported or found internally.

Several other systems have been reported to implement FIND-NAME under this or other names (e.g., FOUND in ciforth).

mtrute 2018-05-27 08:48:20

amforth has had FIND-NAME from the beginning, and has FIND-NAME-IN under the name SEARCH-NAME, following SEARCH-WORDLIST ( c-addr u wid -- 0 | xt 1 | xt -1 ).

JennyBrien 2018-05-27 13:03:37

If I've got this right, a definition of FIND using FIND-NAME:

   :  EXPLICIT?  ( nt -- int-xt 0 | comp-xt -1 ) 
                             NAME>COMPILE  ['] EXECUTE =  ;

   : FIND   dup count find-name dup IF
                 ( c-addr nt )  nip dup name>interpret swap
                ( int-xt nt )  explicit?  IF  
                ( int-xt comp-xt )  state @ IF nip ELSE  drop THEN 
                                           1 ELSE 
                                    nip -1 THEN  ;

JennyBrien 2018-05-27 21:07:04

Should be:

   : EXPLICIT?  ( nt -- int-xt 0 | comp-xt -1 )
        NAME>COMPILE  ['] EXECUTE =  ;

   : FIND  ( c-addr -- c-addr 0 | xt 1 | xt -1 )
        dup count find-name dup IF
        ( c-addr nt )         nip dup name>interpret swap
        ( int-xt nt )         explicit? IF
        ( int-xt comp-xt )      state @ IF nip ELSE drop THEN  1
        ELSE
        ( int-xt xt )           drop -1
        THEN
        THEN  ;

MarkWills 2018-05-29 10:22:53

Why not have FIND-NAME return both the interpretation and compilation XTs simultaneously, then you can simply DROP or NIP to keep the XT that you're interested in:

FIND-NAME ( c-addr u -- compXT|0 intXT|0 )

Now, we don't need NAME>INTERPRET and NAME>COMPILE at all.

Additionally, NAME>INTERPRET and NAME>COMPILE are erroneously named, since you don't pass a name to them. The "name" (in the form of addr u) is passed into FIND-NAME. NAME>INTERPRET and NAME>COMPILE both take an execution token, so they should be named XT>INTERPRET and XT>COMPILE.

However, as noted above, it seems much simpler to have FIND-NAME return both tokens, and the programmer DROP or NIP as appropriate.

Regards

Mark Wills

AntonErtl 2018-05-30 06:49:26

I would implement FIND using FIND-NAME as follows (untested):

: find ( c-addr -- c-addr 0 | xt 1 | xt -1 )
  dup count find-name dup if   \ FIND-NAME takes c-addr u, so COUNT first
   nip state @ if
    name>compile ['] execute = if 1 else -1 then
   else
    name>interpret 1
   then
  then ;

This assumes that NAME>COMPILE returns the xt of EXECUTE or that of COMPILE, on the TOS. If NAME>COMPILE can also return other xts on the TOS, FIND may need to be implemented in a different way. Whether interpretive FIND returns 1 or -1 does not play a role for the classical text interpreter. However, if other applications need it, one might extract the proper value from NAME>COMPILE; an application that relies on the flag returned in interpret state is also likely to rely on the corresponding xt as representing a part of the compilation semantics, and that does not work for all words on all systems (it might be good enough for a specific system or a specific set of words, though).

As for a word that returns an interpretation xt and a compilation xt: It has no existing practice, in contrast to FIND-NAME. It's also bad factoring, because it does two things at once while we usually need only one of them. The wisdom of having a name token (nt) and factoring these operations into FIND-NAME, NAME>INTERPRET and NAME>COMPILE is demonstrated by the introduction of TRAVERSE-WORDLIST, which also produces nts, that then can be processed with NAME>INTERPRET. Another problem of the suggested word is that it returns just one xt for the compilation semantics, which would require a costly production of these xts for many words. Finally, such a word would also be harder to write and read, because both writer and reader would have to look up how NIP and DROP correspond to interpretation and compilation semantics.

JennyBrien 2018-05-30 15:14:18

NAME>INTERPRET returns 0 for a compile-only word that the system has marked as such. That's not a valid output for FIND.

AntonErtl 2018-05-31 07:46:08

To cover that (but with the other assumptions as above), FIND would look as follows:

: find ( c-addr -- c-addr 0 | xt 1 | xt -1 )
  dup count find-name dup if   \ FIND-NAME takes c-addr u, so COUNT first
   state @ if
    nip name>compile ['] execute = if 1 else -1 then
   else
    name>interpret dup if nip 1 then   \ 0 means no interpretation semantics
   then
  then ;

StephenPelc 2018-06-06 22:47:43

I now have an NDCS version of VFX Forth that passes all of Gerry Jackson's test suite, plus the previous tests for [COMPILE]. The NDCS VFX works fine with FIND. The sole reason for FIND-NAME is to allow for multiple words with the same xt, i.e. the nt identifies a named word and xts identify the actions. We have to recognise that people with xt-based systems are not going to abandon them, they will simply use FIND-WORD or some such that returns an xt. I really feel that we should slow down and resolve issues such as NDCS words before we redesign the text interpreter. Given recognisers, this may be too late, but we need to work our way through the issues before we make changes that we may regret. Note that the compilation issues caused by NDCS ("non-default compilation semantics") have taken 20+ years to emerge.

JennyBrien 2018-06-07 09:50:11

Stephen Pelc wrote

The NDCS VFX works fine with FIND.

Does the same FIND allow NDCS words to be used with legacy code that either EXECUTEs or COMPILE,s the xt returned?

The sole reason for FIND-NAME is to allow for multiple words with the same xt, i.e. the nt identifies a named word and xts identify the actions.

Does this mean that

synonym foo bar

' foo  ' bar = .
-1 ok

? That seems sensible if it can be done. It means synonyms of CREATE ... DOES> words should simply work with definitions that use ' >BODY and, by extension, synonyms of deferred words and values should also work as expected. It also makes it clear that you can't change the definition of a synonym with IMMEDIATE or DOES> because that would affect the original.

We have to recognise that people with xt-based systems are not going to abandon them, they will simply use FIND-WORD or some such that returns an xt.

Of course, but the name token stuff has been around for twenty years and more now, and it's not going away either.

I really feel that we should slow down and resolve issues such as NDCS words before we redesign the text interpreter. Given recognisers, this may be too late, but we need to work our way through the issues before we make changes that we may regret.

Practically, we need to clarify when FIND-WORD is being used as a synonym for FIND-NAME and when for FIND-NAME NAME>INTERPRET . Whether an xt should be COMPILE,ed or EXECUTEd is determined by the dictionary header that returned it, not by the bare xt itself.

Note that the compilation issues caused by NDCS ("non-default compilation semantics") have taken 20+ years to emerge.

That's because normally it makes no sense to execute an NDCS word except as a part of normal compilation. So you could get away with defining COMPILE, as meaning 'perform the compilation semantics' and using LITERAL EXECUTE when you meant 'append the execution semantics.'

StephenPelc 2018-06-07 13:46:17

Jenny said: Stephen Pelc wrote

The NDCS VFX works fine with FIND.

Does the same FIND allow NDCS words to be used with legacy code that either EXECUTEs or COMPILE,s the xt returned?

That's one of the results of permitting NDCS words that are not IMMEDIATE. If we keep the restriction that COMPILE, must not parse or have additional stack effects, then we can either introduce a compiling word that can parse, e.g.

: compile-word  \ ix xt -- jx
\ *G process an XT for compilation.
  dup ndcs?     \ NDCS or normal
  if  ndcs,  else  compile,  then ;

or we could just lift the restrictions on COMPILE, so that the requirements of NDCS can be supported by COMPILE, itself. The second option leads to very little change. In the VFX with COMPILE-WORD, the word NDCS, is used only once in the kernel and COMPILE-WORD replaces COMPILE, in the vast majority of cases.

It is worth noting that VFX ran with this non-compliant COMPILE, for several years and the only complaint about it came from Anton ... because it was non-compliant. Note that my exploration of NDCS is not about preserving single xt and so on, it was an attempt to understand compilation in Forth. One of the side effects of this exploration is the realisation that STATE-smart words can be re-implemented as dual-behaviour words which are standard compliant.

JennyBrien 2018-06-07 19:43:58

Note that my exploration of NDCS is not about preserving single xt and so on, it was an attempt to understand compilation in Forth. One of the side effects of this exploration is the realisation that STATE-smart words can be re-implemented as dual-behaviour words which are standard compliant.

Yes, but your method and Anton's seem to imply different methods of specifying NDCS. Your method requires that the compilation xt expects the interpretation xt on the stack. His (I think) requires that it doesn't.

StephenPelc 2018-06-07 23:47:38

My view is that if you want dual-behaviour words then you need to implement the consequences outlined in my paper. However this is implemented, a way of performing the word that I called NDCS, is required. Anton's implementations hide most of this.

BerndPaysan 2018-06-08 12:18:02

Full reference implementation using TRAVERSE-WORDLIST:

\ find-name and find-name-in

\ this file is in the public domain

: >lower ( c1 -- c2 )
    dup 'A' 'Z' 1+ within bl and or ;
: istr= ( addr1 u1 addr2 u2 -- flag )
    rot over <> IF  2drop drop false  EXIT  THEN
    bounds ?DO
        dup c@ >lower I c@ >lower <> IF  drop false  unloop  EXIT  THEN
        1+
    LOOP  drop true ;

: find-name-in ( addr u wid -- nt / 0 )
    >r 0 -rot r>
    [: dup >r name>string 2over istr= IF
            rot drop r> -rot false
        ELSE  rdrop true  THEN
    ;] swap traverse-wordlist 2drop ;
: find-name {: c-addr u -- nt | 0 :}
    get-order 0 swap 0 ?do ( widn...widi nt|0 )
        dup 0= if
            drop c-addr u rot find-name-in
        else
            nip
        then
    loop ;
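
As a quick check of the definitions above, the following sketch looks up a name and prints it as stored in the dictionary. SHOW-NAME is a hypothetical helper, not part of the proposal; it assumes FIND-NAME as defined above (or a system-provided one) and the Forth-2012 word NAME>STRING.

```forth
\ SHOW-NAME is a hypothetical helper for illustration only.
\ Assumes FIND-NAME ( c-addr u -- nt | 0 ) and NAME>STRING ( nt -- c-addr u ).
: show-name ( c-addr u -- )
  find-name ?dup if
    name>string type   \ found: print the name as stored in the header
  else
    ." (not found)"    \ FIND-NAME returned 0
  then ;

s" DUP" show-name cr
s" no-such-word-xyzzy" show-name cr
```

Because ISTR= folds case, looking up "dup" and "DUP" with the reference implementation should yield the same nt on a system where only one such word is visible.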

AntonErtl 2018-06-16 18:21:15

The sole reason for FIND-NAME is to allow for multiple words with the same xt, i.e. the nt identifies a named word and xts identify the actions.

Yes, words with different names and the same interpretation semantics xt are a reason for FIND-NAME, but not the only one.

The main reason is that it (together with the already-standardized NAME>INTERPRET and NAME>COMPILE) provides a good factoring that separates the jobs of looking up the named word (FIND-NAME), accessing the interpretation semantics (NAME>INTERPRET) and accessing the compilation semantics (NAME>COMPILE) into separate words.

This is the replacement for FIND that you asked for.

We have to recognise that people with xt-based systems are not going to abandon them

Nobody is asking them to, whatever you mean by "xt-based system". The only systems that FIND-NAME, NAME>INTERPRET, NAME>COMPILE etc. are not a good fit for are multi-header systems like cmForth and Mark Humphries' system (but I have outlined how to implement these words on the latter).

I really feel that we should slow down and resolve issues such as NDCS words before we redesign the text interpreter. Given recognisers, this may be too late, but we need to work our way through the issues before we make changes that we may regret. Note that the compilation issues caused by NDCS ("non-default compilation semantics") have taken 20+ years to emerge.

These issues emerged in my experience in 1996 (two years after Forth-94 was released), and FIND-NAME etc. have been implemented and used in the text interpreter of Gforth since then. We have collected experience with them since then (i.e., for 22 years), for several implementation approaches for NDCS words (including the current one in Gforth, which can be described as "xt-based"), and the experience is that these words provide a good implementation-independent interface.

Given that the problem has been recognized by the time of Forth-83 (35 years ago) at the latest, this solution has 22 years of experience, and you asked for a FIND replacement, your call for slowing down even more is not helpful. These issues have been discussed again and again for a long time, so they are not going away; therefore we need to go forward. You asked for this proposal, here it is!

That's one of the results of permitting NDCS words that are not IMMEDIATE. If we keep the restriction that COMPILE, must not parse or have additional stack effects, then we can either introduce a compiling word that can parse, e.g.

: compile-word  \ ix xt -- jx
\ *G process an XT for compilation.
  dup ndcs?     \ NDCS or normal
  if  ndcs,  else  compile,  then
;

The other option is to use EXECUTE as the word that can parse and have additional stack effects. Of course then FIND will report these NDCS words as "immediate", and will return different xts in different STATEs for NDCS words, as allowed by Forth-94 and Forth-2012.

or we could just lift the restrictions on COMPILE, so that the requirements of NDCS can be supported by COMPILE, itself. The second option leads to very little change. In the VFX with COMPILE-WORD, the word NDCS, is used only once in the kernel and COMPILE-WORD replaces COMPILE, in the vast majority of cases.

The real requirement (aka restriction) of COMPILE, is that it appends the same semantics that EXECUTE executes, i.e., that it is equivalent to

: compile, postpone literal postpone execute ;

That it has the stack effect ( xt -- ) and that it does not parse follows from that. And I have seen quite a lot of uses of COMPILE, that make use of this equivalence (most uses of COMPILE, that are not in text interpreters).
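
The equivalence can be sketched as follows. FOO, COMPILE-FOO, LIT-EXEC-FOO, BAR1 and BAR2 are hypothetical names used only for illustration; the point is that a standard-conforming COMPILE, and the LITERAL/EXECUTE sequence should append the same behaviour to the current definition.

```forth
\ All names below are hypothetical and exist only for this sketch.
: foo ( -- n ) 42 ;

\ Append FOO's semantics to the current definition with COMPILE,
: compile-foo ( -- ) ['] foo compile, ; immediate

\ Append the same semantics the long way: the xt as a literal, then EXECUTE
: lit-exec-foo ( -- ) ['] foo postpone literal postpone execute ; immediate

: bar1 ( -- n ) compile-foo ;
: bar2 ( -- n ) lit-exec-foo ;

bar1 bar2 = .   \ both definitions run FOO, so the two results are equal
```

A COMPILE, that parses or has extra stack effects for some xts would break code that relies on substituting one form for the other.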

It is worth noting that VFX ran with this non-compliant COMPILE, for several years and the only complaint about it came from Anton ... because it was non-compliant.

And lots of people report "no problems" with STATE-smart words. Others, like me, tripped over them.

A broken COMPILE, would not be so bad if it only affected VFX and only S" and TO. But the disease was spreading to Gforth, where not just a few words with broken COMPILE, were implemented, but many; and, in addition, it contained defining words that made each of their children have a broken COMPILE,. I am not sure if this disease has been fully cured yet.

My view is that if you want dual-behaviour words then you need to implement the consequences outlined in my paper. However this is implemented, a way of performing the word that I called NDCS, is required.

This view is refuted by the many systems that have implemented dual-behaviour words correctly, but differently from what you outlined in the paper, and in particular, without having a word like your NDCS,.

In any case, FIND-NAME, NAME>INTERPRET and NAME>COMPILE abstract from the implementation (except for being a better fit for a single-header system than for a multi-header system), as demonstrated by the fact that they have been implemented on several different systems, and can be implemented on VFX.

AntonErtl 2018-06-18 13:51:04

I have adopted the reference implementation by Bernd Paysan and added a few adjustments. I have also added more test cases that test using FIND-NAME for a simple variant of a user-defined text interpreter, and using it for interpreting all the hard cases: S" and TO (both interpretation semantics and compilation semantics) and locals (definition, use, and TO semantics). The result works on development Gforth (with both the Gforth implementation of FIND-NAME and with the reference implementation), and on VFX 4.72.

You can find them here: Reference Implementation, Testcases

One wrinkle is that you need carnal knowledge for proper handling of locals in FIND-NAME. This has been included for Gforth and VFX in the reference implementation, but not (yet) for other systems.

AntonErtl New Version 2018-08-15 10:38:39

ChangeLog

2018-08-15 Reference implementation now provides FIND-NAME-IN; more information on other systems; added NAME-PRINTER example

Problem

FIND has several problems:

  1. It takes a counted string.

  2. Its interface is designed for single-xt+immediate-flag systems (which cannot implement FILE S" correctly in all cases), and there is allowance for a STATE-dependent result to support other systems.

  3. As currently specified (there is a proposal for fixing that), it fails to meet the goal of guaranteeing that the user-defined text interpreter (outlined in the rationale of COMPILE,) works.

As a consequence of problem 2, the following implementation of ' is wrong:

: ' bl word find 0= -13 and throw ;

This ' produces the wrong result in the following case on Gforth:

: x ' ; immediate
] x s" [

And that's because Gforth implements S" correctly, and FIND such that the user-defined text interpreter works. The following definition would fix this problem on Gforth:

: ' bl word state @ >r postpone [ find r> if ] then 0= -13 and throw ;

but actually the specification of FIND currently is sufficiently unclear that we cannot be sure that that's guaranteed to work, either.

SEARCH-WORDLIST has a related, but different set of problems:

  1. It produces results with a varying number of stack items. In some situations that makes it more cumbersome to use.

  2. Its interface is designed for single-xt+immediate-flag systems, but this time without allowing a STATE-dependent result. As a consequence, Gforth produces the xt representing the interpretation semantics of the word, independent of STATE, with no way to get the compilation semantics of the word.

  3. In the general case, given that the xt part of the result does not represent a part of the compilation semantics, the 1/-1 part of the result is pointless.
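
The first point can be illustrated with a small sketch. IN-FORTH? is a hypothetical helper that only wants to know whether a name exists in FORTH-WORDLIST, yet it has to deal with a result that is one cell when the word is missing and two cells when it is found.

```forth
\ IN-FORTH? is a hypothetical helper for illustration only.
: in-forth? ( c-addr u -- flag )
  forth-wordlist search-wordlist  \ -- 0 | xt 1 | xt -1
  dup if nip then                 \ drop the xt, but only if there is one
  0<> ;                           \ normalize 1 / -1 / 0 to a flag
```

With a lookup word that always returns a single cell (an nt or 0), the conditional NIP would not be needed.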

Solution

Introduce FIND-NAME (for the search order) and FIND-NAME-IN (for specific wordlists). Both take strings in c-addr u form, and both return the name token of the found word (or 0 if not found). We can then go from the name token to the interpretation semantics with NAME>INTERPRET, and to the compilation semantics with NAME>COMPILE.

Typical use

: ' parse-name find-name dup 0= -13 and throw name>interpret ;

: postpone
  parse-name find-name dup 0= -13 and throw
  name>compile swap postpone literal compile, ; immediate

\ user-defined text interpreter
: interpret-word
  parse-name 2dup find-name ?dup if
     nip nip state @ if name>compile else name>interpret then execute
  else
     ... \ process numbers
  then ;

\ alternative:
defer name>statetoken ( nt -- ... xt )
: [ ['] name>interpret is name>statetoken false state ! ; immediate
[ \ initialize STATE and NAME>STATETOKEN
: ] ['] name>compile is name>statetoken true state ! ;

: interpret-word
  parse-name 2dup find-name ?dup if
     nip nip name>statetoken execute
  else
     ... \ process numbers
  then ;

\ a defining word for words that print their name
: name-printer ( "name" -- )
  >in @ create >in ! parse-name get-current find-name-in ,
does> ( -- )
  @ name>string type ;

Remarks

FIND-NAME and FIND-NAME-IN are natural factors of all words that look up words in the dictionary, such as FIND, ', POSTPONE, the text interpreter, and SEARCH-WORDLIST. So implementing them does not cost additional code in the Forth system, only some refactoring effort.

This approach is not compatible with the cmForth approach and Mark Humphries' approach, because they both use one word header each for interpretation and compilation semantics. This problem already exists for the other words that deal with name tokens, but the present proposal would make name tokens more important, and systems that do not support them less viable. However, both approaches have been known for at least two decades, and have seen little to no uptake in standard systems. And we have good approaches for implementing systems with name tokens, so excluding these approaches is not a significant loss.

Proposal

Add the following words:

FIND-NAME ( c-addr u -- nt | 0 )

Find the definition identified by the string c-addr u in the current search order. Return its name token nt, if found, otherwise 0.

FIND-NAME-IN ( c-addr u wid -- nt | 0 )

Find the definition identified by the string c-addr u in the wordlist wid. Return its name token nt, if found, otherwise 0.
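
A minimal usage sketch of the difference between the two words, assuming a system that provides FIND-NAME and FIND-NAME-IN as specified here. MY-WL and SECRET are hypothetical names: SECRET is defined in a wordlist that is not in the search order, so FIND-NAME-IN can see it while FIND-NAME cannot.

```forth
\ MY-WL and SECRET are hypothetical names; FIND-NAME-IN and FIND-NAME
\ are assumed to be available as specified above.
wordlist constant my-wl

get-current          \ remember the compilation wordlist
my-wl set-current    \ define SECRET in MY-WL only
: secret ( -- n ) 7 ;
set-current          \ restore the compilation wordlist

s" secret" my-wl find-name-in   \ -- nt (non-zero: SECRET is in MY-WL)
name>interpret execute .        \ runs SECRET through its interpretation xt
s" secret" find-name .          \ 0 as long as no SECRET is in the search order
```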

Reference implementation

Reference Implementation

Testing

Testcases

Experience

Gforth has implemented FIND-NAME and (under the name (search-wordlist)) FIND-NAME-IN since 1996. No problems were reported or found internally.

amForth has had FIND-NAME "since ever" (first release 2006), and FIND-NAME-IN under the name SEARCH-NAME.

Other systems have been reported to implement FIND-NAME under this or other names (e.g., FOUND in ciforth).
