Digest #97 2020-03-09
Contributions
proposal - Wording: declare undefined interpretation semantics for locals
Problem
We have the "name Execution:" section for locals, but no "name Interpretation:" section. Hence name has default interpretation semantics according to 3.4.3.2, which conflicts with the ambiguous condition explicitly declared in the "name Execution:" section.
Proposal
Add the following section:
name Interpretation:
Interpretation semantics for name are undefined.
Remove from the "name Execution:" section the following sentence:
An ambiguous condition exists when name is executed while in interpretation state.
Replies
I think this work on rephrasing, improving the terminology and wording, and even reworking the concepts is very important.
My thoughts regarding the terminology are the following.
3. The word "RECOGNIZER".
The most glaring conflict in the terminology seems to lie between the words RECOGNIZER and GET-RECOGNIZERS. The former returns rit, the latter returns rec. Such a conflict is inadmissible in the Standard. Another term (and another name for the word) should be found instead of "recognizer".
Anyway, keeping some semantic similarity between this word and the WORDLIST word looks like a good choice (if any).
5. A name for triple of interpret/compile/postpone xts
This item is closely connected to the above one (3).
It should be taken into account that we already have execution token xt (that identifies execution semantics) and name token nt (that identifies a named definition). I.e., the specification applies the term "token" to an attribute itself (a value only), without the corresponding information about its type (or class, or kind). Hence, the information about the corresponding type should be called "token type" (an identifier of a token type).
Under the hood, this token type identifier should be associated with handlers: how to execute (interpret) the corresponding token, how to compile the corresponding token, etc.
See also my approach in comp.lang.forth post in 2018 (news:pngvcc$pta$1@gioia.aioe.org, copy).
4. Changes in the specifications of other words
Yes, the specification for MARKER should be updated.
But there's no need to mention changes in the behavior of the words that:
a) do interpretation (like EVALUATE, INCLUDE-FILE, INCLUDED): it is enough to change 3.4 The Forth text interpreter.
b) do finding according to the current search order (like ', ['], POSTPONE, [COMPILE], [DEFINED], [UNDEFINED]): it is enough to change 3.4.2 Finding definition names (for example, see 16.3.3 Finding definition names).
The items (1) and (2) are more about API than about terminology.
2. rit structure accessors
I think suggesting that programs use N CELL+ @ to access an xt is a bad choice. Even for the mentioned counted strings we have the COUNT word, i.e. an accessor. Therefore, even for a transparent structure, if we provide access, we should provide access words. (But I think we don't need access to the xts, see 1.ii below.)
1. interpret/compile/postpone xts
"Programs [...] most likely need to use the interpret/compile/postpone xts of the returned recognizer information token. [...] Without these access words standardizing the word RECOGNIZE is doubtful."
We have the following issues with this:
i. Nobody provides a use-case
Nobody provides a use-case for the scenario when a program needs access to these xts.
OTOH, we have one important principle: a user-defined text interpreter should be implementable in a standard way (in this case, without re-implementing recognizers). NB: the POSTPONE-action is not needed for this text interpreter.
ii. Better to avoid accessors
Perhaps another way is better. Instead of using the corresponding xts, use the words that do the corresponding actions:
INTERPRET-TOKEN ( i*x token{k*x} token-type -- j*x )
COMPILE-TOKEN ( i*x token{k*x} token-type -- j*x )
POSTPONE-TOKEN ( token{k*x} token-type -- )
I.e., in place of
( ... rit ) _R>COMP EXECUTE
you just do
( ... rit ) COMPILE-TOKEN
Rationale: in most cases a user or program just needs to perform these actions. Getting an xt and then executing it is an extra step without any benefit.
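A minimal sketch of how these three words might sit on top of a token type record, assuming (hypothetically) that a token type is the address of three consecutive cells holding the interpret, compile and postpone xts. The defining word token-type: and the example type num-type are made up for illustration and are not part of any proposal.

```forth
\ Assumed layout (hypothetical): a token type is the address of three
\ cells holding the interpret, compile and postpone xts, in that order.
: token-type: ( xt-interpret xt-compile xt-postpone "name" -- )
  create rot , swap , , ;

\ The action words hide the layout from programs.
: interpret-token ( i*x token{k*x} token-type -- j*x )  @ execute ;
: compile-token   ( i*x token{k*x} token-type -- j*x )  cell+ @ execute ;
: postpone-token  ( token{k*x} token-type -- )          2 cells + @ execute ;

\ Illustrative token type for a single-cell number.
:noname ( n -- n ) ;                    \ interpret: leave n on the stack
:noname ( n -- ) postpone literal ;     \ compile: compile n as a literal
:noname ( n -- ) -21 throw ;            \ postpone: left unsupported here
token-type: num-type

42 num-type interpret-token .           \ prints 42
: answer ( -- n ) [ 42 num-type compile-token ] ;
answer .                                \ prints 42
```

With this shape a program never touches the xts directly, so a system remains free to represent token types in any other way.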
Other issues regarding the API and examples.
6. Naming in A.XY.2
find-name was proposed to be a standard word that returns a name token nt.
It is better to use another name for your word ( addr len -- xt +/-1 | 0 ).
I can suggest
find-word-in ( c-addr u wid -- c-addr u 0 | xt immediate-flag true )
find-word ( c-addr u -- c-addr u 0 | xt immediate-flag true )
Rationale: 1) the FIND word returns c-addr on failure; 2) when implementing a text interpreter, on failure you need ( c-addr u ) to convert it into a number; 3) in some cases, always returning the same number of result items is an advantage for optimization.
7. Action of postponing isn't essential
i. Why does a user (a program) need to use the POSTPONE-action?
The only known use case is to implement the ]] ... [[ construct. But this construct, when implemented via the POSTPONE-action, has a set of known flaws: it doesn't follow the copy-pastability design principle, and (as a result) it doesn't handle immediate parsing words in a convenient way (including comments and [IF] ... [THEN]).
Actually, a user doesn't need to use the POSTPONE-action; they just need to postpone fragments of code! Therefore, it is better to provide something like a c{ ... }c construct (see my c-state PoC) that provides full copy-pastability.
ii. Why does a user (a program) have to specify POSTPONE-action for a new token-type?
The only reason is to make a Forth system aware of how to apply POSTPONE to a user-defined literal. But a user doesn't need to apply POSTPONE to the literals if a Forth system provides a way to postpone any fragments of code.
Well, perhaps a Forth system cannot postpone a user-defined literal (or even a parsing word) as part of a fragment of code if the user doesn't provide a POSTPONE-action? But that is wrong. Since any COMPILE-action is defined via standard words (and words defined via standard words), a Forth system that has a definition of the COMPILE-action for a token is able to postpone that token (see the same c-state PoC).
Yes, it is not easy to implement, but it is very convenient to use!
Another open question
8. Dependency on STATE
Obviously, the results of RECOGNIZE may depend on the search order and BASE. Also, a user-defined recognizer may depend on user-defined states.
But what about STATE-dependency for the initial recognizer? May the results of RECOGNIZE depend on STATE?
E.g. recognize-word from the example in A.XY.2 is based on FIND, and hence it may depend on STATE. Hence, in some cases the result is not allowed to be performed in a different STATE (see also my proposal for FIND clarification).
Have a look at the next version of the FIND specification I have designed. This variant seems more accurate and considerably better.
proposal - Wording: declare undefined interpretation semantics for locals
The same proposal applies to (LOCAL), taking into account that a different wording is used there: "local Execution:". Perhaps we should also harmonize these specifications with each other.
proposal - Wording: declare undefined interpretation semantics for locals
In the general case, "name is executed while in interpretation state" can be achieved via
: foo {: name :} name ; 1 foo
and there is nothing wrong with that. So the corresponding sentence is incorrect in itself.