Digest #189 2022-07-17
Contributions
The glossary entry for forget declares an ambiguous condition:

  An ambiguous condition exists if FORGET removes a word required for correct execution.

Probably, a similar ambiguous condition should be declared for the words defined via marker too. Something like the following:

  An ambiguous condition exists if definitions or data required for correct execution of the program are removed during execution of name.
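A minimal sketch of how such a condition can arise (defer, is, and marker are standard words; greet, rewind, and hello are hypothetical names used only for illustration):

defer greet
marker rewind
: hello ( -- ) ." Hello" ;
' hello is greet
rewind   \ removes HELLO (and REWIND itself) from the dictionary
greet    \ ambiguous: GREET still refers to the removed HELLO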
Replies
Of course, recurse may be used in an unnamed definition. The only place where recurse is not allowed is the part of a definition after does> (see the glossary entry for recurse).
A default rule is:
- for a program: everything which is not explicitly forbidden is allowed;
- for a system: everything which can be detected by a program and is not explicitly allowed is forbidden.
Here, "a program" means a standard program, and "a system" means a standard system.
@StefanK, thank you for your participation. But it looks like you have missed too many of the arguments discussed above.
For example, the deferred word forth-recognizer has a confusing name, and it is not acceptable in the API, since it's difficult for the Forth system to detect when its value changes (NB: this is not an argument in favor of a "stack of recognizers").
: translator ( xt-*lit, xt-compile xt-interpret "name" -- ) create , , , ;

' lit,
:noname ( nt -- xt-execute | xt-compile, )
  dup >cfa swap immediate? IF execute ELSE compile, THEN ;
:noname ( i*x nt -- j*x ) >cfa execute ;
translator translate-nt
Also, could you please stick to consistent and clear terminology?
In this example you create not a token translator, but a named token descriptor (and the corresponding token descriptor object). See Common terminology for recognizers (improvements and criticism are welcome):
  token descriptor object: an implementation-dependent data object that describes how to interpret, how to compile, and how to postpone (if any) a token.
I also proposed the following naming convention for the corresponding words:
- For token translators, use names in the form tt-* (an abbreviation of translate-token-*); for example, tt-lit, tt-nt.
- For token descriptors, use names in the form td-* (an abbreviation of token-descriptor-*); for example, td-lit, td-nt.
The approach employed in your example to create a token descriptor can be called the "three components" approach. A significant disadvantage of this approach is that it doesn't provide a way to reuse old descriptors when you create a new descriptor. Compare this to token translators: they can be easily reused to create new token translators. For example, a token translator for a pair ( nt nt ) can be created using the token translator tt-nt for a single nt as:
: tt-2nt ( i*x nt nt -- j*x ) >r tt-nt r> tt-nt ;
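Since token translators are ordinary words, they compose further in the same way. For example, a hypothetical tt-3nt for a triple of nts could reuse tt-2nt and tt-nt:

: tt-3nt ( i*x nt nt nt -- j*x ) >r tt-2nt r> tt-nt ;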
To create a token descriptor td-2nt in the three components approach, you need to put in a lot more effort, and you cannot reuse the td-nt descriptor.
One possible solution is not to expose the three components approach in the API, and instead to provide a special method that creates a descriptor from other descriptors.
For td-2nt it can look as:
td-nt dup 2 descriptor constant td-2nt
\ or
td{ td-nt td-nt }td constant td-2nt
It seems a user never needs to provide three components for a new descriptor, since any new descriptor is always based on some already defined descriptors.
But the approach based on token translators is far simpler.
By the way, a well-known word to get an xt from an nt is name> ( nt -- xt ) (see Forth-83, "C. Experimental proposal", "Definition field address conversion operators").
I'm wondering what are the general thoughts about a marker word being extended to kill currently running threads defined after a marker word?
It doesn't matter where a thread is defined (i.e., where a definition created by task
is located in the dictionary).
What does matter is where the code fragments and data fragments that can be used in the thread are located in the dictionary. But this problem is undecidable in the general case.
So, terminating a thread when a marker is executed is a very partial solution. It is also harmful: users would rely on it as if it were reliable, and probably would not consider whether a marker removes data or code used by a running thread whose task is defined before the marker.
how does one ensure a data or return stack large enough to allow the task to run?
This problem already exists for a standard program: how does one ensure the data and return stacks are large enough to allow the program to run? There is no standard API for that at the moment.
The only way for a system is to document the initial sizes of the stacks and a system-defined way to configure them (if any); and for a program, to document its requirements for the stacks and use the system-defined method to configure these sizes (if any).
The same applies to thread stack sizes.
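As an illustration of such a system-defined method (not part of the proposal): Gforth, for example, lets the stack sizes be configured on the command line:

gforth --data-stack-size=64k --return-stack-size=32k program.fs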
Yes, I agree: in most places in this proposal the term "cooperative" should be used instead of the term "round-robin".
And of course, an API to configure stacks size can be proposed.
Could somebody provide a correct POSTPONE action that cannot be automated?
A postpone action is automated via 1) a reproduce action, 2) a compile action, and 3) the system's compile, word.
In some edge cases these components are not consistent with each other: a compile action may generate code that is not compatible with what the system's compile, generates. In such cases a correct postpone action cannot be generated automatically (see also my post Against a reproducer in a token descriptor).
So, between a reproduce (partial postpone) action and a full postpone action, the latter should be chosen.
Personally, I prefer not the descriptor-based approach but the translator-based approach, which makes it possible to implement a postponing mode. This mode works transparently and automatically everywhere, and it is a far more convenient means than postpone for user-defined literals.