Digest #247 2023-12-27
[…] words with the same name are called in the order newest-to-oldest
Does this mean the system needs to keep around word definitions even after they are shadowed by a re-definition (thus precluding a simple hash-table implementation of wordlists)...?
The specification of
throw specifies the case when there is no exception frame from the program on the exception stack under the label "there is no exception frame on the exception stack", but systems are free to implement
catch in any way that ensures that programs see the specified behaviour. In case of
throw an efficient implementation is to handle the "no exception frame" case with a system-supplied exception frame so that
throw does not have to distinguish these cases. The suggested reference implementation of
throw does not separate out the "no exception frame" case, and the suggested reference implementation of
quit provides this system-supplied exception frame. What is missing from the reference implementation of
quit is the emptying of the data stack that is specified in
throw when there is no exception frame. I'll do another iteration on the reference implementation to fix this.
188.8.131.52 explicitly mentions colon-sys, do-sys, case-sys, of-sys, orig and dest as system-compilation types, and these live on the control-flow stack (usually the data stack). There are also system-execution types (nest-sys and loop-sys) that live on the return stack.
State just selects whether the text interpreter performs interpretation or compilation semantics, and it has no other meaning in the standard (and, e.g., in Gforth). Programs can switch the state with
[ (more typically
postpone [) around calls to standard words, and if the program switches the state back before the text intrerpreter is reentered again, such switching should not make a difference except for the result of
state @ and the few words (such as
to) where it is specified that performing their interpretation or compilation semantics in a certain state is ambiguous.
In particular, there are no separate rules for manipulating the data stack for different states. The data stack has traditionally been used as control-flow stack because programs usually do not use the data stack for anything else while defining a word, so the restrictions imposed by using the data stack as control-flow stack are not particularly burdensome. But they exist, because programs are allowed to use the data stack while compiling words. And certain uses are not allowed in a standard program, e.g.
5 : foo literal ;
because the colon-sys blocks the access to the value 5. Instead, you have to write
: foo [ 5 ] literal ;
(Of course, for a literal value like 5, you would normally compile it directly, but let's assume that the value is the result of a call to a word or somesuch).
BTW, the term "primitive" is used for words implemented in a lower-level language (typically assembler) than Forth, not for all pre-defined words. And in most Forth systems, the words dealing with the control-flow stack are defined in Forth.
[r327] still misses the clearing of the data stack as discussed in [r1154]. The present version also makes the clearing of the exception stack explicit (the parenthesis could be misunderstood as describing the following code).
One other thing to note here is that
handler (from the reference implementation of
throw) is not reset, because there is never a
throw that goes below the exception frame pushed by the
catch here; because in the reference implementation the return stack is used as the exception stack, no additional work is necessary to clear the exception stack (partially addressing ; however, the specification of
quit also needs to be updated).
RP0 @ RP! ( empty the return stack )
... ( set the input source to the user input device )
['] INTERPRET CATCH
0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
-1 OF ( Aborted ) ENDOF
-2 OF ( display message from ABORT" ) ENDOF
( default ) DUP ." Exception # " .
SP0 @ SP! ( perform the "clear the data stack" action of ABORT )
rp0 are (user) variables containing the addresses that point to the bottom of the data and return stacks, respectively.
A wordlist may be
dynamically allocated in data space, which means it can come from the dictionary space mentioned by
marker (this terminological inconsistency should be fixed). And the wid may be the address of that location, so
marker is allowed to delete wids and programs cannot rely on their continued existence. Is
marker guaranteed to do that? I find no trace of such a guarantee in the specification (but of course that would require the specification of
marker to be updated in the chapter on the search-order wordset).
marker deletes wids, subsequent invocations of
wordlist may or may not reuse deleted wids.
Overall, my take on
marker is that it is a programmer convenience for those cases where exiting the session and restarting is from the beginning (what I usually do) is too inconvenient. Apart from the minimum guarantees required by the standard, implement as little functionality as you want without unduly inconveniencing your users and as much as you can without unnecessarily complicating your system. Or maybe don't implement
marker at all.
Thanks, that makes sense. I guess I had in mind a simple "increasing counter" implementation of
wid's. I'll restate the question as: once a
wid is freed up by
MARKER, can it still count towards the (8, or whatever) WORDLIST limit the system imposes? Or must the system only limit "active" wid's (if at all)?
For example, a system implementing u8int
wid's (as an index into a fixed vector) would be forced, in the latter case, to reuse
wid's after at most 255 paired
MARKER invocations. If the system is allowed to not reuse
wid's, OTOH, then it can declare overflow after 255 such invocations, since it would run out of "fresh"
The CfV says:
Whether duplicated nodes are visited or not by TRAVERSE-WORDLIST is undefined.
The text of the CfV was condensed, and this particular clause was omitted, but IIRC there was no intent to change the meaning.
However, note that if you implement
forget, you may want to keep the shadowed names anyway.
As mentioned, there is no guarantee, so a system could run out of wordlists after 8 calls to
WORDLIST, whether there were markers used or not.
However, if you use a counter for generating wids, its straightforward to implement
marker such that it saves the counter in the generated marker and that the marker restores the counter.
if you implement marker or forget, you may want to keep the shadowed names anyway
Yes, but I can implement
MARKER as a "clojure" around deep copies of
wid-indexed "all-wordlists" vector of word-name -> name-token hashtables
- the "current-wid" pointer to the top of the "all-wordlists" vector (as per the
MARKERcomment https://forth-standard.org/standard/core/MARKER#reply-1159 — I previously thought that some
MARKERusage patterns could leave "holes" in this vector, thus requiring the maintenance of some kind of free-list; upon reflecting this doesn't seem possible), and
- the "search-order" vector of
It's probably still necessary to maintain shadowed name-tokens (or at least shadowed XTs?) since the user could have saved them using
SEARCH-WORDLIST or the