Digest #247 2023-12-27

Contributions

[326] 2023-12-26 18:09:55 lmr wrote:

requestClarification - shadowed names

[…] words with the same name are called in the order newest-to-oldest

Does this mean the system needs to keep around word definitions even after they are shadowed by a re-definition (thus precluding a simple hash-table implementation of wordlists)...?

Replies

[r1154] 2023-12-26 09:08:21 AntonErtl replies:

requestClarification - control stack clearing?

The specification of throw specifies the case when there is no exception frame from the program on the exception stack under the label "there is no exception frame on the exception stack", but systems are free to implement throw and catch in any way that ensures that programs see the specified behaviour. In case of throw an efficient implementation is to handle the "no exception frame" case with a system-supplied exception frame so that throw does not have to distinguish these cases. The suggested reference implementation of throw does not separate out the "no exception frame" case, and the suggested reference implementation of quit provides this system-supplied exception frame. What is missing from the reference implementation of quit is the emptying of the data stack that is specified in throw when there is no exception frame. I'll do another iteration on the reference implementation to fix this.

3.1.5.1 explicitly mentions colon-sys, do-sys, case-sys, of-sys, orig and dest as system-compilation types, and these live on the control-flow stack (usually the data stack). There are also system-execution types (nest-sys and loop-sys) that live on the return stack.

State just selects whether the text interpreter performs interpretation or compilation semantics, and it has no other meaning in the standard (and, e.g., in Gforth). Programs can switch the state with ] and [ (more typically postpone [) around calls to standard words, and if the program switches the state back before the text intrerpreter is reentered again, such switching should not make a difference except for the result of state @ and the few words (such as to) where it is specified that performing their interpretation or compilation semantics in a certain state is ambiguous.

In particular, there are no separate rules for manipulating the data stack for different states. The data stack has traditionally been used as control-flow stack because programs usually do not use the data stack for anything else while defining a word, so the restrictions imposed by using the data stack as control-flow stack are not particularly burdensome. But they exist, because programs are allowed to use the data stack while compiling words. And certain uses are not allowed in a standard program, e.g.

5 : foo literal ;

because the colon-sys blocks the access to the value 5. Instead, you have to write

: foo [ 5 ] literal ;

(Of course, for a literal value like 5, you would normally compile it directly, but let's assume that the value is the result of a call to a word or somesuch).

BTW, the term "primitive" is used for words implemented in a lower-level language (typically assembler) than Forth, not for all pre-defined words. And in most Forth systems, the words dealing with the control-flow stack are defined in Forth.

[r1155] 2023-12-26 09:35:13 AntonErtl replies:

referenceImplementation - The reference implementation is incorrect

[r327] still misses the clearing of the data stack as discussed in [r1154]. The present version also makes the clearing of the exception stack explicit (the parenthesis could be misunderstood as describing the following code).

One other thing to note here is that handler (from the reference implementation of throw) is not reset, because there is never a throw that goes below the exception frame pushed by the catch here; because in the reference implementation the return stack is used as the exception stack, no additional work is necessary to clear the exception stack (partially addressing [259]; however, the specification of quit also needs to be updated).

: QUIT
   RP0 @ RP! ( empty the return stack )
   ... ( set the input source to the user input device )
   POSTPONE [
   BEGIN
     REFILL
   WHILE
     ['] INTERPRET CATCH
     CASE
     0 OF STATE @ 0= IF ." OK" THEN CR ENDOF
     POSTPONE [
     -1 OF ( Aborted )  ENDOF
     -2 OF ( display message from ABORT" ) ENDOF
     ( default ) DUP ." Exception # " .
     ENDCASE
     SP0 @ SP! ( perform the "clear the data stack" action of ABORT )
   REPEAT BYE
;

where sp0 and rp0 are (user) variables containing the addresses that point to the bottom of the data and return stacks, respectively.

Look at this reference implementation together with the one for catch and throw.

[r1156] 2023-12-26 09:58:48 AntonErtl replies:

example - WORDLIST interaction

A wordlist may be dynamically allocated in data space, which means it can come from the dictionary space mentioned by marker (this terminological inconsistency should be fixed). And the wid may be the address of that location, so marker is allowed to delete wids and programs cannot rely on their continued existence. Is marker guaranteed to do that? I find no trace of such a guarantee in the specification (but of course that would require the specification of marker to be updated in the chapter on the search-order wordset).

Consequently, if marker deletes wids, subsequent invocations of wordlist may or may not reuse deleted wids.

Overall, my take on marker is that it is a programmer convenience for those cases where exiting the session and restarting is from the beginning (what I usually do) is too inconvenient. Apart from the minimum guarantees required by the standard, implement as little functionality as you want without unduly inconveniencing your users and as much as you can without unnecessarily complicating your system. Or maybe don't implement marker at all.

[r1157] 2023-12-26 18:22:57 lmr replies:

example - WORDLIST interaction

Thanks, that makes sense. I guess I had in mind a simple "increasing counter" implementation of wid's. I'll restate the question as: once a wid is freed up by MARKER, can it still count towards the (8, or whatever) WORDLIST limit the system imposes? Or must the system only limit "active" wid's (if at all)?

For example, a system implementing u8int wid's (as an index into a fixed vector) would be forced, in the latter case, to reuse wid's after at most 255 paired WORDLIST / MARKER invocations. If the system is allowed to not reuse wid's, OTOH, then it can declare overflow after 255 such invocations, since it would run out of "fresh" wid's.

[r1158] 2023-12-26 18:32:05 AntonErtl replies:

requestClarification - shadowed names

The CfV says:

Whether duplicated nodes are visited or not by TRAVERSE-WORDLIST is undefined.

The text of the CfV was condensed, and this particular clause was omitted, but IIRC there was no intent to change the meaning.

However, note that if you implement marker or forget, you may want to keep the shadowed names anyway.

[r1159] 2023-12-26 18:40:31 AntonErtl replies:

example - WORDLIST interaction

As mentioned, there is no guarantee, so a system could run out of wordlists after 8 calls to WORDLIST, whether there were markers used or not.

However, if you use a counter for generating wids, its straightforward to implement marker such that it saves the counter in the generated marker and that the marker restores the counter.

[r1160] 2023-12-26 19:03:45 lmr replies:

requestClarification - shadowed names

if you implement marker or forget, you may want to keep the shadowed names anyway

Yes, but I can implement MARKER as a "clojure" around deep copies of

the wid-indexed "all-wordlists" vector of word-name -> name-token hashtables
the "current-wid" pointer to the top of the "all-wordlists" vector (as per the MARKER comment https://forth-standard.org/standard/core/MARKER#reply-1159 — I previously thought that some MARKER usage patterns could leave "holes" in this vector, thus requiring the maintenance of some kind of free-list; upon reflecting this doesn't seem possible), and
the "search-order" vector of wid's

It's probably still necessary to maintain shadowed name-tokens (or at least shadowed XTs?) since the user could have saved them using FIND, SEARCH-WORDLIST or the FIND-NAME[-IN] extensions